argocd-bootstrap.tf: restore bin/argocd-bootstrap (k8s/reorg)#71926

Draft
snickell wants to merge 1 commit into k8s/reorg from k8s/reorg-argocd-bootstrap-script

Conversation


@snickell (Contributor) commented Apr 5, 2026

This PR preserves the Ruby-owned Argo bootstrap branch on a side branch so it can be reviewed or revived later without keeping it on k8s/reorg. We rolled k8s/reorg back to the simpler helm_release path for the next real full apply/destroy cycle, but the script branch is still worth keeping as a concrete alternative because it captures two real design ideas: a bootstrap-only apply, and a destroy-only attempt to refresh Helm state from the latest k8s-gitops chart before the final uninstall.

What went wrong in the test run is now clear from the logs. The destroy-side bin/argocd-bootstrap path ran helm upgrade --install before helm uninstall, and that reconcile was not a drop-in operation against a live self-managed Argo install. It failed on two separate fronts.

First, the live Argo install was already managing the same resources, so the reconcile hit field-manager ownership conflicts against argocd-controller on argocd-notifications-secret, the repo secrets (repo-code-dot-org, repo-k8s-gitops, repo-kargo-charts), Deployment/argocd-applicationset-controller, Deployment/argocd-repo-server, and StatefulSet/argocd-application-controller.

Second, the cluster was dirty from an earlier cleanup mistake, so the reconcile also failed validation on the Argo ingress because the AWS load balancer webhook had no endpoints (no endpoints available for "aws-load-balancer-webhook-service"). The combined result is that the script branch is not merge-ready as-is: its pre-destroy Helm reconcile assumption is too optimistic once Argo has been self-managing the same release.
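For context on the ownership conflicts: Kubernetes server-side apply records per-field ownership under metadata.managedFields, and a re-apply fails exactly when another field manager (here argocd-controller) owns the disputed fields. A minimal sketch for inspecting this on one of the objects named in the log; the helper name is mine, kubectl's --show-managed-fields flag is standard, and the grep-based JSON scrape is a rough illustration rather than a robust parser:

```shell
# List the unique field managers recorded on an object. Server-side apply
# stores per-field ownership in metadata.managedFields; seeing
# "argocd-controller" among the managers is what the Helm re-apply
# collided with.
list_field_managers() {  # usage: list_field_managers <namespace> <kind> <name>
  kubectl -n "$1" get "$2" "$3" --show-managed-fields -o json \
    | grep -o '"manager": *"[^"]*"' | sort -u
}

# e.g.: list_field_managers argocd secret argocd-notifications-secret
```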

Technical changes:

  • restore terraform_data.argocd_bootstrap as the Argo bootstrap owner in k8s/tofu/codeai-k8s/cluster-infra-argocd/argocd-bootstrap.tf
  • restore k8s/tofu/codeai-k8s/cluster-infra-argocd/bin/argocd-bootstrap
  • keep the script contract that apply installs/bootstrap only and refuses Helm upgrades of an existing release
  • keep the script contract that destroy is the only mode allowed to attempt a Helm refresh before uninstall
  • preserve this design in a reviewable branch while the main line returns to the simpler helm_release baseline
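The apply/destroy contract in the bullets above can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual bin/argocd-bootstrap: the release name, namespace, and chart path are assumptions, and the real script also threads the k8s_gitops_revision value shown in the log.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the bootstrap contract, not the real script:
# apply is bootstrap-only and refuses to touch an existing release;
# destroy is the only mode allowed a Helm refresh before uninstall.
set -euo pipefail

RELEASE=argocd
NAMESPACE=argocd
CHART=./apps/infra/argocd/chart   # assumed chart path

release_exists() {
  helm status "$RELEASE" --namespace "$NAMESPACE" >/dev/null 2>&1
}

argocd_bootstrap() {
  local mode="$1"
  case "$mode" in
    apply)
      # bootstrap-only: never upgrade a release Argo already self-manages
      if release_exists; then
        echo "refusing Helm upgrade of existing release '$RELEASE'" >&2
        return 1
      fi
      helm upgrade "$RELEASE" "$CHART" --install \
        --namespace "$NAMESPACE" --create-namespace --wait
      ;;
    destroy)
      # destroy-only refresh: reconcile to the latest chart, then uninstall
      if release_exists; then
        helm upgrade "$RELEASE" "$CHART" --install \
          --namespace "$NAMESPACE" --wait || true
        helm uninstall "$RELEASE" --namespace "$NAMESPACE" --wait
      fi
      ;;
    *)
      echo "usage: argocd-bootstrap apply|destroy" >&2
      return 2
      ;;
  esac
}

# invoked by terraform_data.argocd_bootstrap as e.g.: argocd_bootstrap apply
```

The test-run failure lives in the destroy branch: the pre-uninstall helm upgrade assumes the live release will accept a re-apply, which does not hold once Argo's own controller owns fields on the same objects.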


snickell commented Apr 5, 2026

Concrete failure snippet from logs/tofu-2026-04-05T05-14-09-destroy.log:

2026-04-05T05:16:24-1000 [destroy] terraform_data.argocd_bootstrap (local-exec): command failed: helm --kubeconfig ... upgrade argocd .../apps/infra/argocd/chart --install --namespace argocd --create-namespace --wait --timeout 10m --set-string _bootstrap.k8s_gitops_revision=5d7dd07219bcb8eeccabbbba96e698cda0028bed
2026-04-05T05:16:24-1000 [destroy] terraform_data.argocd_bootstrap (local-exec): level=WARN msg="upgrade failed" name=argocd error="conflict occurred while applying object argocd/argocd-notifications-secret /v1, Kind=Secret: Apply failed with 1 conflict: conflict with \"argocd-controller\": .stringData && conflict occurred while applying object argocd/repo-code-dot-org /v1, Kind=Secret: Apply failed with 3 conflicts: conflicts with \"argocd-controller\":
- .stringData.name
- .stringData.type
- .stringData.url && conflict occurred while applying object argocd/repo-k8s-gitops /v1, Kind=Secret: Apply failed with 4 conflicts: conflicts with \"argocd-controller\":
- .stringData.enableLfs
- .stringData.name
- .stringData.type
- .stringData.url && conflict occurred while applying object argocd/repo-kargo-charts /v1, Kind=Secret: Apply failed with 4 conflicts: conflicts with \"argocd-controller\":
- .stringData.enableOCI
- .stringData.name
- .stringData.type
- .stringData.url && conflict occurred while applying object argocd/argocd-applicationset-controller apps/v1, Kind=Deployment: Apply failed with 1 conflict: conflict with \"argocd-controller\": .spec.template.spec.containers[name=\"applicationset-controller\"].env[name=\"NAMESPACE\"].valueFrom.fieldRef && conflict occurred while applying object argocd/argocd-repo-server apps/v1, Kind=Deployment: Apply failed with 4 conflicts: conflicts with \"argocd-controller\":
- .spec.template.spec.containers[name=\"repo-server\"].resources.limits.cpu
- .spec.template.spec.containers[name=\"repo-server\"].resources.requests.cpu
- .spec.template.spec.initContainers[name=\"copyutil\"].resources.limits.cpu
- .spec.template.spec.initContainers[name=\"copyutil\"].resources.requests.cpu && conflict occurred while applying object argocd/argocd-application-controller apps/v1, Kind=StatefulSet: Apply failed with 2 conflicts: conflicts with \"argocd-controller\":
- .spec.template.spec.containers[name=\"application-controller\"].resources.limits.cpu
- .spec.template.spec.containers[name=\"application-controller\"].resources.requests.cpu && Internal error occurred: failed calling webhook \"vingress.elbv2.k8s.aws\": failed to call webhook: Post \"https://aws-load-balancer-webhook-service.kube-system.svc:443/validate-networking-v1-ingress?timeout=10s\": no endpoints available for service \"aws-load-balancer-webhook-service\""
2026-04-05T05:16:24-1000 [destroy] terraform_data.argocd_bootstrap (local-exec): Error: UPGRADE FAILED: conflict occurred while applying object argocd/argocd-notifications-secret /v1, Kind=Secret ... and Internal error occurred: failed calling webhook "vingress.elbv2.k8s.aws": no endpoints available for service "aws-load-balancer-webhook-service"

@snickell snickell changed the title argocd-bootstrap.tf: restore bin/argocd-bootstrap branch for later review argocd-bootstrap.tf: restore bin/argocd-bootstrap (dropped from k8s/reorg) Apr 5, 2026
@snickell snickell changed the title argocd-bootstrap.tf: restore bin/argocd-bootstrap (dropped from k8s/reorg) argocd-bootstrap.tf: restore bin/argocd-bootstrap (k8s/reorg) Apr 5, 2026
