When an ArgoCD application is "not working," the fastest path to a fix is to stop guessing and read two status fields in the right order. ArgoCD tracks every application on two independent axes: a sync status (Synced / OutOfSync) that answers "does the cluster match Git?", and a health status (Healthy / Progressing / Degraded / Suspended / Missing / Unknown) that answers "are the resulting resources actually working?" These are orthogonal — an app can be Synced but Degraded, or OutOfSync but Healthy. Decide which axis is wrong first, and you have already narrowed three different problems down to one. This guide is the triage layer; each specific failure links down to its dedicated fix page.
Why a systematic approach beats guessing
Most ArgoCD debugging time is wasted because the operator conflates the two axes — they see "the app is red" and start editing manifests when the real problem is a crash-looping pod, or they restart pods when the real problem is drift in Git. The two axes are computed by different machinery and have different fixes:
- Sync status is the output of a diff: ArgoCD compares the rendered manifests from your Git source against the live objects in the cluster. Any non-ignored field that differs marks the app OutOfSync (ArgoCD diffing docs).
- Health status is the output of health checks run against each live resource. For example, a Deployment is Healthy only when its observed generation matches the desired generation and updated replicas equal desired replicas (ArgoCD health docs).
Because they are computed independently, the first diagnostic question is never "what's broken?" — it is "which axis is wrong?" The triage framework below answers that in two commands.
The triage framework
Step 1 — read both axes with one command
argocd app get <app-name>
argocd app get retrieves the full application detail, including both the sync status and the health status, plus a per-resource breakdown (argocd app get reference). If you suspect ArgoCD is showing a stale view, force it to recompute against the live cluster and re-render the source:
# Re-evaluate live state against the cached target manifests
argocd app get <app-name> --refresh
# Also bust the rendered-manifest cache (Helm/Kustomize re-render)
argocd app get <app-name> --hard-refresh
--refresh updates the application data without touching the cached target manifests; --hard-refresh refreshes both the application data and the target manifest cache (argocd app get reference). Reach for --hard-refresh whenever a Helm chart or Kustomize overlay changed but ArgoCD still shows the old diff.
Step 2 — classify by reading the two fields in order
Read sync status first, then health status. This ordering matters because an OutOfSync app may not have had the failing change applied yet, so its health is describing the old version of the workload.
| Sync status | Health status | What it means | Where to start |
|---|---|---|---|
| OutOfSync | any | Cluster does not match Git, and ArgoCD has not (or could not) reconcile it | Is auto-sync on? If yes, a sync is failing; see Step 3 |
| Synced | Degraded | Git was applied successfully, but the live resources are not working | Runtime problem, not a GitOps problem; see degraded |
| Synced | Progressing | Applied and still rolling out; may settle on its own | Wait, then re-check; if it never settles, treat as Degraded |
| Synced | Healthy | Matches Git and working | Nothing to fix in ArgoCD |
The single most useful instinct: Synced + Degraded means stop looking at Git. ArgoCD did its job — it applied the manifests — and the resulting Deployment, Service, or PVC is failing its own health check (ArgoCD health docs). Conversely, OutOfSync means the diff is non-empty, so the next command is a diff.
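The table above can be sketched as a small shell function. This is only an illustration of the reading order (sync first, then health), not part of the ArgoCD CLI; the function name triage and its output strings are made up for this example.

```shell
#!/usr/bin/env bash
# triage SYNC_STATUS HEALTH_STATUS   (hypothetical helper, not an argocd command)
# Prints a suggested starting point, following the triage table.
triage() {
  local sync_status=$1
  local health_status=$2

  # Sync is read first: an OutOfSync app may still be running the old
  # workload, so its health says little about the pending change.
  if [ "$sync_status" = "OutOfSync" ]; then
    echo "diff: run 'argocd app diff', then check for a failing sync"
    return 0
  fi

  if [ "$sync_status" != "Synced" ]; then
    echo "unclassified: sync status '$sync_status' is not covered by the table"
    return 1
  fi

  # Synced: the remaining question is whether the applied resources work.
  case "$health_status" in
    Degraded)    echo "runtime: inspect the failing resource, not Git" ;;
    Progressing) echo "wait: re-check, and treat as Degraded if it never settles" ;;
    Healthy)     echo "ok: nothing to fix in ArgoCD" ;;
    *)
      echo "unclassified: health '$health_status' is not covered by the table"
      return 1
      ;;
  esac
}
```

One possible way to feed it, assuming jq is installed: read .status.sync.status and .status.health.status from the output of argocd app get <app-name> -o json and pass them as the two arguments.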
Step 3 — see exactly what differs
argocd app diff <app-name>
argocd app diff renders the difference between the target (Git) and live state. Lines prefixed - are live state, + are desired state; the command returns exit code 1 when a diff is found, 0 when there is none, and 2 on error — which makes it safe to wire into CI gates (argocd app diff reference). Note that Kubernetes Secrets are excluded from this diff, so a Secret will never be the line you see.
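As a rough sketch of such a CI gate, the wrapper below maps the three exit codes described above onto distinct outcomes. The function name diff_gate and its messages are hypothetical; only the argocd app diff invocation and its 0/1/2 exit codes come from the text.

```shell
#!/usr/bin/env bash
# diff_gate APP_NAME   (hypothetical wrapper around `argocd app diff`)
# Exit codes handled: 0 = no diff, 1 = diff found, anything else = error.
diff_gate() {
  local app=$1
  local rc=0

  # Capture the exit code instead of letting a non-zero status abort
  # the calling script when it runs under `set -e`.
  argocd app diff "$app" || rc=$?

  case "$rc" in
    0) echo "in-sync: no diff for $app"; return 0 ;;
    1) echo "drift: $app differs from its target state"; return 1 ;;
    *) echo "error: argocd app diff exited with $rc for $app" >&2; return 2 ;;
  esac
}
```

Keeping "diff found" and "command failed" as separate return values lets a pipeline fail hard on errors while treating drift as a reviewable warning, if that is the policy you want.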
How to read the per-resource health roll-up
Application-level health is the worst health of the application's immediate child resources, ranked from most to least healthy: Healthy, Suspended, Progressing, Missing, Degraded, Unknown. Critically, a resource's health is calculated from information about that resource itself; it is not inherited from its children (ArgoCD health docs). So a Degraded Deployment will pull the whole app to Degraded, but you still have to open that Deployment's own resource tree (argocd app get <app> --output tree) to find the failing pod. Built-in checks worth memorising:
- Deployment / ReplicaSet / StatefulSet / DaemonSet: healthy when observed generation equals desired generation and updated replicas equal desired replicas.
- Service (LoadBalancer) and Ingress: healthy when status.loadBalancer.ingress is non-empty with at least one hostname or IP.
- PersistentVolumeClaim: healthy when status.phase is Bound.
Source: ArgoCD health docs.
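To make the workload rule concrete, here is a minimal model of the two conditions listed for Deployments. It is a simplification for illustration, not ArgoCD's actual health check code; the function name is hypothetical, and the four arguments are assumed to correspond to the Kubernetes fields metadata.generation, status.observedGeneration, spec.replicas, and status.updatedReplicas.

```shell
#!/usr/bin/env bash
# deployment_rolled_out GENERATION OBSERVED_GENERATION DESIRED_REPLICAS UPDATED_REPLICAS
# Hypothetical helper. Returns 0 only when both conditions hold:
#   1. the controller has observed the latest spec generation
#   2. every desired replica has been updated to that spec
deployment_rolled_out() {
  local generation=$1
  local observed=$2
  local desired=$3
  local updated=$4

  [ "$observed" -eq "$generation" ] && [ "$updated" -eq "$desired" ]
}
```

For example, deployment_rolled_out 3 3 2 1 returns non-zero: the controller has seen generation 3, but only one of two replicas has been updated, which is the situation an image pull failure or crash loop typically leaves behind.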
The three common failures: recognise, then fix
The triage above lands you in one of three buckets. Each has a dedicated page with the full remediation — below is only how to recognise you are in that bucket.
OutOfSync that won't reconcile
Recognise it: argocd app get shows OutOfSync, and argocd app diff shows a persistent, often repeating diff — a field flips back every reconcile cycle. The classic cause is a controller or mutating webhook rewriting the object after apply, or template functions like Helm's randAlphaNum generating fresh data each render, both of which ArgoCD lists as standard drift sources (ArgoCD diffing docs). If selfHeal is enabled, you may also see ArgoCD repeatedly re-syncing the same field, because self-heal triggers when live state deviates from Git (automated sync docs).
Full fix (ignoreDifferences, managed-fields filters, self-heal tuning): OutOfSync troubleshooting.
Sync failed: one or more objects failed to apply
Recognise it: the app stays OutOfSync, but this is an operation failure, not just drift — argocd app get surfaces a failed sync operation with an error such as a schema validation error, an admission-webhook rejection, or a PreSync/Sync hook that failed. Sync phases are strict: if a PreSync hook fails the entire sync stops, and a Sync-phase failure marks the sync as failed (sync phases docs). Sync waves compound this — ArgoCD will not progress to a later wave until the current wave's resources are synced and healthy, so a stuck early wave blocks everything behind it (sync waves docs).
Full fix (reading the operation error, hooks, waves, server-side apply): sync-failed troubleshooting.
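The wave-blocking behaviour described above can be pictured with a toy model: given each resource's wave and health, find the lowest wave that is not yet fully healthy, since that is the wave holding everything else back. This is a sketch under simplifying assumptions (it looks at health only, and the input format and function name are invented for the example); it does not query ArgoCD.

```shell
#!/usr/bin/env bash
# first_blocking_wave   (hypothetical helper)
# Reads "WAVE NAME HEALTH" lines on stdin and prints the lowest wave number
# that still contains a resource whose health is not Healthy.
# Prints nothing when every resource is Healthy.
first_blocking_wave() {
  awk '
    $3 != "Healthy" {
      # Track the smallest wave number seen among unhealthy resources.
      if (!found || $1 + 0 < min) { min = $1 + 0; found = 1 }
    }
    END { if (found) print min }
  '
}
```

With input lines such as "0 crd Progressing", "1 api Healthy", and "2 job Degraded", the function reports wave 0: the Degraded job in wave 2 is not the first thing to look at, because wave 0 has not finished.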
Degraded health after a successful sync
Recognise it: argocd app get shows Synced + Degraded. Git matched, the apply succeeded, and a child resource failed its health check — most often a Deployment whose rollout never completes (image pull failure, crash loop, failing readiness probe) so updated replicas never reach desired, or a Service/Ingress whose load balancer never provisions an address (ArgoCD health docs). A genuinely stuck Progressing that never settles is diagnosed the same way and converges here.
Full fix (drilling into the failing resource, probes, events, rollout debugging): degraded troubleshooting.
For the full catalogue of ArgoCD failure pages, see the ArgoCD troubleshooting index.
Prevention and operational principles
- Make selfHeal an explicit decision, not a default. With self-heal enabled, ArgoCD re-syncs when live state deviates from Git, after a default 5-second timeout (automated sync docs). That is excellent for stopping config drift, but it will fight any legitimate out-of-band change and can mask the fact that something keeps mutating your objects. If a field flaps under self-heal, fix the diff source; do not just let it re-apply forever.
- Keep pruning deliberate. Automated pruning is disabled by default; ArgoCD will not delete resources that are no longer in Git unless you opt in with prune: true (automated sync docs). Turning it on without PruneLast or wave ordering is a real blast-radius risk on shared clusters.
- Encode controller-owned fields once, globally. Recurring false-positive drift (HPA reordering spec.metrics, controllers rewriting fields) should be handled with ignoreDifferences or managed-fields filters rather than repeated manual syncs (ArgoCD diffing docs). Do it at the right layer: per-app for one-offs, in argocd-cm for cluster-wide rules.
- Order risky rollouts with sync waves. Because ArgoCD blocks later waves until earlier ones are healthy (sync waves docs), putting CRDs and namespaces in early waves prevents the "resource type not found" class of sync failures.
- Gate merges with argocd app diff --local. Running the diff against local manifests before you commit catches drift and accidental field changes at review time, exactly when they are cheapest to fix (argocd app diff reference).
The deeper operational point: ArgoCD is honestly reporting two different truths about your system, and almost every wasted debugging hour comes from acting on the wrong one. Read sync first, health second, and you turn "the app is red" into a specific, fixable failure in under a minute. When a single application's failure is actually a symptom of a cluster-wide or cross-system problem — a bad admission webhook, an image registry outage, a node-pressure cascade — that is where correlating ArgoCD state against Kubernetes events, CI history, and the originating commit pays off, and where automated, read-only investigation earns its keep.
Sources
- ArgoCD — Resource Health (health states and built-in checks)
- ArgoCD — Diffing / OutOfSync computation and ignoreDifferences
- ArgoCD — argocd app get command reference
- ArgoCD — argocd app diff command reference
- ArgoCD — Automated Sync Policy (selfHeal, prune, defaults)
- ArgoCD — Sync Phases and Waves
By Intellira Engineering. AI-assisted draft, reviewed by the Intellira engineering team; claims cited inline; last verified 2026-06-02.