Self-Service Infrastructure with Argo CD ApplicationSets
January 29, 2024 · 6 min read · by Muhammad Amal programming

TL;DR: ApplicationSets turn Argo CD from a per-service tool into a fleet tool. Generators discover services from Git, lists, or clusters, then fan out Applications automatically. It is the right pattern for platforms with more than ~50 services. Take care with progressive syncs (the RollingSync strategy) and template merging.

We close out Platform Engineering month with the layer underneath all the abstractions we’ve been building. The Backstage scaffolder, the golden path templates, and the Score generators all eventually produce Kubernetes manifests that need to land in a cluster. Argo CD is the most common landing zone in 2024. ApplicationSets is what turns it into a self-service story.

If you’ve used Argo CD for fewer than ~20 services, you’ve probably written Application resources by hand. Past about 50, that doesn’t scale. ApplicationSets is the multiplexer.

What ApplicationSets actually does

An Application is the Argo CD unit of work — it points at a Git repo and a target cluster, and Argo CD syncs the manifests there into the cluster. One per service per environment, typically.

An ApplicationSet is a controller that generates Application resources based on a source of truth. The source could be a Git directory (one folder per service), a list of names, a Pull Request, or a list of clusters. When the source changes, the controller adds or removes Applications.

# ApplicationSet controller as bundled with Argo CD 2.9 (it has shipped inside Argo CD since 2.3)
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-services
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/gitops
        revision: HEAD
        directories:
          - path: apps/*
  template:
    metadata:
      name: '{{path.basename}}'
      labels:
        platform.acme.io/managed: 'true'
    spec:
      project: default
      source:
        repoURL: https://github.com/acme/gitops
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
        syncOptions:
          - CreateNamespace=true

The result: every directory under apps/ in the GitOps repo automatically gets an Application. A developer (or scaffolder template) drops a new directory with deployment.yaml and friends, and Argo CD picks it up on the next Git poll (every three minutes by default), or within seconds if a webhook is configured. No editing of the ApplicationSet. No PR to the Argo CD config.
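To make the fan-out concrete, here is roughly the Application the controller would generate from the template above for a directory named apps/orders-api. Treat it as a sketch: orders-api is just an example directory name, and the real object also carries controller-managed metadata (such as an owner reference back to the ApplicationSet) that is omitted here.

```yaml
# Sketch of the generated Application for apps/orders-api.
# Controller-managed metadata (ownerReferences, etc.) is left out.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orders-api                  # from '{{path.basename}}'
  namespace: argocd
  labels:
    platform.acme.io/managed: 'true'
spec:
  project: default
  source:
    repoURL: https://github.com/acme/gitops
    targetRevision: HEAD
    path: apps/orders-api           # from '{{path}}'
  destination:
    server: https://kubernetes.default.svc
    namespace: orders-api           # from '{{path.basename}}'
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```

Nobody writes or reviews this object by hand. Deleting the apps/orders-api directory removes it again.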

This is the self-service piece.

The four generators that matter

Argo CD 2.9 has more generators than you’ll use. The ones I keep reaching for:

Git Directory — walks a path in a repo and emits one Application per matching directory. Best for “every service has a folder.”

Git File — emits one Application per file matching a pattern. The file’s content is exposed as template vars. Useful when you want metadata alongside the source path.

Cluster — emits one Application per registered cluster. The basis for multi-cluster fan-out.

Matrix — composes two generators by cross-product. The combination “every service × every environment cluster” gives you fleet management in one resource.
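As an illustration of the Git File generator, the sketch below assumes each service directory carries a small config.json whose keys become template variables. The file name, the team and tier keys, and the label names are all assumptions made up for this example, not conventions from the repo above.

```yaml
# Hypothetical apps/orders-api/config.json read by the generator:
#   {"team": "orders", "tier": "backend"}
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-services-with-metadata
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/gitops
        revision: HEAD
        files:
          - path: apps/*/config.json   # one Application per matching file
  template:
    metadata:
      name: '{{path.basename}}'        # directory that contains the file
      labels:
        platform.acme.io/team: '{{team}}'   # key from config.json
        platform.acme.io/tier: '{{tier}}'   # key from config.json
    spec:
      project: default
      source:
        repoURL: https://github.com/acme/gitops
        targetRevision: HEAD
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
```

Compared with the directory generator, the trade-off is one extra file per service in exchange for per-service metadata that the template can use.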

Matrix is the one that makes ApplicationSets earn its keep at scale:

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: fleet-services
  namespace: argocd
spec:
  generators:
    - matrix:
        generators:
          - git:
              repoURL: https://github.com/acme/gitops
              revision: HEAD
              directories:
                - path: apps/*
          - clusters:
              selector:
                matchLabels:
                  environment: prod
  template:
    metadata:
      name: '{{path.basename}}-{{name}}'
    spec:
      project: default   # required on every Application
      destination:
        server: '{{server}}'
        namespace: '{{path.basename}}'
      source:
        repoURL: https://github.com/acme/gitops
        path: '{{path}}'
        targetRevision: HEAD

If you have 50 services and 3 prod regions, that’s 150 Applications generated from one ApplicationSet. The number rises and falls with directories and cluster registrations. The platform team writes the policy once.
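The cluster generator selects on labels carried by Argo CD's cluster Secrets, so "registering a prod region" amounts to a Secret like the sketch below. The cluster name, the server URL, and the trimmed-down config are placeholders, and real credentials are deliberately left out.

```yaml
# Sketch of a cluster Secret the selector `environment: prod` would match.
# Name, server URL and config are placeholders; credentials are omitted.
apiVersion: v1
kind: Secret
metadata:
  name: cluster-prod-eu-west-1
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # marks this Secret as a cluster
    environment: prod                         # matched by the generator's selector
type: Opaque
stringData:
  name: prod-eu-west-1                        # exposed to templates as '{{name}}'
  server: https://prod-eu-west-1.example.com  # exposed to templates as '{{server}}'
  config: |
    {"tlsClientConfig": {"insecure": false}}
```

Three Secrets like this one, combined with 50 directories under apps/, give the 150 Applications described above. Removing the environment label from one Secret takes 50 of them away.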

The cluster-and-tenant pattern

A pattern that’s become standard in 2024 for multi-tenant platforms:

  • One Argo CD instance per logical environment (dev, staging, prod). Or one Argo CD per region.
  • Tenants (teams) live as namespaces with their own RBAC.
  • Each team has its own GitOps directory.
  • One ApplicationSet per tenant scopes their Applications to their namespace.

apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: tenant-orders
  namespace: argocd
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/gitops-orders
        revision: HEAD
        directories:
          - path: '*'
  template:
    metadata:
      name: 'orders-{{path.basename}}'
    spec:
      project: orders
      destination:
        server: https://kubernetes.default.svc
        namespace: 'orders-{{path.basename}}'
      source:
        repoURL: https://github.com/acme/gitops-orders
        path: '{{path}}'
        targetRevision: HEAD
      syncPolicy:
        automated:
          prune: true
          selfHeal: true

The project: orders reference is doing security work. AppProjects in Argo CD let you whitelist source repos, destination clusters and namespaces, and allowed resource kinds. The orders team can deploy to orders-* namespaces and nowhere else.
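A minimal sketch of what that orders AppProject could look like follows. The description and the specific blocked resource kinds are illustrative assumptions, so adjust them to whatever your tenants should not be able to change.

```yaml
# Sketch of the AppProject referenced by `project: orders`.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: orders
  namespace: argocd
spec:
  description: Orders team workloads
  # Only the team's own GitOps repo is an allowed source.
  sourceRepos:
    - https://github.com/acme/gitops-orders
  # Only orders-* namespaces on the local cluster are allowed destinations.
  destinations:
    - server: https://kubernetes.default.svc
      namespace: 'orders-*'
  # No cluster-scoped kinds are allowed (nothing is whitelisted).
  clusterResourceWhitelist: []
  # Namespaced kinds the tenant may not manage (illustrative choice).
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota
    - group: ''
      kind: LimitRange
```

With this in place, an Application in the orders project that points at another repo or another namespace is rejected instead of synced.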

Progressive sync, carefully

Argo CD 2.6 added ApplicationSet progressive syncs as an alpha feature that has to be enabled explicitly on the ApplicationSet controller. The motivation: when you have 50 Applications generated from one ApplicationSet and you change the template, syncing them all at once is reckless.

spec:
  strategy:
    type: RollingSync
    rollingSync:
      steps:
        - matchExpressions:
            - key: env
              operator: In
              values: [dev]
        - matchExpressions:
            - key: env
              operator: In
              values: [staging]
        - matchExpressions:
            - key: env
              operator: In
              values: [prod]

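When the ApplicationSet template changes, Argo CD syncs dev Applications first, then staging, then prod. Each step selects Applications by label, so the template has to stamp an env label on every Application it generates. The next step only starts once every Application in the current step is healthy, so a failing step pauses the rollout.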
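The steps select Applications by label, so the label has to come from the template. The sketch below shows one way to wire that up with a list generator. The cluster names, server URLs, and overlay paths are hypothetical, and maxUpdate on the prod step is an optional extra that limits how many matching Applications update at once.

```yaml
# Sketch: RollingSync steps matched against an `env` label set by the template.
# Cluster names, server URLs and overlay paths are hypothetical.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: orders-api-rollout
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          - cluster: dev
            env: dev
            server: https://dev.example.com
          - cluster: staging
            env: staging
            server: https://staging.example.com
          - cluster: prod-eu
            env: prod
            server: https://prod-eu.example.com
          - cluster: prod-us
            env: prod
            server: https://prod-us.example.com
  strategy:
    type: RollingSync
    rollingSync:
      steps:
        - matchExpressions:
            - key: env
              operator: In
              values: [dev]
        - matchExpressions:
            - key: env
              operator: In
              values: [staging]
        - matchExpressions:
            - key: env
              operator: In
              values: [prod]
          maxUpdate: 1          # update one prod Application at a time
  template:
    metadata:
      name: 'orders-api-{{cluster}}'
      labels:
        env: '{{env}}'          # the label the steps above match on
    spec:
      project: default
      source:
        repoURL: https://github.com/acme/gitops
        targetRevision: HEAD
        path: 'apps/orders-api/overlays/{{env}}'
      destination:
        server: '{{server}}'
        namespace: orders-api
```

An Application whose labels match no step is not covered by the rollout order, so it is worth checking that every generated Application carries the label.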

In practice this is essential for any ApplicationSet that generates more than a handful of Applications. Without it, you’ll eventually push a bad template change that takes down everything at once.
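Because progressive syncs are opt-in, the strategy block has no effect until the ApplicationSet controller is told to honor it. One way to do that, as I understand the Argo CD docs, is a key in the argocd-cmd-params-cm ConfigMap. Merge the key into your existing ConfigMap rather than applying this sketch wholesale, and restart the ApplicationSet controller afterwards.

```yaml
# Sketch: opt the ApplicationSet controller in to progressive syncs.
# Merge this key into the existing ConfigMap; do not replace other entries.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  applicationsetcontroller.enable.progressive.syncs: "true"
```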

Wiring it into the platform

The full self-service flow:

  1. Developer hits the Backstage scaffolder, generates a new service.
  2. Scaffolder creates the service repo, plus a PR to the GitOps repo adding apps/orders-api/.
  3. The PR is auto-approved (or human-reviewed depending on env).
  4. ApplicationSet’s Git generator picks up the new directory on the next poll, or within seconds if a webhook is configured.
  5. Argo CD provisions the Application, creates the namespace, syncs manifests.
  6. Backstage Kubernetes plugin sees the new workloads via labels and surfaces them in the catalog.
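How fast step 4 happens depends on how the Git generator learns about the merge. Without a webhook it polls, every three minutes by default, and the interval can be tuned per generator. The fragment below is a sketch, and the 60-second value is an arbitrary example rather than a recommendation.

```yaml
# Fragment: shorten the Git generator's polling interval.
# 60 is an arbitrary example value; the default is 180 seconds.
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/gitops
        revision: HEAD
        requeueAfterSeconds: 60
        directories:
          - path: apps/*
```

Shorter intervals mean more Git traffic from the controller, so a webhook is usually the better route once the repo is busy.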

The platform team’s only ongoing work is maintaining the ApplicationSet definitions and the AppProject RBAC. Every new service flows through without manual ops intervention.

flowchart LR
    Dev[Developer] -->|"scaffolder"| BG[Backstage]
    BG -->|"PR"| GO[GitOps Repo]
    GO -->|"webhook + poll"| AS[ApplicationSet]
    AS -->|"creates"| AP[Application]
    AP -->|"sync"| K8s[Kubernetes Cluster]
    K8s -->|"labels"| BG

That loop closes the self-service story.

Common Pitfalls

  • Skipping the AppProject step. A namespace-scoped ApplicationSet without a tight AppProject means a team could theoretically deploy across cluster boundaries. The default project is too permissive.
  • Wildcards in the source repo. sourceRepos: ['*'] in an AppProject is convenient and dangerous. Pin to specific repos.
  • Sync-without-prune in production. If you set prune: false for safety, orphaned resources will accumulate. Better to keep prune on and opt sensitive resources out individually with the argocd.argoproj.io/sync-options: Prune=false annotation.
  • automated.selfHeal: true on resources you manually patch. Argo CD will revert your manual fix. If you need to ops-patch live, temporarily disable selfHeal or, better, fix the Git source.
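  • Trusting Git generators with no path filter. If your glob catches a directory that isn’t a service, such as apps/.github/ or a shared-config folder, it’ll try to make an Application out of nothing. Plain files like apps/README.md are not at risk from the directory generator, which only matches directories, but a loose files pattern can pick them up. Always filter the glob and exclude known non-service paths.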
  • No progressive sync on production ApplicationSets. A bad template change will cascade. Always wire RollingSync for anything generating more than 10 Applications.
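Two of the pitfalls above have short fixes in YAML. The first document is a generator fragment that excludes non-service directories from the glob. The second shows a per-resource opt-out from pruning. The apps/_shared path and the PersistentVolumeClaim are hypothetical examples chosen for illustration.

```yaml
# 1) Fragment: keep non-service directories out of the generator.
#    apps/_shared is a hypothetical shared-config directory.
spec:
  generators:
    - git:
        repoURL: https://github.com/acme/gitops
        revision: HEAD
        directories:
          - path: apps/*
          - path: apps/.github
            exclude: true
          - path: apps/_shared
            exclude: true
---
# 2) Keep automated pruning on, but protect one sensitive resource from it.
#    The PVC itself is a hypothetical example.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orders-db-data
  annotations:
    argocd.argoproj.io/sync-options: Prune=false
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

If the annotated resource is later removed from Git, Argo CD leaves it in the cluster and the Application shows as out of sync until someone deals with it deliberately.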

The pitfall that once cost a team a Saturday on my watch: an ApplicationSet template included automated.selfHeal: true, and an engineer was manually editing a ConfigMap during an incident to debug. Argo kept reverting the edit. Eventually we figured it out, but the wasted hours were real. selfHeal on, manual edits off. Make sure your team knows.

Wrapping Up

ApplicationSets is what makes Argo CD a platform tool rather than a team tool. Combined with the rest of the month’s stack — Backstage for the portal, golden path templates, Crossplane or Terraform for infra, Score for workload specs — you have a working IDP. The pieces aren’t novel individually. The composition is what’s mature in 2024.

For deeper reading on the generators and strategies, the Argo CD ApplicationSets docs cover every option. This wraps up Month 1’s platform engineering arc. February shifts to RAG and AI applications — different stack, similar discipline of building real products out of new primitives.