Airflow Helm Chart (User Community)


The User-Community Airflow Helm Chart is the standard way to deploy Apache Airflow on Kubernetes with Helm.
Originally created in 2017, it has since helped thousands of companies create production-ready deployments of Airflow on Kubernetes.



Project Goals

  1. Ease of Use
  2. Great Documentation
  3. Support for older Airflow Versions
  4. Support for Kubernetes GitOps Tools (like ArgoCD)

Key Features

History

The User-Community Airflow Helm Chart has a long history of being the standard way to deploy Apache Airflow on Kubernetes.

Here is a brief overview of the chart's development from 2017 until today:

  • From October 2017 until December 2018, the chart was called kube-airflow and was developed in gsemet/kube-airflow
  • From December 2018 until November 2020, the chart was called stable/airflow and was developed in helm/charts
  • Since November 2020, the chart has been called Airflow Helm Chart (User Community) and is developed in airflow-helm/charts

Please note, this chart is independent from the official chart in the apache/airflow repo, which was forked from Astronomer's proprietary chart in May 2021.


Guides

Frequently Asked Questions

Examples


Airflow Version Support

The following table lists the airflow versions supported by this chart (set the version with the airflow.image.tag value).

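| Airflow Version ↓ / Chart Version → | 7.0.0 - 7.16.0 | 8.0.0 - 8.5.3 | 8.6.0 | 8.6.1 - 8.7.0 | 8.7.1 | 8.8.0 | 8.9.0+ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1.10.X | ✔️ | ✔️ [1] | ✔️ [1] | ✔️ [1] | ✔️ [1] | ✔️ [1] | ✔️ [1] |
| 2.0.X | | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.1.X | | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.2.X | | ⚠️ [2] | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.3.X | | | | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.4.X | | | | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.5.X | | | | ✔️ | ✔️ | ✔️ | ✔️ |
| 2.6.X | | | | | ✔️ | ✔️ | ✔️ |
| 2.7.X | | | | | | ✔️ | ✔️ |
| 2.8.X | | | | | | ✔️ | ✔️ |
| 2.9.X | | | | | | ❌ | ✔️ |
| 2.10.X | | | | | | ❌ | ⚠️ [3] |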

[1] you must set airflow.legacyCommands = true when using airflow version 1.10.X
[2] the Deferrable Operators & Triggers feature won't work, as there is no airflow triggerer Deployment
[3] airflow version 2.10.1 has a serious issue with git-sync, use 2.10.2 or later
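
As a minimal sketch of the notes above, a values file that pins the airflow version might look like the following. The tag shown is only an illustrative assumption; confirm that it exists for your image and falls within the range supported by your chart version.

```yaml
airflow:
  image:
    # assumption: an illustrative tag, pick one from the supported range above
    tag: 2.8.4-python3.9
  # only set this to true when running airflow 1.10.X (see footnote [1])
  legacyCommands: false
```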

Airflow Executor Support

The following table lists the airflow executors supported by this chart (set with the airflow.executor value).

| Airflow Executor ↓ / Chart Version → | 7.X.X | 8.X.X |
| --- | --- | --- |
| CeleryExecutor | ✔️ | ✔️ |
| KubernetesExecutor | ⚠️ [1] | ✔️ |
| CeleryKubernetesExecutor | | ✔️ |

[1] we encourage you to use chart version 8.X.X, so you can use the airflow.kubernetesPodTemplate.* values (requires airflow 1.10.11+)
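
A hedged sketch of switching to the KubernetesExecutor on chart version 8.X.X is shown below. It assumes that the Celery-specific components (workers, flower, redis) are not needed with this executor and can be turned off through their `enabled` values; verify this against the chart documentation for your version.

```yaml
airflow:
  executor: KubernetesExecutor

# assumption: Celery-only components are disabled when not using a Celery executor
workers:
  enabled: false
flower:
  enabled: false
redis:
  enabled: false
```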

Helm Values

The following is a summary of the helm values provided by this chart (see the full list in the values.yaml file).

airflow.*
Parameter Description Default
airflow.legacyCommands if we use legacy 1.10 airflow commands false
airflow.image.* configs for the airflow container image <see values.yaml>
airflow.executor the airflow executor type to use CeleryExecutor
airflow.fernetKey the fernet encryption key (sets AIRFLOW__CORE__FERNET_KEY) 7T512UXSSmBOkpWimFHIVb8jK6lfmSAvx4mO6Arehnc=
airflow.webserverSecretKey the secret_key for flask (sets AIRFLOW__WEBSERVER__SECRET_KEY) THIS IS UNSAFE!
airflow.config environment variables for airflow configs {}
airflow.users a list of users to create <see values.yaml>
airflow.usersTemplates bash-like templates to be used in airflow.users <see values.yaml>
airflow.usersUpdate if we create a Deployment to perpetually sync airflow.users true
airflow.connections a list of airflow connections to create <see values.yaml>
airflow.connectionsTemplates bash-like templates to be used in airflow.connections <see values.yaml>
airflow.connectionsUpdate if we create a Deployment to perpetually sync airflow.connections true
airflow.variables a list of airflow variables to create <see values.yaml>
airflow.variablesTemplates bash-like templates to be used in airflow.variables <see values.yaml>
airflow.variablesUpdate if we create a Deployment to perpetually sync airflow.variables true
airflow.pools a list of airflow pools to create <see values.yaml>
airflow.poolsUpdate if we create a Deployment to perpetually sync airflow.pools true
airflow.defaultNodeSelector default nodeSelector for airflow Pods (is overridden by pod-specific values) {}
airflow.defaultAffinity default affinity configs for airflow Pods (is overridden by pod-specific values) {}
airflow.defaultTolerations default toleration configs for airflow Pods (is overridden by pod-specific values) []
airflow.defaultTopologySpreadConstraints default topologySpreadConstraints for airflow Pods (is overridden by pod-specific values) []
airflow.defaultSecurityContext default securityContext configs for Pods (is overridden by pod-specific values) {fsGroup: 0}
airflow.defaultContainerSecurityContext default securityContext for Containers in airflow Pods {}
airflow.podAnnotations extra annotations for airflow Pods {}
airflow.extraPipPackages extra pip packages to install in airflow Pods []
airflow.protectedPipPackages pip packages that are protected from upgrade/downgrade by extraPipPackages ["apache-airflow"]
airflow.extraEnv extra environment variables for the airflow Pods []
airflow.extraContainers extra containers for the airflow Pods []
airflow.extraInitContainers extra init-containers for the airflow Pods []
airflow.extraVolumeMounts extra VolumeMounts for the airflow Pods []
airflow.extraVolumes extra Volumes for the airflow Pods []
airflow.clusterDomain kubernetes cluster domain name cluster.local
airflow.initContainers.* airflow init-containers <see values.yaml>
airflow.localSettings.* airflow_local_settings.py <see values.yaml>
airflow.kubernetesPodTemplate.* pod_template.yaml <see values.yaml>
airflow.dbMigrations.* db-migrations Deployment <see values.yaml>
airflow.sync.* Sync Deployments <see values.yaml>
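
The defaults listed for airflow.fernetKey and airflow.webserverSecretKey are public, so they should be overridden in any real deployment. The fragment below is a minimal sketch of doing so alongside airflow.config and airflow.extraPipPackages; the key values are placeholders and the package pin is an illustrative assumption.

```yaml
airflow:
  # assumption: placeholders, generate your own keys and keep them out of version control
  fernetKey: "<your-fernet-key>"
  webserverSecretKey: "<your-secret-key>"

  # environment variables for airflow configs
  config:
    AIRFLOW__CORE__LOAD_EXAMPLES: "False"
    AIRFLOW__WEBSERVER__EXPOSE_CONFIG: "False"

  # assumption: an illustrative package pin
  extraPipPackages:
    - "pandas==2.1.4"
```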

scheduler.*
Parameter Description Default
scheduler.replicas the number of scheduler Pods to run 1
scheduler.resources resource requests/limits for the scheduler Pods {}
scheduler.nodeSelector the nodeSelector configs for the scheduler Pods {}
scheduler.affinity the affinity configs for the scheduler Pods {}
scheduler.tolerations the toleration configs for the scheduler Pods []
scheduler.topologySpreadConstraints the topologySpreadConstraints configs for the scheduler Pods []
scheduler.securityContext the security context for the scheduler Pods {}
scheduler.labels labels for the scheduler Deployment {}
scheduler.podLabels Pod labels for the scheduler Deployment {}
scheduler.annotations annotations for the scheduler Deployment {}
scheduler.podAnnotations Pod annotations for the scheduler Deployment {}
scheduler.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
scheduler.podDisruptionBudget.* configs for the PodDisruptionBudget of the scheduler <see values.yaml>
scheduler.logCleanup.* configs for the log-cleanup sidecar of the scheduler <see values.yaml>
scheduler.numRuns the value of the airflow --num_runs parameter used to run the airflow scheduler -1
scheduler.livenessProbe.* configs for the scheduler Pods' liveness probe <see values.yaml>
scheduler.extraPipPackages extra pip packages to install in the scheduler Pods []
scheduler.extraContainers extra containers for the scheduler Pods []
scheduler.extraInitContainers extra init-containers for the scheduler Pods []
scheduler.extraVolumeMounts extra VolumeMounts for the scheduler Pods []
scheduler.extraVolumes extra Volumes for the scheduler Pods []
web.*
Parameter Description Default
web.webserverConfig.* configs to generate webserver_config.py <see values.yaml>
web.replicas the number of web Pods to run 1
web.resources resource requests/limits for the airflow web pods {}
web.nodeSelector the nodeSelector configs for the web Pods {}
web.affinity the affinity configs for the web Pods {}
web.tolerations the toleration configs for the web Pods []
web.topologySpreadConstraints the topologySpreadConstraints configs for the web Pods []
web.securityContext the security context for the web Pods {}
web.labels labels for the web Deployment {}
web.podLabels Pod labels for the web Deployment {}
web.annotations annotations for the web Deployment {}
web.podAnnotations Pod annotations for the web Deployment {}
web.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
web.podDisruptionBudget.* configs for the PodDisruptionBudget of the web Deployment <see values.yaml>
web.service.* configs for the Service of the web pods <see values.yaml>
web.readinessProbe.* configs for the web Pods' readiness probe <see values.yaml>
web.livenessProbe.* configs for the web Pods' liveness probe <see values.yaml>
web.extraPipPackages extra pip packages to install in the web Pods []
web.extraContainers extra containers for the web Pods []
web.extraInitContainers extra init-containers for the web Pods []
web.extraVolumeMounts extra VolumeMounts for the web Pods []
web.extraVolumes extra Volumes for the web Pods []
workers.*
Parameter Description Default
workers.enabled if the airflow workers StatefulSet should be deployed true
workers.replicas the number of workers Pods to run 1
workers.resources resource requests/limits for the airflow worker Pods {}
workers.nodeSelector the nodeSelector configs for the worker Pods {}
workers.affinity the affinity configs for the worker Pods {}
workers.tolerations the toleration configs for the worker Pods []
workers.topologySpreadConstraints the topologySpreadConstraints configs for the worker Pods []
workers.securityContext the security context for the worker Pods {}
workers.labels labels for the worker StatefulSet {}
workers.podLabels Pod labels for the worker StatefulSet {}
workers.annotations annotations for the worker StatefulSet {}
workers.podAnnotations Pod annotations for the worker StatefulSet {}
workers.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
workers.podDisruptionBudget.* configs for the PodDisruptionBudget of the worker StatefulSet <see values.yaml>
workers.autoscaling.* configs for the HorizontalPodAutoscaler of the worker Pods <see values.yaml>
workers.celery.* configs for the celery worker Pods <see values.yaml>
workers.terminationPeriod how many seconds to wait after SIGTERM before SIGKILL of the celery worker 60
workers.logCleanup.* configs for the log-cleanup sidecar of the worker Pods <see values.yaml>
workers.livenessProbe.* configs for the worker Pods' liveness probe <see values.yaml>
workers.extraPipPackages extra pip packages to install in the worker Pods []
workers.extraContainers extra containers for the worker Pods []
workers.extraInitContainers extra init-containers for the worker Pods []
workers.extraVolumeMounts extra VolumeMounts for the worker Pods []
workers.extraVolumes extra Volumes for the worker Pods []
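
For illustration, the worker values above could be combined as follows. The replica count, resource figures, and termination period are assumptions chosen for the sketch, not recommendations from the chart.

```yaml
workers:
  enabled: true
  replicas: 3
  # assumption: illustrative resource requests
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
  # seconds to wait after SIGTERM before SIGKILL of the celery worker
  terminationPeriod: 300
```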
triggerer.*
Parameter Description Default
triggerer.enabled if the triggerer should be deployed true
triggerer.replicas the number of triggerer Pods to run 1
triggerer.resources resource requests/limits for the airflow triggerer Pods {}
triggerer.nodeSelector the nodeSelector configs for the triggerer Pods {}
triggerer.affinity the affinity configs for the triggerer Pods {}
triggerer.tolerations the toleration configs for the triggerer Pods []
triggerer.topologySpreadConstraints the topologySpreadConstraints configs for the triggerer Pods []
triggerer.securityContext the security context for the triggerer Pods {}
triggerer.labels labels for the triggerer Deployment {}
triggerer.podLabels Pod labels for the triggerer Deployment {}
triggerer.annotations annotations for the triggerer Deployment {}
triggerer.podAnnotations Pod annotations for the triggerer Deployment {}
triggerer.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
triggerer.podDisruptionBudget.* configs for the PodDisruptionBudget of the triggerer Deployment <see values.yaml>
triggerer.capacity maximum number of triggers each triggerer will run at once (sets AIRFLOW__TRIGGERER__DEFAULT_CAPACITY) 1000
triggerer.livenessProbe.* configs for the triggerer Pods' liveness probe <see values.yaml>
triggerer.extraPipPackages extra pip packages to install in the triggerer Pods []
triggerer.extraContainers extra containers for the triggerer Pods []
triggerer.extraInitContainers extra init-containers for the triggerer Pods []
triggerer.extraVolumeMounts extra VolumeMounts for the triggerer Pods []
triggerer.extraVolumes extra Volumes for the triggerer Pods []
flower.*
Parameter Description Default
flower.enabled if the Flower UI should be deployed true
flower.resources resource requests/limits for the flower Pods {}
flower.nodeSelector the nodeSelector configs for the flower Pods {}
flower.affinity the affinity configs for the flower Pods {}
flower.tolerations the toleration configs for the flower Pods []
flower.topologySpreadConstraints the topologySpreadConstraints configs for the flower Pods []
flower.securityContext the security context for the flower Pods {}
flower.labels labels for the flower Deployment {}
flower.podLabels Pod labels for the flower Deployment {}
flower.annotations annotations for the flower Deployment {}
flower.podAnnotations Pod annotations for the flower Deployment {}
flower.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
flower.podDisruptionBudget.* configs for the PodDisruptionBudget of the flower Deployment <see values.yaml>
flower.basicAuthSecret the name of a pre-created secret containing the basic authentication value for flower ""
flower.basicAuthSecretKey the key within flower.basicAuthSecret containing the basic authentication string ""
flower.service.* configs for the Service of the flower Pods <see values.yaml>
flower.extraPipPackages extra pip packages to install in the flower Pods []
flower.extraContainers extra containers for the flower Pods []
flower.extraInitContainers extra init-containers for the flower Pods []
flower.extraVolumeMounts extra VolumeMounts for the flower Pods []
flower.extraVolumes extra Volumes for the flower Pods []
logs.*
Parameter Description Default
logs.path the airflow logs folder /opt/airflow/logs
logs.persistence.* configs for the logs PVC <see values.yaml>
dags.*
Parameter Description Default
dags.path the airflow dags folder /opt/airflow/dags
dags.persistence.* configs for the dags PVC <see values.yaml>
dags.gitSync.* configs for the git-sync sidecar <see values.yaml>
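
A rough sketch of loading DAGs with the git-sync sidecar is shown below. The sub-keys under dags.gitSync and the repository URL are assumptions made for illustration; the table above only lists dags.gitSync.*, so confirm the exact names in values.yaml.

```yaml
dags:
  path: /opt/airflow/dags
  gitSync:
    # assumption: sub-key names are illustrative, confirm them in values.yaml
    enabled: true
    repo: "https://github.com/example-org/example-dags.git"
    branch: "main"
```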
ingress.*
Parameter Description Default
ingress.enabled if we should deploy Ingress resources false
ingress.apiVersion the apiVersion to use for Ingress resources networking.k8s.io/v1
ingress.web.* configs for the Ingress of the web Service <see values.yaml>
ingress.flower.* configs for the Ingress of the flower Service <see values.yaml>
rbac.*
Parameter Description Default
rbac.create if Kubernetes RBAC resources are created true
rbac.events if the created RBAC Role has GET/LIST on Event resources true
rbac.secrets if the created RBAC Role has GET/LIST/WATCH on Secret resources false
serviceAccount.*
Parameter Description Default
serviceAccount.create if a Kubernetes ServiceAccount is created true
serviceAccount.name the name of the ServiceAccount ""
serviceAccount.annotations annotations for the ServiceAccount {}
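
As a hedged example, the rbac.* and serviceAccount.* values can be combined to create a named ServiceAccount with a cloud IAM annotation. The account name, the annotation, and the role ARN below are placeholders chosen for this sketch.

```yaml
rbac:
  create: true
  events: true

serviceAccount:
  create: true
  # assumption: hypothetical ServiceAccount name
  name: "airflow"
  annotations:
    # assumption: example annotation with a placeholder role ARN
    eks.amazonaws.com/role-arn: "arn:aws:iam::111111111111:role/example-airflow-role"
```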
extraManifests
Parameter Description Default
extraManifests a list of extra Kubernetes manifests that will be deployed alongside the chart []
pgbouncer.*
Parameter Description Default
pgbouncer.enabled if the pgbouncer Deployment is created true
pgbouncer.image.* configs for the pgbouncer container image <see values.yaml>
pgbouncer.resources resource requests/limits for the pgbouncer Pods {}
pgbouncer.nodeSelector the nodeSelector configs for the pgbouncer Pods {}
pgbouncer.affinity the affinity configs for the pgbouncer Pods {}
pgbouncer.tolerations the toleration configs for the pgbouncer Pods []
pgbouncer.topologySpreadConstraints the topologySpreadConstraints configs for the pgbouncer Pods []
pgbouncer.securityContext the security context for the pgbouncer Pods {}
pgbouncer.labels labels for the pgbouncer Deployment {}
pgbouncer.podLabels Pod labels for the pgbouncer Deployment {}
pgbouncer.annotations annotations for the pgbouncer Deployment {}
pgbouncer.podAnnotations Pod annotations for the pgbouncer Deployment {}
pgbouncer.safeToEvict if we add the annotation: "cluster-autoscaler.kubernetes.io/safe-to-evict" = "true" true
pgbouncer.podDisruptionBudget.* configs for the PodDisruptionBudget of the pgbouncer <see values.yaml>
pgbouncer.livenessProbe.* configs for the pgbouncer Pods' liveness probe <see values.yaml>
pgbouncer.startupProbe.* configs for the pgbouncer Pods' startup probe <see values.yaml>
pgbouncer.terminationGracePeriodSeconds the maximum number of seconds to wait for queries upon pod termination, before force killing 120
pgbouncer.authType sets pgbouncer config: auth_type md5
pgbouncer.maxClientConnections sets pgbouncer config: max_client_conn 1000
pgbouncer.poolSize sets pgbouncer config: default_pool_size 20
pgbouncer.logDisconnections sets pgbouncer config: log_disconnections 0
pgbouncer.logConnections sets pgbouncer config: log_connections 0
pgbouncer.statsUsers sets pgbouncer config: stats_users ""
pgbouncer.clientSSL.* ssl configs for: clients -> pgbouncer <see values.yaml>
pgbouncer.serverSSL.* ssl configs for: pgbouncer -> postgres <see values.yaml>
postgresql.*
Parameter Description Default
postgresql.enabled if the stable/postgresql chart is used true
postgresql.image.* configs for the postgres container image <see values.yaml>
postgresql.postgresqlDatabase the postgres database to use airflow
postgresql.postgresqlUsername the postgres user to create postgres
postgresql.postgresqlPassword the postgres user's password airflow
postgresql.existingSecret the name of a pre-created secret containing the postgres password ""
postgresql.existingSecretKey the key within postgresql.existingSecret containing the password string postgresql-password
postgresql.persistence.* configs for the PVC of postgresql <see values.yaml>
postgresql.master.* configs for the postgres StatefulSet <see values.yaml>
externalDatabase.*
Parameter Description Default
externalDatabase.type the type of external database postgres
externalDatabase.host the host of the external database localhost
externalDatabase.port the port of the external database 5432
externalDatabase.database the database/schema to use within the external database airflow
externalDatabase.user the username for the external database airflow
externalDatabase.userSecret the name of a pre-created secret containing the external database user ""
externalDatabase.userSecretKey the key within externalDatabase.userSecret containing the user string postgresql-user
externalDatabase.password the password for the external database ""
externalDatabase.passwordSecret the name of a pre-created secret containing the external database password ""
externalDatabase.passwordSecretKey the key within externalDatabase.passwordSecret containing the password string postgresql-password
externalDatabase.properties extra connection-string properties for the external database ""
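
The following is a minimal sketch of pointing the chart at an external Postgres database instead of the embedded one. The hostname and the Secret name are hypothetical, and the Secret is assumed to have been created beforehand.

```yaml
postgresql:
  # turn off the embedded postgres when using an external database
  enabled: false

externalDatabase:
  type: postgres
  # assumption: hypothetical hostname
  host: "postgres.example.com"
  port: 5432
  database: airflow
  user: airflow
  # assumption: a pre-created Secret named "airflow-db-credentials"
  passwordSecret: "airflow-db-credentials"
  passwordSecretKey: "postgresql-password"
```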
redis.*
Parameter Description Default
redis.enabled if the stable/redis chart is used true
redis.image.* configs for the redis container image <see values.yaml>
redis.password the redis password airflow
redis.existingSecret the name of a pre-created secret containing the redis password ""
redis.existingSecretPasswordKey the key within redis.existingSecret containing the password string redis-password
redis.cluster.* configs for redis cluster mode <see values.yaml>
redis.master.* configs for the redis master StatefulSet <see values.yaml>
redis.slave.* configs for the redis slave StatefulSet <see values.yaml>
externalRedis.*
Parameter Description Default
externalRedis.host the host of the external redis localhost
externalRedis.port the port of the external redis 6379
externalRedis.databaseNumber the database number to use within the external redis 1
externalRedis.password the password for the external redis ""
externalRedis.passwordSecret the name of a pre-created secret containing the external redis password ""
externalRedis.passwordSecretKey the key within externalRedis.passwordSecret containing the password string redis-password
externalRedis.properties extra connection-string properties for the external redis ""
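
Similarly, a sketch of using an external redis for the Celery executors might look like this. The hostname and the Secret name are hypothetical, and the Secret is assumed to exist already.

```yaml
redis:
  # turn off the embedded redis when using an external one
  enabled: false

externalRedis:
  # assumption: hypothetical hostname
  host: "redis.example.com"
  port: 6379
  databaseNumber: 1
  # assumption: a pre-created Secret named "airflow-redis-credentials"
  passwordSecret: "airflow-redis-credentials"
  passwordSecretKey: "redis-password"
```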
serviceMonitor.*
Parameter Description Default
serviceMonitor.enabled if ServiceMonitor resources should be deployed false
serviceMonitor.selector labels for ServiceMonitor, so that Prometheus can select it { prometheus: "kube-prometheus" }
serviceMonitor.path the ServiceMonitor web endpoint path /admin/metrics
serviceMonitor.interval the ServiceMonitor web endpoint scrape interval 30s
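
A short sketch of enabling the ServiceMonitor with the defaults from the table above is shown below. It assumes the Prometheus Operator CRDs are already installed in the cluster and that your Prometheus instance selects on the label shown.

```yaml
serviceMonitor:
  enabled: true
  # assumption: this label matches the selector of your Prometheus instance
  selector:
    prometheus: "kube-prometheus"
  path: /admin/metrics
  interval: "30s"
```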
prometheusRule.*
Parameter Description Default
prometheusRule.enabled if the PrometheusRule resources should be deployed false
prometheusRule.additionalLabels labels for PrometheusRule, so that Prometheus can select it {}
prometheusRule.groups alerting rules for Prometheus []