Add git sync option and unit tests for the Helm chart#9371
Add git sync option and unit tests for the Helm chart#9371dimberman merged 19 commits intoapache:masterfrom
Conversation
d2162f6 to
51fb039
Compare
There was a problem hiding this comment.
dag sync won't be needed on the webserver if dag serialization is enabled
chart/values.yaml
Outdated
There was a problem hiding this comment.
We should put this under a kubernetesExecutor section in case we need to expand to further k8sexecutor-specific features
There was a problem hiding this comment.
I initially thought the same, but the git sync sidecars might also be used without the KubernetesExecutor ?
There was a problem hiding this comment.
Ok yeah I can see that. would just need to set a relatively fast cycle for pinging/re-pulling the repo s.t. the worker stays in sync with the scheduler/webserver.
There was a problem hiding this comment.
updated the sync interval to a default of 1 minute, have added in a PVC as well
chart/values.yaml
Outdated
There was a problem hiding this comment.
Please change this to the apache project.
There was a problem hiding this comment.
done, updated it to use dags from https://github.com/apache/airflow/tree/v1-10-stable/tests/dags
8b13e51 to
5d60c26
Compare
c317639 to
d2abd4f
Compare
scripts/ci/ci_run_helm_testing.sh
Outdated
There was a problem hiding this comment.
need to change this image
There was a problem hiding this comment.
I created an issue: #9401 - we have a few more "private" images that we depend on, we should replace them with our own copy of those images. For now it can stay like it is .
chart/values.yaml
Outdated
There was a problem hiding this comment.
I will change it later to a variable. But for now iI think it's good. I plan to use the Helm chart not only for unit tests (which are great BTW) but also to run our Kubernetes tests (we are currently using some custom, simple templates for the kubernetes resources - replacing it with the helm chart from master and using images built locally will be a great way to do doggfooding.
There was a problem hiding this comment.
cool, that would be awesome 👍
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
1f54704 to
5b36045
Compare
|
@dimberman @ashb @potiuk - the tests are now passing, can you help in getting this merged please |
potiuk
left a comment
There was a problem hiding this comment.
Great Change! I was thinking - shoudl not we also update the tests in test_kubernetes? we already have git_sync mode for those tests but it is disable - I am happy to help in switching the git_sync tests on. We can do it in a separate change but it actually belongs here.
@dimberman WDYT?
| ## Updating DAGs | ||
|
|
||
| The recommended way to update your DAGs with this chart is to build a new docker image with the latest code (`docker build -t my-company/airflow:8a0da78 .`), push it to an accessible registry (`docker push my-company/airflow:8a0da78`), then update the Airflow pods with that image: | ||
| The recommended way to update your DAGs with this chart is to build a new docker image with the latest DAG code (`docker build -t my-company/airflow:8a0da78 .`), push it to an accessible registry (`docker push my-company/airflow:8a0da78`), then update the Airflow pods with that image: |
There was a problem hiding this comment.
I think we will want to change that description as it is not valid any more and we have "EMBED_DAGS" directive while building the dockerfiles + we will add on-build most likely to add the DAGs on building deppendent image but I will do it separately.
| # Refer values.yaml for details | ||
| ``` | ||
|
|
||
| ## Mounting DAGS using Git-Sync side car without Persistence |
| @@ -1,6 +0,0 @@ | |||
| dependencies: | |||
| @@ -0,0 +1,64 @@ | |||
| # Licensed to the Apache Software Foundation (ASF) under one | |||
There was a problem hiding this comment.
I love the unit tests of helm chart! ❤️
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
|
Where is this chart hosted? Is it related to https://hub.helm.sh/charts/stable/airflow? |
|
@friedhardware, this chart is only available from this repo for now and it is the official one, not related to the current one on Helm.sh |
| {{- if .Values.dags.persistence.enabled }} | ||
| - name: dags | ||
| persistentVolumeClaim: | ||
| claimName: {{ .Release.Name }}-dags |
There was a problem hiding this comment.
If uses dags.persistence.existingClaim, it should be modified to
claimName: {{ .Values.dags.persistence.existingClaim }}.
Currently, the webserver pod cannot find pvc, so it is put in a pending state.
|
I got an issue yesterday with git-sync and the K8s worker pods. Nonetheless, git-sync along with the webserver and scheduler of Airflow works since the env var AIRFLOW__KUBERNETES__RUN_AS_USER="50000" isn't set which is not the case in the K8S worker pods. To fix this, we have to define Also, we have to set the UID to 50000 otherwise Airflow fails with the following error: |
yes it might be failing as the user 50000 doesn't exist in |
* add git sync sidecars * add a helm test * add more tests * allow users to provide git username and pass via a k8s secrets * set default values for airflow worker repository & tag * change ci timeout * fix link * add credentials_secret to airflow.cfg configmap * set GIT_SYNC_ADD_USER on kubernetes worker pods, set uid * add fsGroup to webserver and kubernete workers * move gitSync to dags.gitSync * rename valueFields * turn off git sync and dag persistence by default * provide option to specify known_hosts * add git-sync details into the chart documentation * Update .gitignore Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * make git sync max failures configurable * Apply suggestions from code review Co-authored-by: Jarek Potiuk <jarek@potiuk.com> * add back requirements.lock Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> Co-authored-by: Jarek Potiuk <jarek@potiuk.com> (cherry picked from commit d93555b)
* add git sync sidecars * add a helm test * add more tests * allow users to provide git username and pass via a k8s secrets * set default values for airflow worker repository & tag * change ci timeout * fix link * add credentials_secret to airflow.cfg configmap * set GIT_SYNC_ADD_USER on kubernetes worker pods, set uid * add fsGroup to webserver and kubernete workers * move gitSync to dags.gitSync * rename valueFields * turn off git sync and dag persistence by default * provide option to specify known_hosts * add git-sync details into the chart documentation * Update .gitignore Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * make git sync max failures configurable * Apply suggestions from code review Co-authored-by: Jarek Potiuk <jarek@potiuk.com> * add back requirements.lock Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> Co-authored-by: Jarek Potiuk <jarek@potiuk.com> (cherry picked from commit d93555b)
* add git sync sidecars * add a helm test * add more tests * allow users to provide git username and pass via a k8s secrets * set default values for airflow worker repository & tag * change ci timeout * fix link * add credentials_secret to airflow.cfg configmap * set GIT_SYNC_ADD_USER on kubernetes worker pods, set uid * add fsGroup to webserver and kubernete workers * move gitSync to dags.gitSync * rename valueFields * turn off git sync and dag persistence by default * provide option to specify known_hosts * add git-sync details into the chart documentation * Update .gitignore Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> * make git sync max failures configurable * Apply suggestions from code review Co-authored-by: Jarek Potiuk <jarek@potiuk.com> * add back requirements.lock Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com> Co-authored-by: Jarek Potiuk <jarek@potiuk.com> (cherry picked from commit d93555b)
Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.
Improve the helm chart with:
known_hostswhen using git sync over ssh with an ssh keyDockerfilefrom a forked repohttps://github.com/aneesh-joseph/helm-unittest.