- You have `kubectl` configured pointing to the target Kubernetes cluster.
- You have access to a Databricks cluster and are able to generate a PAT token. To generate a token, see *generate a Databricks token*.
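Before deploying, it can be worth verifying that the PAT token actually works by calling the Databricks REST API directly. The sketch below is an assumption-labelled helper (not part of the operator): it uses the standard Jobs API 2.0 `jobs/list` endpoint, and the host and token values are placeholders you must replace.

```shell
#!/bin/sh
# Placeholder workspace URL and PAT token -- substitute your real values.
DATABRICKS_HOST="${DATABRICKS_HOST:-https://xxxx.azuredatabricks.net}"
DATABRICKS_TOKEN="${DATABRICKS_TOKEN:-dapi-xxxxx}"

# The Jobs API "list" endpoint is a cheap authenticated call:
# a valid token returns HTTP 200, an invalid one is rejected (403).
check_token() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "Authorization: Bearer $2" \
    "$1/api/2.0/jobs/list"
}

# Usage (requires network access to your workspace):
# status="$(check_token "$DATABRICKS_HOST" "$DATABRICKS_TOKEN")"
# [ "$status" = "200" ] && echo "token OK" || echo "token rejected ($status)"
```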
This will deploy the operator in namespace azure-databricks-operator-system. If you want to customise
the namespace, you can either search-and-replace the namespace or use kustomize by following the next
section.
- Download the latest release manifests:

  ```shell
  wget https://github.com/microsoft/azure-databricks-operator/releases/latest/download/release.zip
  unzip release.zip
  ```

- Create the `azure-databricks-operator-system` namespace:

  ```shell
  kubectl create namespace azure-databricks-operator-system
  ```

- Create Kubernetes secrets with values for `DATABRICKS_HOST` and `DATABRICKS_TOKEN`:

  ```shell
  kubectl --namespace azure-databricks-operator-system \
      create secret generic dbrickssettings \
      --from-literal=DatabricksHost="https://xxxx.azuredatabricks.net" \
      --from-literal=DatabricksToken="xxxxx"
  ```

- Apply the manifests for the Operator and CRDs in `release/config`:

  ```shell
  kubectl apply -f release/config
  ```

(optional) Configure maximum number of run reconcilers:

- Change the `MAX_CONCURRENT_RUN_RECONCILES` value in `config/default/manager_image_patch.yaml` under the `env` section to the desired number of reconcilers:

  ```yaml
  - name: MAX_CONCURRENT_RUN_RECONCILES
    value: "1"
  ```

  By default, `MAX_CONCURRENT_RUN_RECONCILES` is set to `1`.
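The `dbrickssettings` secret created above can equivalently be defined declaratively. This is a sketch, not taken from the repository: it assumes a plain `Opaque` secret and uses the `stringData` field so the values do not need to be base64-encoded by hand; the host and token values are placeholders.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: dbrickssettings
  namespace: azure-databricks-operator-system
type: Opaque
stringData:
  # Placeholder values -- replace with your workspace URL and PAT token.
  DatabricksHost: "https://xxxx.azuredatabricks.net"
  DatabricksToken: "xxxxx"
```

Save this to a file and apply it with `kubectl apply -f <file>`.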
- Clone the source code:

  ```shell
  git clone git@github.com:microsoft/azure-databricks-operator.git
  ```

- Edit the `config/default/kustomization.yaml` file to change your preferences
- Use `kustomize` to generate the final manifests and deploy:

  ```shell
  kustomize build config/default | kubectl apply -f -
  ```

- Deploy the CRDs:

  ```shell
  kubectl apply -f config/crd/bases
  ```

- Deploy a sample job; this will create a job in the default namespace:

  ```shell
  curl https://raw.githubusercontent.com/microsoft/azure-databricks-operator/master/config/samples/databricks_v1alpha1_djob.yaml | kubectl apply -f -
  ```

- Check the job in Kubernetes:

  ```shell
  kubectl get djob
  ```

- Check that the job was created successfully in Databricks.
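For reference, a `Djob` manifest has roughly the following shape. The field values below are illustrative assumptions on the premise that the spec passes through Databricks Jobs API job settings (cluster spec, task definition); consult the sample manifest fetched above for the authoritative schema.

```yaml
apiVersion: databricks.microsoft.com/v1alpha1
kind: Djob
metadata:
  name: djob-sample
spec:
  new_cluster:
    spark_version: "5.3.x-scala2.11"  # illustrative cluster settings
    node_type_id: "Standard_D3_v2"
    num_workers: 1
  notebook_task:
    notebook_path: "/samples/basic1"  # path to an existing notebook in the workspace
```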
If you encounter any issues, you can check the logs of the operator by pulling them from Kubernetes:

```shell
# get the pod name of your operator
kubectl --namespace azure-databricks-operator-system get pods

# pull the logs
kubectl --namespace azure-databricks-operator-system logs -f [name_of_the_operator_pod]
```

To further aid debugging, diagnostic metrics are produced by the operator. Please review the metrics page for further information.