Operator CRD spec too limited for production use #447
Description
This issue is a follow-up to #446 - after finishing the work on that PR, I realised that my team also needs to inject a `serviceAccountName` property into the worker pods launched by the operator. So naturally, I started putting together a PR for this patch too - but it occurred to me that this is a problem that will just keep expanding as other developers need more k8s configuration injected into their pods. For example:
- Service Account*
- Pod Affinity*
- Security Policy*
- Image Pull Secrets
- Volumes*
- Container Resources
- Volume Mounts*
- Sidecar containers
- Prometheus endpoints*
(* denotes features my team already requires)
It occurred to me that we would ultimately just be recreating the Pod `core/v1` spec and embedding it inside the generated CRD files, leading to a significant maintenance burden to keep the CRD up to date with the latest Pod API, not to mention possible bugs introduced across multiple versions of the Pod spec. Plus it's not DRY, KISS or any of that stuff - personally, I'd hate having to maintain this.
Proposed Solution
Any solution to this problem should aim to eliminate the overhead and problems associated with re-writing k8s standard configurations (like Pods/Deployments), while enabling customisation of a CRD that works with the current operator structure (i.e. I'm not proposing rewriting the operator). Since the operator does not currently have a build/release pipeline and CRDs aren't currently published in any standardised way, we could use a build process to convert OpenAPI V3 references (`$ref`) into a "compiled" yaml template. This isn't a new idea; the developers at Agones [1] have CRD definitions that are generated directly from their go source code.
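To illustrate what the "compile" step does conceptually, here is a minimal sketch of resolving `$ref` entries in a schema against a definitions map. The function name `resolve_refs` and the toy `definitions` dict are illustrative only - they are not part of k8s-crd-resolver's actual API:

```python
def resolve_refs(node, definitions):
    """Recursively inline any {'$ref': name} found in a schema fragment."""
    if isinstance(node, dict):
        if "$ref" in node:
            # Look the reference up and resolve it too (refs can nest).
            return resolve_refs(definitions[node["$ref"]], definitions)
        return {k: resolve_refs(v, definitions) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, definitions) for item in node]
    return node

# A toy definitions map standing in for the published k8s API schemas.
definitions = {
    "io.k8s.api.core.v1.PodTemplateSpec": {
        "type": "object",
        "properties": {"spec": {"type": "object"}},
    },
}

schema = {
    "type": "object",
    "properties": {"template": {"$ref": "io.k8s.api.core.v1.PodTemplateSpec"}},
}

compiled = resolve_refs(schema, definitions)
```

Note that real k8s schemas contain recursive references, which this naive sketch would loop on forever - one more reason to use an existing tool like k8s-crd-resolver rather than rolling our own.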
An example with pseudocode:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: daskworkergroups.kubernetes.dask.org
spec:
  scope: Namespaced
  group: kubernetes.dask.org
  names:
    kind: DaskWorkerGroup
    # ...
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            template:
              # TODO: Point this to something that can be resolved somehow
              $ref: 'io.k8s.api.core.v1.PodTemplateSpec'
            status:
              type: object
          # ...
```

After running a build process using something like k8s-crd-resolver, you would get something like:
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: daskworkergroups.kubernetes.dask.org
spec:
  scope: Namespaced
  group: kubernetes.dask.org
  names:
    kind: DaskWorkerGroup
    # ...
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            template:
              description: PodTemplateSpec describes the data a pod should have when created from a template
              properties:
                metadata: ...
                spec:
                  description: PodSpec is a description of a pod.
                  # ...
            status:
              type: object
          # ...
```

Note that the second yaml snippet is the real configuration - I put the actual output of k8s-crd-resolver in this gist: https://gist.github.com/samdyzon/2082109b588fef3031b250e3def01feb
Theoretically this methodology could also be used to build a pseudo-CRD for each Dask resource (scheduler and worker) individually, and the build process could create one or more CRD specifications for DaskCluster and DaskWorkerGroup without duplicating source code.
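As a sketch of that idea, the build could compose one shared pod-template fragment into several CRDs, so the source is written once. The helper name `make_crd` is hypothetical; only the two kinds and the group come from this proposal:

```python
import copy

# Single shared schema fragment, resolved later by the build process.
POD_TEMPLATE_REF = {"$ref": "io.k8s.api.core.v1.PodTemplateSpec"}

def make_crd(kind, plural, extra_properties=None):
    """Build a CRD dict that embeds the shared pod-template reference."""
    properties = {"template": copy.deepcopy(POD_TEMPLATE_REF)}
    properties.update(extra_properties or {})
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        "metadata": {"name": f"{plural}.kubernetes.dask.org"},
        "spec": {
            "group": "kubernetes.dask.org",
            "scope": "Namespaced",
            "names": {"kind": kind, "plural": plural},
            "versions": [{
                "name": "v1",
                "served": True,
                "storage": True,
                "schema": {"openAPIV3Schema": {
                    "type": "object",
                    "properties": properties,
                }},
            }],
        },
    }

# Both CRDs reuse the same template fragment without duplicating it in source.
worker_crd = make_crd("DaskWorkerGroup", "daskworkergroups")
cluster_crd = make_crd("DaskCluster", "daskclusters")
```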
After installing the built CRD into a cluster, a developer could then deploy their resources with as much or as little configuration as needed, but with the same API as deploying a standard pod:
```yaml
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: simple-cluster
  namespace: dask
spec:
  template:
    metadata: ...
    spec:
      serviceAccountName: dask-service-account
      containers:
        - name: worker
          image: dask/dask:latest
          args: ...
          ports:
            - ...
          env:
            - name: TEST_ENVIRONMENT
              value: hello-world
```

Finally, once the output CRD spec has everything that is needed to spin up a pod, it is trivial to include the pod spec in build_worker_pod_spec and build_scheduler_service_spec:
```python
def build_worker_pod_spec(name, namespace, image, n, template):
    # `template` refers to the CRD property we generated during build; see the gist for more
    worker_name = f"{name}-worker-{n}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        # The resolved schema also provides metadata under template["metadata"],
        # but we exclude it here and only use the "spec" part of the template.
        "metadata": {
            "name": worker_name,
            "labels": {
                # scheduler_name is assumed to be in scope, as in the existing operator code
                "dask.org/cluster-name": scheduler_name,
                "dask.org/workergroup-name": name,
                "dask.org/component": "worker",
            },
        },
        "spec": template["spec"],
    }
```

Outcomes:
- Developers can provide their own fully specified pod templates
- No need to maintain CRD specification for K8s standard templates
- Current operator architecture still works
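To make the `build_worker_pod_spec` sketch concrete, here is a self-contained usage example. The function is repeated inline with `scheduler_name` passed as an explicit argument (rather than taken from the enclosing scope), and the `template` dict mimics what a user would put under `spec.template` in the DaskCluster example above:

```python
def build_worker_pod_spec(name, namespace, image, n, template, scheduler_name):
    # Same sketch as above, but scheduler_name is an explicit argument
    # so this snippet runs on its own.
    worker_name = f"{name}-worker-{n}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": worker_name,
            "labels": {
                "dask.org/cluster-name": scheduler_name,
                "dask.org/workergroup-name": name,
                "dask.org/component": "worker",
            },
        },
        "spec": template["spec"],
    }

# A template as it would arrive from a deployed resource:
template = {
    "spec": {
        "serviceAccountName": "dask-service-account",
        "containers": [{"name": "worker", "image": "dask/dask:latest"}],
    }
}

pod = build_worker_pod_spec(
    name="simple-cluster", namespace="dask", image="dask/dask:latest",
    n=0, template=template, scheduler_name="simple-cluster-scheduler",
)
```

The user's `serviceAccountName` (and anything else from the Pod spec) flows through to the generated pod untouched, which is exactly the point of the proposal.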
Dependencies
- k8s-crd-resolver - looks well maintained, comes with API schemas for many versions of K8s, and worked well during my 15 minutes of testing while putting this issue together.
As I mentioned in my PR, my team and I are extremely keen to get this operator into production - our current methodology for managing our Dask workloads is a massive source of technical debt and reliability issues which I want to remove ASAP. With that in mind, I'd be willing to contribute to the development of this process and help design the full workflow. I'm not keen on starting this work without some feedback from the maintainers though, since I'd be working on this out-of-office hours.
What do you guys think? If there is interest, I can put together some more detailed design documentation.
References:
[1] - They use a build script (export-openapi.sh) that produces a YAML file with the embedded spec inside it (and parses some comments to add documentation, etc.) - see the generated CRD from their build tools.