`include-file` function
A weakness of the execution configuration pattern is that it interleaves configuration and code. For example, Gatekeeper constraint templates wrap Rego code in KRM. Similarly, Starlark code is wrapped in a StarlarkRun resource. We can improve the development UX by providing an imperative pre-processor function that injects the content of a file into a particular field of a KRM resource. Note that this is useful for the Gatekeeper admission controller even when not using the gatekeeper kpt function.
Strawman:
Assume one or more resources (which may be deeply nested) in the current directory have a comment directive pointing to a file containing code, e.g.:
```yaml
apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: set-namespace-to-prod
# kpt-include: myscript.star
source: |
  ...
```
This resource references `myscript.star` in the same directory:
```python
def setnamespace(resources, namespace):
    for resource in resources:
        # mutate the resource
        resource["metadata"]["namespace"] = namespace

setnamespace(ctx.resource_list["items"], "prod")
```
And we have a Gatekeeper constraint template:
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata: # kpt-merge: /k8sbannedconfigmapkeysv1
  name: k8sbannedconfigmapkeysv1
spec:
  crd:
    spec:
      names:
        kind: K8sBannedConfigMapKeysV1
      validation:
        openAPIV3Schema:
          properties:
            keys:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      # kpt-include: myscript.rego
      rego: |
        ...
```
That references myscript.rego:
```rego
package ban_keys

violation[{"msg": sprintf("%v", [val])}] {
  keys = {key | input.review.object.data[key]}
  banned = {key | input.parameters.keys[_] = key}
  overlap = keys & banned
  count(overlap) > 0
  val := sprintf("The following banned keys are being used in the ConfigMap: %v", [overlap])
}
```
After running the imperative function:

```shell
$ kpt fn eval -i gcr.io/kpt-fn/include-file --mount-current
```

(See #2351.)
All resources containing the kpt-include comment directive are expanded:
```yaml
apiVersion: fn.kpt.dev/v1alpha1
kind: StarlarkRun
metadata:
  name: set-namespace-to-prod
# kpt-include: myscript.star
source: |
  # set the namespace on all resources
  def setnamespace(resources, namespace):
      for resource in resources:
          # mutate the resource
          resource["metadata"]["namespace"] = namespace

  setnamespace(ctx.resource_list["items"], "prod")
```
```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata: # kpt-merge: /k8sbannedconfigmapkeysv1
  name: k8sbannedconfigmapkeysv1
spec:
  crd:
    spec:
      names:
        kind: K8sBannedConfigMapKeysV1
      validation:
        openAPIV3Schema:
          properties:
            keys:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      # kpt-include: myscript.rego
      rego: |-
        package ban_keys

        violation[{"msg": sprintf("%v", [val])}] {
          keys = {key | input.review.object.data[key]}
          banned = {key | input.parameters.keys[_] = key}
          overlap = keys & banned
          count(overlap) > 0
          val := sprintf("The following banned keys are being used in the ConfigMap: %v", [overlap])
        }
```
Considerations:
- Potential for abuse of this function. We don't want to encourage its use for maintaining partial resource configuration as a DRY technique (i.e. injecting KRM into KRM). Injecting into a single field of type string mitigates this. We should also consider a more descriptive name.
- Whether to support relative paths or only allow the file in the same directory as the KRM. The latter is fine to start with to keep things simple.
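To make the strawman concrete, the expansion could be sketched as a small text-level pass over each document. This is an illustrative Python sketch, not the real function's implementation; it assumes the directive's target is the scalar field on the line immediately after the comment (e.g. `source: |`), and it only reads files from the resource's own directory.

```python
import os
import re

INCLUDE_RE = re.compile(r"#\s*kpt-include:\s*(\S+)")

def expand_includes(yaml_text, base_dir):
    """Expand '# kpt-include: <file>' directives in a YAML document.

    The scalar field on the line after the directive keeps its key; its
    old body is replaced by the file's contents, re-indented two spaces
    past the key. A minimal sketch, not the real kpt function.
    """
    lines = yaml_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        out.append(lines[i])
        m = INCLUDE_RE.search(lines[i])
        if m and i + 1 < len(lines):
            # Same-directory only: strip any path components for safety.
            name = os.path.basename(m.group(1))
            with open(os.path.join(base_dir, name)) as f:
                body = f.read()
            i += 1
            field = lines[i]  # e.g. 'source: |'
            out.append(field)
            field_indent = len(field) - len(field.lstrip())
            # Skip the old scalar body: all following deeper-indented lines.
            while i + 1 < len(lines) and (
                not lines[i + 1].strip()
                or len(lines[i + 1]) - len(lines[i + 1].lstrip()) > field_indent
            ):
                i += 1
            pad = " " * (field_indent + 2)
            out.extend(pad + l for l in body.splitlines())
        i += 1
    return "\n".join(out)
```

A real implementation would operate on parsed KRM nodes rather than raw text, but the text version shows the intended behavior: the directive comment is preserved (so the function is re-runnable) and only the targeted string field changes.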
I agree with the injection idea: it reduces boilerplate, facilitates separation of code and data, and could enable other preprocessing. It could also facilitate testing.
A non-kpt ConstraintTemplate generation example: https://github.com/plexsystems/konstraint/tree/main/examples/container-deny-latest-tag
A separate issue may be warranted, but functions are currently too verbose. For example, setting an annotation should only require a couple of lines of code, yet the current function contains about 150: https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/go/set-annotations/main.go
Other Go implementations that set annotations are also fairly long, largely because they include a lot of code that could be in a more generic harness: https://github.com/kubernetes-sigs/kustomize/blob/master/plugin/builtin/annotationstransformer/AnnotationsTransformer.go https://github.com/kubernetes-sigs/kustomize/blob/master/api/filters/annotations/annotations.go https://github.com/jenkins-x-plugins/jx-gitops/blob/main/pkg/cmd/annotate/annotate.go https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/kubectl/pkg/cmd/annotate/annotate.go
Interpreted languages especially have the potential for simplification, but even our typescript functions are much longer than is ideal and would benefit from kernel injection or composition: https://github.com/GoogleContainerTools/kpt-functions-catalog/blob/master/functions/ts/kubeval/src/kubeval.ts
Metacontroller did a decent job of facilitating simple transformation kernels: https://github.com/GoogleCloudPlatform/metacontroller/blob/master/examples/catset/sync.js
In particular, transformers and validators would benefit from a simple filter kernel framework, with filtering implemented in the orchestrator, as discussed in #2364.
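To make the point concrete, here is what a transformation kernel could look like if a generic harness handled parsing, iteration, and I/O. The harness API (`run_kernel`) is hypothetical and sketched in Python; the point is only that the kernel itself shrinks to a few lines.

```python
def set_annotations(resource, params):
    """Kernel: mutate one resource dict using the function params.

    A hypothetical harness would handle reading the ResourceList,
    iterating over items, and writing results back out.
    """
    annotations = resource.setdefault("metadata", {}).setdefault("annotations", {})
    annotations.update(params)
    return resource

# Minimal stand-in for the harness: map the kernel over a resource list.
def run_kernel(kernel, resources, params):
    return [kernel(r, params) for r in resources]
```

With filtering moved into the orchestrator as discussed in #2364, even the iteration above would disappear from user code.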
@karlkfi @droot
Coming from https://github.com/GoogleContainerTools/kpt/issues/1151#issuecomment-877836890, supporting such an include comment directive sounds interesting.
My use cases that could benefit from such an include functionality are as follows:
- Loading a kustomization including JSON patches as input of a kustomize pipeline function.
- Loading a local Helm chart's templates as input of a khelm function.
In both cases the function could define a custom resource kind (e.g. FileBlob) used as a wrapper resource for non-KRM content, with the comment directive used to add a file's contents to it.
However, creating a separate resource for each required non-KRM file could be very cumbersome. Ideally users should be able to just point to a directory (within a Kptfile?) whose files should be wrapped into FileBlob resources (defining the resource kind within the kpt project); in that case the include comment directive wouldn't be involved.
The Helm chart use case comes with an additional problem: Helm templates need to be maintained outside the kpt function source directory, since they contain *.yaml files that aren't actually valid YAML, which would otherwise make kpt's YAML parser fail.
Therefore one would either need to let the comment directive refer to files outside the function source directory (which should be avoided for security reasons) or prevent kpt from parsing YAML files that are included as wrapped KRM resources.
@maxsmythe and @julianKatz and @willbeason -- just FYI, not sure if this is on your radar
Would this function only modify files in-place, or could it redirect the output to a separate directory, leaving the original YAML untouched?
Also, at what granularity could/does this operate? If I point it to a directory tree, will it perform all substitutions in that directory tree?
If I'm not modifying the file in-place, will the structure of the output's dirtree be maintained?
Rather than mounting the current working directory, another approach could be including non-KRM files in the ResourceList by wrapping them with KRM: #3118. That's probably the approach we'll need for porch. Like the FileBlob proposal above.
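A sketch of that wrapping in Python, assuming a hypothetical FileBlob kind (the group/version and field names here are illustrative, not from any existing API):

```python
import base64
import os

def wrap_file(path):
    """Wrap a non-KRM file in a hypothetical FileBlob resource so it can
    travel through a ResourceList alongside regular KRM objects."""
    with open(path, "rb") as f:
        data = f.read()
    return {
        "apiVersion": "fn.kpt.dev/v1alpha1",  # hypothetical group/version
        "kind": "FileBlob",
        "metadata": {
            "name": os.path.basename(path),
            "annotations": {
                # keep the wrapper out of applied output
                "config.kubernetes.io/local-config": "true",
            },
        },
        # base64 keeps non-YAML content (Helm templates, scripts) from
        # confusing the YAML parser, addressing the Helm problem above
        "binaryData": base64.b64encode(data).decode("ascii"),
    }
```

Encoding the payload sidesteps the parsing problem mentioned for Helm templates: the wrapped file never has to be valid YAML itself.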
Hi, a few more use cases for this feature.
Include non-YAML files (sh scripts, Python scripts, etc.)
Tekton allows adding the content of a script to the YAML file:
```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: kubectl-run
spec:
  taskRef:
    name: kubernetes-actions
  params:
    - name: script
      value: |
        kubectl get pods
        echo "-----------"
        kubectl get deploy
```
This is fine when the script is small, but when it becomes bigger you may want to move it to a separate file to get IDE support, e.g. my-script.sh:
```shell
#!/bin/bash
kubectl get pods
echo "-----------"
kubectl get deploy
```
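With the comment directive proposed earlier, the TaskRun could then reference the script file instead of embedding it. This is hypothetical and assumes the directive can target any string-typed field, such as a param value:

```yaml
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: kubectl-run
spec:
  taskRef:
    name: kubernetes-actions
  params:
    - name: script
      # kpt-include: my-script.sh
      value: |
        ...
```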
Include Markdown files
In Tekton, the description field can be used to describe a Pipeline:
```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: cep0001
  namespace: tekton-namespace
spec:
  description: |
    TBD. include-file @cep0001.md
```
I would like to store the description separately from the pipeline and include it into the Pipeline with kpt.
cep0001.md
```markdown
Chaos Scenario CEP0001
Link: TBD
Flow:
1. Get pods
2. Kill master redis pod
3. Get pods
```
Include YAML files
I can have a PipelineRun whose input parameter is YAML as well, and I want IDE support to edit that YAML separately:
```yaml
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: cep0002-run
spec:
  params:
    - name: chaos-mesh-yaml
      value: |
        TBD include pod-kill.yaml
```
pod-kill.yaml
```yaml
kind: PodChaos
apiVersion: chaos-mesh.org/v1alpha1
metadata:
  namespace: target-namespace
  name: pod-kill
  annotations:
    config.kubernetes.io/local-config: "true"
spec:
  selector:
    namespaces:
      - target-namespace
    labelSelectors:
      app.kubernetes.io/component: master
      app.kubernetes.io/name: redis
  mode: one
  action: pod-kill
  gracePeriod: 0
```
Note: I would like to modify pod-kill.yaml with kpt before including it.
Partial YAMLs included into one YAML
https://github.com/argoproj/argo-workflows/blob/master/examples/handle-large-output-results.yaml
This huge YAML could be split into different parts, with functions determining where to upsert each part:
```yaml
templates:
  - name: echo
    inputs:
      parameters:
        - name: index
      artifacts:
        - name: items
          path: /tmp/items
    container:
      image: stedolan/jq:latest
      command: [sh, -c]
      # jq is just one way to get the appropriate item.
      args: ["cat /tmp/items | jq '.[{{inputs.parameters.index}}]'"]
```