Today, when a step is run by the Kubernetes runtime, it can be slow. There are a few possible reasons for this:
- Batch Changes requires a minimum of 3 pods per step (one each for the pre and post modes of the batcheshelper image, plus the step itself)
- A persistent volume has to be mounted to share data between those pods
We should investigate whether a single pod per step would speed up processing. We can leverage initContainers to do all the work of a step instead of spreading it across multiple pods that share a persistent volume.
Done
- Determine whether a single Job pod would be better
  - From a maintenance, debugging, and performance perspective
- If it is better, update the Kubernetes runtime to create single pods to do all the work
Technical Considerations
- How big should the emptyDir be for the Job? Can we dynamically determine that?
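
One way to approach the sizing question is to derive the limit from the repository size. The sketch below is only an illustration under stated assumptions: the repository size in bytes is known before the Job is created, and the headroom multiplier and the minimum/maximum bounds are hypothetical tuning knobs, not existing settings. The function name `emptyDirSizeLimit` is made up for this example.

```go
package main

import "fmt"

const mib = int64(1) << 20

// emptyDirSizeLimit returns a Kubernetes quantity string such as "600Mi".
// repoBytes is the (assumed known) size of the repository, headroom is a
// hypothetical multiplier that leaves room for step output, and
// minBytes/maxBytes clamp the result.
func emptyDirSizeLimit(repoBytes int64, headroom float64, minBytes, maxBytes int64) string {
	want := int64(float64(repoBytes) * headroom)
	if want < minBytes {
		want = minBytes
	}
	if want > maxBytes {
		want = maxBytes
	}
	// Round up to a whole number of MiB.
	return fmt.Sprintf("%dMi", (want+mib-1)/mib)
}

func main() {
	// A 300 MiB repository with 2x headroom, clamped to [256 MiB, 10 GiB].
	fmt.Println(emptyDirSizeLimit(300*mib, 2.0, 256*mib, 10*1024*mib))
}
```

The resulting string could be parsed with `resource.MustParse` and assigned to the `SizeLimit` field of `corev1.EmptyDirVolumeSource`. Whether the repository size is cheaply available at Job creation time is still an open question.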
Prototyping
Below is an example from some initial testing:
package main

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// newJob1 builds a single-pod Job: an init container clones the repository
// into a shared emptyDir volume, then the main container reads from it.
// Note: image and command are not used yet; the main container is hardcoded
// in this prototype.
func newJob1(name string, image string, command []string) *batchv1.Job {
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name: name,
		},
		Spec: batchv1.JobSpec{
			Template: corev1.PodTemplateSpec{
				Spec: corev1.PodSpec{
					RestartPolicy: corev1.RestartPolicyNever,
					// Init containers run in order and must finish before the main container starts.
					InitContainers: []corev1.Container{
						{
							Name:    "clone-repo",
							Image:   "alpine/git:latest",
							Command: []string{"sh", "-c"},
							Args: []string{
								"mkdir -p /data/errwrap; " +
									"git -C /data/errwrap init; " +
									"git -C /data/errwrap remote add origin http://host.docker.internal:3082/.executors/git/github.com/hashicorp/errwrap; " +
									"git -C /data/errwrap config --local gc.auto 0; " +
									"git -C /data/errwrap -c http.extraHeader=\"Authorization:token-executor my-token\" -c http.extraHeader=X-Sourcegraph-Actor-UID:internal -c protocol.version=1 fetch --progress --no-recurse-submodules origin 33906e66579c3f1f3318ae152bc292e0edf1f1a2; " +
									"git -C /data/errwrap checkout --progress --force 33906e66579c3f1f3318ae152bc292e0edf1f1a2;",
							},
							VolumeMounts: []corev1.VolumeMount{
								{
									Name:      "job-data",
									MountPath: "/data",
								},
							},
						},
					},
					Containers: []corev1.Container{
						{
							// Lists the cloned files to confirm the volume is shared.
							Name:    "job-container",
							Image:   "alpine:latest",
							Command: []string{"ls", "-alR", "/data"},
							VolumeMounts: []corev1.VolumeMount{
								{
									Name:      "job-data",
									MountPath: "/data",
								},
							},
						},
					},
					// The emptyDir lives only as long as the pod, replacing the persistent volume.
					Volumes: []corev1.Volume{
						{
							Name: "job-data",
							VolumeSource: corev1.VolumeSource{
								EmptyDir: &corev1.EmptyDirVolumeSource{},
							},
						},
					},
				},
			},
		},
	}
}
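
Extending the prototype, the three phases of a step could be laid out inside one pod. The sketch below shows one possible layout, not a decided design: it assumes the pre phase and the step itself become init containers (which Kubernetes runs sequentially, each to completion) and the post phase becomes the main container. The `container` type is a trimmed-down stand-in for `corev1.Container` so the example stays self-contained, and the function name, container names, and image names are all hypothetical.

```go
package main

import "fmt"

// container is a hypothetical, minimal stand-in for corev1.Container.
type container struct {
	Name  string
	Image string
}

// stepContainers returns the init containers and the main container for one
// step. The naming scheme and the split between init and main containers are
// assumptions made for illustration only.
func stepContainers(stepIndex int, stepImage, helperImage string) ([]container, container) {
	initContainers := []container{
		{Name: fmt.Sprintf("step-%d-pre", stepIndex), Image: helperImage},
		{Name: fmt.Sprintf("step-%d-run", stepIndex), Image: stepImage},
	}
	mainContainer := container{Name: fmt.Sprintf("step-%d-post", stepIndex), Image: helperImage}
	return initContainers, mainContainer
}

func main() {
	// "example/batcheshelper" is a placeholder image name.
	initContainers, mainContainer := stepContainers(0, "alpine:latest", "example/batcheshelper")
	for _, c := range initContainers {
		fmt.Println("init:", c.Name, c.Image)
	}
	fmt.Println("main:", mainContainer.Name, mainContainer.Image)
}
```

All three containers would mount the same emptyDir volume, as in the prototype above, so no persistent volume is needed to pass data between phases.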