Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Executors Native SSBC#50079

Merged
Piszmog merged 536 commits into
mainfrom
rc/native-ssbc
Apr 26, 2023
Merged

Executors Native SSBC#50079
Piszmog merged 536 commits into
mainfrom
rc/native-ssbc

Conversation

@Piszmog

@Piszmog Piszmog commented Mar 28, 2023

Copy link
Copy Markdown
Contributor

Closes #44183.

This PR updates the batcheshelper image (Native SSBC) to be feature parity with src. This will allow us to remove the dependency executors have on src.

Changes

  • Added a Native SSBC Doc page
    • Linked as a Batch Change requirement for K8s runtime
  • Added a README.md to the batcheshelper
  • Changes to pre mode
  • Changes to post mode
  • Updated command.go to check the logs if the step is skipped
    • This bubbles up to avoid running the actual step and the post step
  • Added an Index to the step to understand what "group" of commands belong together
    • Helps with skipping
  • Updated batchSpecWorkspaceStepV2Resolver to handle env resolution by parsing the logs and adding a stepInfo to it

Created Issues

Test plan

Added unit tests.

Scenarios

Skipped Steps

Hardcoded
Batch spec
name: conditional-hardcoded-test
description: Conditional hardcoded test
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: echo Hello1
    container: alpine:latest
  - if: false
    run: echo Hello2
    container: alpine:latest
  - run: echo Hello3
    container: alpine:latest
changesetTemplate:
  title: Hello from a file
  body: Updated using a file
  branch: file-mount-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.22.58.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.06.20.mov
Dynamic
Batch spec
name: conditional-dynamic-test
description: Conditional dynamic test
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: echo Hello1
    container: alpine:latest
  - if: ${{ eq "hello" previous_step.stdout }}
    run: echo Hello2
    container: alpine:latest
  - run: echo Hello3
    container: alpine:latest
changesetTemplate:
  title: Hello from a file
  body: Updated using a file
  branch: file-mount-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.23.30.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.22.00.mov

Workspace Files

File
Batch spec
name: python-test
description: Use a python script
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: python /tmp/updater.py
    container: python:latest
    mount:
      - path: ./updater.py
        mountpoint: /tmp/updater.py
changesetTemplate:
  title: Hello from Python
  body: Updated using a Python Script
  branch: python-script-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.32.01.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.24.45.mov
Directory
Batch spec
name: directory-test
description: Use a python script in a directory
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: python /tmp/scripts/updater.py
    container: python:latest
    mount:
      - path: ./scripts
        mountpoint: /tmp/scripts
changesetTemplate:
  title: Hello from Python
  body: Updated using a Python Script
  branch: directory-script-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.33.24.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.25.50.mov

File Mounts

Batch spec
name: file-mount-test
description: Mount a file
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: /tmp/my-script.sh
    container: alpine:latest
    files:
      "/tmp/my-script.sh": |
        #!/bin/sh
        IFS=$'\n'; echo Hello World | tee -a $(find -name README.md)
changesetTemplate:
  title: Hello from a file
  body: Updated using a file
  branch: file-mount-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.48.05.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.26.56.mov

Env Var Rendering

Batch spec
name: output-env-test
description: Output ENV test
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: echo $MESSAGE1
    container: alpine:latest
    env:
      - MESSAGE1: hello 3
      - PASSWORD: mypass
      - MY_SUPER_PASS
  - run: echo $MESSAGE2
    container: alpine:latest
    env:
      MESSAGE2: ${{ previous_step.stdout }}
changesetTemplate:
  title: Hello from a file
  body: Updated using a file
  branch: file-mount-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.51.06.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.28.05.mov

Caching

Batch spec
name: python-test
description: Use a python script
on:
  - repositoriesMatchingQuery: file:README.md repo:^github.com/sourcegraph-testing/zap$
steps:
  - run: python /tmp/updater.py
    container: python:latest
    mount:
      - path: ./updater.py
        mountpoint: /tmp/updater.py
changesetTemplate:
  title: Hello from Python
  body: Updated using a Python Script
  branch: python-script-test
  commit:
    message: Append Hello World to all README.md files
Docker Clip
Screen.Recording.2023-04-13.at.09.54.21.mov
K8s Clip
Screen.Recording.2023-04-13.at.10.29.25.mov

Piszmog added 30 commits March 22, 2023 14:51
# Conflicts:
#	enterprise/cmd/executor/internal/worker/runtime/docker.go
#	enterprise/cmd/executor/internal/worker/runtime/runtime.go
# Conflicts:
#	doc/admin/executors/deploy_executors_kubernetes.md
#	enterprise/cmd/executor/internal/worker/runtime/runtime.go
Comment thread enterprise/cmd/executor/internal/worker/runtime/docker_test.go Outdated
Comment thread enterprise/cmd/executor/internal/worker/runtime/firecracker_test.go Outdated
Comment on lines +143 to +147
skipped, err := batcheslib.SkippedStepsForRepo(batchSpec.Spec, string(repo.Name), workspace.FileMatches)
if err != nil {
return apiclient.Job{}, err
}
executionInput.SkippedSteps = skipped

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this simply uses a library method, would it be easier to just let the batcheshelper recompute this vs putting it in the payload? once we go with dynamic skipping, too, we have to recompute it anyways on the client side.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this is quite the lift to move into batcheshelper. In order for batcheshelper to call this lib function, it needs

  • BatchSpecWorkspace
  • Repo name (I believe batcheshelper gets this in the WorkspacesExecutionInput)
  • BatchSpec

We could update the WorkspacesExecutionInput to have these fields (then have transform.go put these fields in it since it already has all the data), but that's quite a change from how it looks today.

Also, SkippedSteps is static. So it only needs to be computed once. It wont change between steps.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh I see 🤔 those are not required for dynamic step skipping?

@Piszmog Piszmog Apr 19, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. This is just pure static skipping. For dynamic skipping, we use the stepContext with the if to determine if the step should be skipped - this is eval in batcheshelper

stepContext is based on the previous result and WorkspacesExecutionInput.

Comment thread enterprise/cmd/executor/internal/worker/runtime/kubernetes_test.go Outdated
Comment thread enterprise/cmd/executor/internal/worker/runner/skip.go Outdated
Comment thread enterprise/cmd/executor/internal/worker/runner/runner.go Outdated
if err := runtimeRunner.Run(ctx, spec); err != nil {
return errors.Wrapf(err, "running command %q", spec.CommandSpec.Key)
}
if strings.HasSuffix(spec.CommandSpec.Key, "pre") {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this again is coupling the batcheshelper and the executor implementation, can we simply skip to the read next step name from the skipfile and jump there? A bit scared that the "pre" word is in here - also because we already talked about that we might be able to consolidate pre and post into one and .. might rename it then :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I am trying to solve with the pre check is I only want to check for a skip file when a pre step has been ran. I don't want to wait the cycles checking for the skip file on every step - especially when we are not doing a batches job.

So the following is the flow (for batches),

  1. Run the step (step.docker.step.0.pre)
  2. Check for skip file
  3. No skip file
  4. Run step (step.docker.step.0.run)
  5. Run step (step.docker.step.0.post)
  6. Run step (step.docker.step.1.pre)
  7. Check for skip file
  8. There is a skip file - get the "next" key (step.docker.step.3.pre)
  9. Check the next step (step.docker.step.1.run) against the skip key - does not match
  10. Check next step (step.docker.step.1.post) against the skip key - does not match
  11. Check next step (step.docker.step.2.pre) against the skip key - does not match
  12. Check next step (step.docker.step.2.run) against the skip key - does not match
  13. Check next step (step.docker.step.2.post) against the skip key - does not match
  14. Check next step (step.docker.step.3.pre) against the skip key - does match
  15. Run step (step.docker.step.3.pre)
  16. Check for skip file
  17. No skip file
  18. Rinse and repeat

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't really think of a nice way to not decouple batcheshelper and executor. The skip file is contextual to batcheshelper. I could at least move this check to a lib so all this pre, post, run stuff are all consolidated in a single place versus scatter all over the code base

Comment thread enterprise/cmd/batcheshelper/main.go
Comment thread enterprise/cmd/batcheshelper/run/pre.go Outdated

@eseliger eseliger left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. I think I'm fairly comfortable with the state of this. I'd say let's just get this out and see what works and what doesn't and get integration tests written against it :) We can always follow up more and fixup things, or reduce the coupling if we get better ideas.

)

// CommandKey returns the fully formatted key for the command.
func CommandKey(name Name, rawStepKey string, index int) string {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we prevent jumping back? Otherwise, it might mean that one can create a loop.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you tell me more on what you mean by jumping back?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

step0
step1
step2

are meant to be executed sequentially - if step1 now says "go to step 0", should we allow that? That could cause loops

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should allow that. However going based on the Key value, there isn't a way to know if we are going backwards without parsing out the index from the Key string.

But we wont run into loops since we are not implementing a "go to" system, but a "skip to" system.

So say step1 says "go to step0", it will skip step2 (since it is not step0) and the Job will complete.

Comment on lines +143 to +147
skipped, err := batcheslib.SkippedStepsForRepo(batchSpec.Spec, string(repo.Name), workspace.FileMatches)
if err != nil {
return apiclient.Job{}, err
}
executionInput.SkippedSteps = skipped

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh I see 🤔 those are not required for dynamic step skipping?

@sanderginn sanderginn left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another high quality PR 🙌

  • A lot of unit tests write to or read from disk. It seems to me that disk operations should be mocked out, but I dunno if that's a feasible thing.
  • There's a bunch of unnecessary line breaks in the markdown files, probably from your IDE formatter. Not sure if those are still rendered incorrectly but probably worth fixing.

Comment on lines +13 to +14
Native Server-Side Batch Changes is an image that runs Batch Changes without
requiring [`src-cli`](https://github.com/sourcegraph/src-cli) to be installed on the Executor machine.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's include a short motivation for why a customer would benefit from this

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should even brand this as "native execution", or simply call it the "new execution infrastructure" or execution model or something. This feels like unnecessary terms for a customer to learn, especially if the migration works well, 99.9% of users should never worry about it. Just my 2ct though.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like Native Execution

Comment on lines +34 to +35
The Docker Images that execute the actual Batch Change step require `tee` to be available on the image. Without `tee`,
the output of the step cannot be captured properly for template variable rendering.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it feasible to validate that tee is present on the image somewhere? Maybe in run.Pre()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #51049 to follow up on this.

Comment on lines +69 to +80
## Build

When running on Kubernetes, the image can be built with the following command.

```shell
IMAGE=sourcegraph/batcheshelper:insiders ./build.sh
```

When running the following `sg` commands, `batcheshelper` will automatically be built on any changes to the binary.

- `batcheshelper-builder`
- `batches`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it need to be built manually?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't yet have a profile/command yet in the sg run. So any changes to the source code wont automatically build the images.

Comment on lines +27 to +30
// KubernetesMountDir is the path where the Kubernetes volume is mounted in the container.
KubernetesMountDir = "/data"
// KubernetesMountPath is the path where the Kubernetes volume is mounted in the container.
KubernetesMountPath = "/data/"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are named in a confusing way: I would reflect the mountPath and subPath names in these constants, especially because the KubernetesMountDir is now used as the MountPath in the volume mount.

Comment on lines +225 to +226
MountPath: KubernetesMountDir,
SubPath: strings.TrimPrefix(path, KubernetesMountPath),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at what you're doing here, I'm assuming that path contains the workspace (and maybe step) directory, and the files that the job should have access to live in the /data/ directory in this location?

I would suggest to maybe rename KubernetesMountDir to something like ContainerMountPath. KubernetesMountPath could be HostVolumeSubPath or something.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do want the Kubernetes prefix on these since we have docker and firecracker in the same package. How about KubernetesMountPath for KubernetesMountDir (since it is in additional places than just the volume mount), and KubernetesVolumeMountSubPath for KubernetesMountPath?

Comment on lines +30 to +32
if err = os.Remove(path); err != nil {
return "", errors.Wrap(err, "removing skip file")
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if os.Remove(path) errors due to something like an I/O failure in the middle of a series of steps? Is the entire job marked as failed?

@Piszmog Piszmog Apr 24, 2023

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, you're right. And it is not end of the world if it can't be removed

Comment on lines +21 to +22
- `miniube image load executor-kubernetes:latest`
- `miniube image load sourcegraph/batcheshelper:insiders`

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `miniube image load executor-kubernetes:latest`
- `miniube image load sourcegraph/batcheshelper:insiders`
- `minikube image load executor-kubernetes:latest`
- `minikube image load sourcegraph/batcheshelper:insiders`

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⌨️ 🔨

si.ExitCode = entry.ExitCode
si.FinishedAt = entry.StartTime.Add(time.Duration(*entry.DurationMs) * time.Millisecond)
// SetSafeFunc is a function that can be used to set a value on a StepInfo.
type SetSafeFunc func(step int, cb func(*StepInfo))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it called a SafeFunc?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.. Seems like it prevents any panics from happening 🤷 - by having checks... ?

I'll rename to just SetFunc

@Piszmog Piszmog merged commit 3eaa7d4 into main Apr 26, 2023
@Piszmog Piszmog deleted the rc/native-ssbc branch April 26, 2023 20:02
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Native SSBC execution

6 participants