chore(nix): Move nix integ jobs to ec2 fleets #5461

Merged
dougch merged 39 commits into aws:main from dougch:nix_fleet
Aug 25, 2025
@dougch dougch commented Aug 8, 2025

Release Summary:

Resolved issues:

n/a

Description of changes:

Move the Nix integration CodeBuild jobs to CodeBuild EC2 fleets with custom AMIs (not AL).
This reduces the wall-clock runtime from 28 min to 17 min**. The screenshot below shows the breakdown per child job for this job.

[Screenshot, 2025-08-13: runtime breakdown per child job]

**Caveats: these are 6 large instances (3x c7g.12xl and 3x c7a.16xl) with fully populated /nix stores (warmed cache) that maintain state between runs.
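
To put the two runtimes in proportion, here is a small sketch of the arithmetic. The function name is only illustrative; the 28 and 17 minute figures are the ones quoted above.

```python
def runtime_improvement(before_min: float, after_min: float) -> tuple[float, float]:
    """Return (percent reduction, speedup factor) for a wall-clock change."""
    reduction_pct = (before_min - after_min) / before_min * 100
    speedup = before_min / after_min
    return reduction_pct, speedup


reduction, speedup = runtime_improvement(28, 17)
print(f"{reduction:.0f}% shorter, {speedup:.2f}x faster")
# prints "39% shorter, 1.65x faster"
```

The caveats above still apply: part of this gain comes from warmed /nix stores on large, stateful hosts.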

The fleets are active and used in the running of the IntegNix job for this PR.

Call-outs:

Nix Python

This PR removes the old Python 3.10 nix packages used by the original Nix job. Python is now managed with uv.

Pytest

Pytest can maintain a state file between test runs. This lets a subsequent run execute only the tests that failed previously, which dramatically speeds up re-run attempts and has no impact if everything passes. I've added a single retry to the nix uvinteg shell wrapper so that every full integration run does this.

With this change, we can remove the retry of pytest and mark two specific tests as flaky, delivering an overall speedup. The state file lives in the /tmp/$CODEBUILD_BUILD_ID directory, so state cannot carry over from one build to the next and cause tests to be skipped.
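
As a rough illustration of the retry described above (not the actual uvinteg wrapper, which is a shell script), the logic could look like the sketch below. It relies on pytest's `--last-failed` option and the `cache_dir` ini setting. The `tests/integrationv2` path, the `.pytest_cache` subdirectory name, and the function names are assumptions made for the example.

```python
import os
import subprocess
from pathlib import Path


def pytest_commands(build_id: str, test_dir: str = "tests/integrationv2"):
    """Build the first-pass and retry pytest command lines.

    The test_dir default and the cache layout are assumptions for this sketch.
    """
    # Keep pytest's state under a per-build directory so nothing leaks between builds.
    cache_dir = Path("/tmp") / build_id / ".pytest_cache"
    base = ["pytest", "-o", f"cache_dir={cache_dir}"]
    first = base + [test_dir]
    # On retry, run only what failed last time; run nothing if nothing failed.
    retry = base + ["--last-failed", "--last-failed-no-failures", "none", test_dir]
    return first, retry


def run_with_single_retry(build_id: str, run=subprocess.call) -> int:
    """Run the suite once, then re-run only the failed tests a single time."""
    first, retry = pytest_commands(build_id)
    code = run(first)
    # pytest exits with 1 when tests failed; other non-zero codes are usage or
    # internal errors, which a retry would not fix.
    if code != 1:
        return code
    return run(retry)


if __name__ == "__main__":
    raise SystemExit(run_with_single_retry(os.environ.get("CODEBUILD_BUILD_ID", "local")))
```

When the first pass succeeds, the retry never runs, so the extra cost only appears on failing builds.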

Nix store

The cache jobs do a nix build for each platform and then save those files to S3. Future jobs download this store; however, with EC2 fleets the hosts are reused as-is, so this download step is often a no-op.

This store does need periodic cleaning with nix store gc to avoid filling up the disk, since we're not discarding these images as we do with Docker.

The weekly cleanup from #5430 has been added to the buildspec.
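
How #5430 schedules the cleanup is not shown here. As a minimal sketch, assuming a marker file on the host records the last garbage collection, a weekly `nix store gc` on a long-lived fleet host could be gated like this. The marker path, the seven-day interval constant, and the function names are hypothetical.

```python
import subprocess
from datetime import datetime, timedelta, timezone
from pathlib import Path

GC_INTERVAL = timedelta(days=7)      # "weekly", per the description above
GC_COMMAND = ["nix", "store", "gc"]  # the command named in this PR


def gc_is_due(last_gc, now, interval=GC_INTERVAL) -> bool:
    """True if the store has never been collected or the interval has elapsed."""
    return last_gc is None or now - last_gc >= interval


def maybe_collect_garbage(marker: Path, now: datetime, run=subprocess.check_call) -> bool:
    """Run the GC command if it is due, using the marker file's mtime as state."""
    last_gc = None
    if marker.exists():
        last_gc = datetime.fromtimestamp(marker.stat().st_mtime, tz=timezone.utc)
    if not gc_is_due(last_gc, now):
        return False
    run(GC_COMMAND)
    marker.touch()  # record this collection for the next build on this host
    return True
```

A call such as `maybe_collect_garbage(Path("/var/tmp/nix-gc.stamp"), datetime.now(timezone.utc))` (path hypothetical) would then be cheap on most builds and only pay the GC cost about once a week per host.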

Custom AMIs

The AMI is the EC2 Marketplace Ubuntu 24 image with Nix installed. This could all be automated away in a future pipeline (punt). The script used is checked in here for future reference.

Merge queues and the head build

For our integration tests, we do the build twice, once with the PR and once with main, which creates the additional binaries s2nd_head and s2nc_head. For some reason, merge queues and EC2 fleets end up with git clones that don't have the main branch (depth 1 and no other branches). This PR includes a workaround that fetches and checks out main so the head build doesn't fail.
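
The shape of that workaround can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the function names are hypothetical, and checking out `origin/main` with a detached HEAD is one choice among several for where the head build happens.

```python
import subprocess


def known_refs() -> set[str]:
    """List every ref in the current clone, e.g. refs/remotes/origin/main."""
    out = subprocess.check_output(["git", "for-each-ref", "--format=%(refname)"], text=True)
    return set(out.split())


def head_checkout_commands(refs: set[str], remote: str = "origin", branch: str = "main"):
    """Return the git commands needed to get the head branch into a shallow clone."""
    tracking_ref = f"refs/remotes/{remote}/{branch}"
    commands = []
    if tracking_ref not in refs:
        # A depth-1, single-branch clone has no main; fetch just its tip.
        commands.append(
            ["git", "fetch", "--depth", "1", remote, f"+refs/heads/{branch}:{tracking_ref}"]
        )
    # Detached checkout avoids creating or moving a local branch.
    commands.append(["git", "checkout", "--detach", f"{remote}/{branch}"])
    return commands


def checkout_head(run=subprocess.check_call) -> None:
    for command in head_checkout_commands(known_refs()):
        run(command)
```

Fetching only when the ref is missing keeps the step a no-op on clones that already contain main.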

Testing:

How is this change tested (unit tests, fuzz tests, etc.)? CI and ad hoc jobs.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions github-actions bot added the s2n-core team label Aug 8, 2025
@dougch dougch requested review from jmayclin and lrstewart August 14, 2025 00:56
@dougch dougch marked this pull request as ready for review August 14, 2025 00:56
dougch and others added 2 commits August 14, 2025 16:50
Co-authored-by: Lindsay Stewart <stewart.r.lindsay@gmail.com>
@dougch dougch requested a review from lrstewart August 18, 2025 19:01
@jmayclin jmayclin left a comment

This reduces the wall clock runtime to 17 min**
Can we add the old number too? Otherwise it's difficult to judge how much of an improvement it is.

@dougch dougch added this pull request to the merge queue Aug 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 20, 2025
@dougch dougch added this pull request to the merge queue Aug 20, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 20, 2025
@dougch dougch enabled auto-merge August 20, 2025 21:16
@dougch dougch added this pull request to the merge queue Aug 20, 2025
@dougch dougch removed this pull request from the merge queue due to a manual request Aug 20, 2025
@dougch dougch added this pull request to the merge queue Aug 21, 2025
@dougch dougch removed this pull request from the merge queue due to a manual request Aug 21, 2025
@dougch dougch enabled auto-merge August 21, 2025 20:41
@dougch dougch added this pull request to the merge queue Aug 21, 2025
@dougch dougch removed this pull request from the merge queue due to a manual request Aug 21, 2025
@dougch dougch added this pull request to the merge queue Aug 22, 2025
@dougch dougch removed this pull request from the merge queue due to a manual request Aug 22, 2025
@dougch dougch enabled auto-merge August 22, 2025 20:59
@dougch dougch added this pull request to the merge queue Aug 22, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 22, 2025
@dougch dougch enabled auto-merge August 25, 2025 15:50
@dougch dougch added this pull request to the merge queue Aug 25, 2025
Merged via the queue into aws:main with commit b9a6f15 Aug 25, 2025
77 of 79 checks passed
@dougch dougch deleted the nix_fleet branch August 25, 2025 18:29