
ci(local-env): per-runner port isolation for concurrent Local Environment Tests #1629

Merged: justinfrevert merged 1 commit into main from ci/local-env-per-runner-ports on Jun 2, 2026

@scottbuckel

Contributor

Summary

Lets Local Environment Tests run concurrently on the shared self-hosted host instead of being serialized repo-wide. Refs shieldedtech/shielded-sre#281.

The host runs up to 8 runner slots against one Docker daemon. The local-env stack published fixed host ports (9933–9944, 30333–30337, 1337/1442/5432/8088/4222/30000/32000) and used fixed container_names, so two jobs on the same host collided. The job was therefore gated by a fixed concurrency group, throttling every PR.

This gives each runner slot a disjoint host-port block + unique compose project name + container-name suffix, all derived from a single LOCALENV_RUNNER_SLOT parsed from $RUNNER_NAME (e.g. fsn1-runner-01-runner-3 → slot 3). Slot 0 (the default, and any non-self-hosted runner) reproduces the legacy single-tenant layout byte-for-byte, so local dev and existing tooling are unchanged.
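
The slot derivation described above can be sketched as follows. This is a minimal illustration, not the PR's code: the function name `parseRunnerSlot` and the exact pattern are assumptions, chosen only so that the example name from the description (`fsn1-runner-01-runner-3`) maps to slot 3 and anything unrecognized falls back to slot 0.

```typescript
// Sketch only: the real parsing lives in action.yml / ports.ts and may differ.
// Assumption: the slot is the number after the trailing "-runner-" in the name.
function parseRunnerSlot(runnerName: string | undefined): number {
  // No runner name at all (for example local development) -> legacy slot 0.
  if (!runnerName) return 0;

  // Anchor at the end so "fsn1-runner-01-runner-3" yields 3, not 01.
  const match = /-runner-(\d+)$/.exec(runnerName);

  // Names that do not follow the pattern (for example hosted runners) -> slot 0.
  if (!match) return 0;

  return Number.parseInt(match[1] ?? "0", 10);
}

console.log(parseRunnerSlot("fsn1-runner-01-runner-3")); // 3
console.log(parseRunnerSlot("GitHub Actions 12")); // 0
console.log(parseRunnerSlot(undefined)); // 0
```

Falling back to slot 0 rather than failing keeps the behaviour the description promises for non-self-hosted runners: they get the legacy layout.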

local-environment/src/lib/ports.ts is the source of truth (BASE_PORT=21000, BLOCK=64, 22-entry ordered PORT_SPEC). Container-internal ports never change — only host bindings shift — so intra-stack DNS/P2P is untouched. Compose project naming isolates networks and named volumes automatically.
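
A rough sketch of what a slot → layout mapping of this shape could look like is given below. Only `BASE_PORT`, `BLOCK`, the `-r<slot>` suffix, and the `local-env-r<slot>` project name come from this PR; the three-entry `PORT_SPEC`, its variable names, the `layoutForSlot` function, and the block formula `BASE_PORT + (slot - 1) * BLOCK + index` are assumptions made for illustration. The formula is at least consistent with the Testing notes: it puts slot 3's block at 21128 to 21191, which contains the reported 21136.

```typescript
// Values stated in the PR description.
const BASE_PORT = 21000;
const BLOCK = 64;

// Hypothetical, shortened stand-in for the 22-entry PORT_SPEC. The real
// variable names and ordering are not shown in the PR.
const PORT_SPEC = [
  { envVar: "NODE_1_RPC_PORT", legacy: 9933 },
  { envVar: "NODE_1_P2P_PORT", legacy: 30333 },
  { envVar: "POSTGRES_PORT", legacy: 5432 },
] as const;

interface SlotLayout {
  projectName: string | undefined; // undefined -> leave compose's default project
  nameSuffix: string;
  hostPorts: Record<string, number>;
}

function layoutForSlot(slot: number): SlotLayout {
  if (!Number.isInteger(slot) || slot < 0) {
    throw new RangeError(`invalid runner slot: ${slot}`);
  }

  const hostPorts: Record<string, number> = {};
  PORT_SPEC.forEach((entry, index) => {
    // Assumed formula: slot N >= 1 owns BLOCK consecutive ports starting at
    // BASE_PORT + (N - 1) * BLOCK. Slot 0 keeps the legacy host ports.
    hostPorts[entry.envVar] =
      slot === 0 ? entry.legacy : BASE_PORT + (slot - 1) * BLOCK + index;
  });

  return {
    projectName: slot === 0 ? undefined : `local-env-r${slot}`,
    nameSuffix: slot === 0 ? "" : `-r${slot}`,
    hostPorts,
  };
}

console.log(layoutForSlot(0).hostPorts); // legacy ports: 9933, 30333, 5432
console.log(layoutForSlot(3)); // project "local-env-r3", suffix "-r3", ports from 21128
```

Because only the host side of each mapping is computed here, container-internal ports stay fixed, which is what keeps intra-stack DNS and P2P unaffected.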

Changes

  • ports.ts — slot → {projectName, nameSuffix, hostPorts}
  • run.ts / stop.ts — inject the slot layout into the compose env (bring-up + teardown target the same project)
  • discoverValidators.ts — resolve ${VAR:-default} host ports from the env (incl. fixing a :-in-${...} split bug)
  • docker-compose.yml — every published host port parameterized; every container_name suffixed
  • tests/e2e/src/config.rs, check-health.sh, toolkit-multi-dest-e2e.sh — honor the slot's ports (legacy fallbacks)
  • Earthfile — thread LOCALENV_RUNNER_SLOT into the LOCALLY bring-up; pass node/ogmios host ports into the hermetic local-env-e2e target
  • action.yml — compute slot + ports (bash mirror of ports.ts); drop the obsolete shared-host teardown workaround
  • continuous-integration.yml — remove the repo-wide serialization gate
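
To illustrate the discoverValidators.ts item above: once a published port is written as `${VAR:-default}:CONTAINER`, naively splitting the mapping on `:` cuts the placeholder in half at the `:` that belongs to `:-`. The sketch below shows one way to avoid that by expanding placeholders before splitting. The helper names `expandDefaults` and `hostPortOf`, and the variable name `NODE_1_RPC_PORT`, are hypothetical; the actual fix in the PR may be structured differently. It handles only the two-part `HOST:CONTAINER` form, not mappings with an IP prefix.

```typescript
type Env = Record<string, string | undefined>;

// Expand "${VAR}" and "${VAR:-default}" using the given environment.
function expandDefaults(text: string, env: Env): string {
  return text.replace(
    /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g,
    (_whole, name: string, fallback: string | undefined) => {
      const value = env[name];
      // ":-" semantics: use the fallback when the variable is unset OR empty.
      return value !== undefined && value !== "" ? value : (fallback ?? "");
    },
  );
}

// Read the host port from a short-syntax "HOST:CONTAINER" mapping.
function hostPortOf(mapping: string, env: Env): number {
  // Expand first. Splitting the raw string on ":" would break
  // "${VAR:-9933}:9933" into "${VAR", "-9933}" and "9933".
  const [host] = expandDefaults(mapping, env).split(":");
  const port = Number(host);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`cannot resolve host port in "${mapping}"`);
  }
  return port;
}

const mapping = "${NODE_1_RPC_PORT:-9933}:9933";
console.log(hostPortOf(mapping, {})); // 9933 (legacy fallback)
console.log(hostPortOf(mapping, { NODE_1_RPC_PORT: "21136" })); // 21136
```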

Testing

Static validation only — full integration is CI-only (needs node images + the self-hosted host + earthly):

  • tsc --noEmit, eslint, prettier --check: clean
  • docker compose config: slot 0 → legacy names/ports (midnight-node-1, 9933); slot 3 → isolated (midnight-node-1-r3, 21136, project/volumes/network local-env-r3_*)
  • ts-node harness: ports.ts slots 1–8 disjoint + in-range; discoverValidators resolves env ports with legacy fallback
  • bash port formula in action.yml cross-checked identical to ports.ts (slots 1/3/8)
  • shellcheck on both scripts: clean
  • cargo check -p midnight-node-e2e --features local-ci --tests: clean
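
The "disjoint + in-range" check mentioned in the list above could be expressed along these lines. This is not the harness used in the PR, which is not shown; `checkSlots` is a hypothetical helper, and the port function passed to it assumes 22 consecutive ports per slot in 64-wide blocks starting at 21000.

```typescript
// Report ports that fall outside the allowed range or are claimed twice.
function checkSlots(
  slots: number[],
  portsFor: (slot: number) => number[],
  range: { min: number; max: number },
): string[] {
  const owner = new Map<number, number>(); // port -> slot that claimed it first
  const problems: string[] = [];

  for (const slot of slots) {
    for (const port of portsFor(slot)) {
      if (port < range.min || port > range.max) {
        problems.push(`slot ${slot}: port ${port} out of range`);
      }
      const previous = owner.get(port);
      if (previous !== undefined) {
        problems.push(`port ${port} claimed by slots ${previous} and ${slot}`);
      } else {
        owner.set(port, slot);
      }
    }
  }
  return problems;
}

// Assumed layout (not confirmed by the PR): 22 ports per slot, 64-wide blocks.
const assumedPorts = (slot: number): number[] =>
  Array.from({ length: 22 }, (_, i) => 21000 + (slot - 1) * 64 + i);

const problems = checkSlots([1, 2, 3, 4, 5, 6, 7, 8], assumedPorts, {
  min: 21000,
  max: 21000 + 8 * 64 - 1,
});
console.log(
  problems.length === 0 ? "slots 1-8 disjoint and in range" : problems.join("\n"),
);
```

Taking the port function as a parameter keeps the check independent of the layout formula, so the same helper could be pointed at the real ports.ts export.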

Needs CI to prove: two PRs actually running Local Environment Tests concurrently without collision, and the e2e step connecting to the right slot's ports.

Notes / follow-ups

  • Passing E2E_*_PORT as earthly build-args means the local-env-e2e cargo layer is cache-keyed per slot (up to 8 variants). Acceptable; can be optimized later if cache pressure shows.
  • The governance-runtime-upgrade command's default --rpc-url (9944) is not slot-aware — out of scope (not in the tests job); pass --rpc-url explicitly if running it against a slotted stack.
  • The per-slot DOCKER_CONFIG isolation step is intentionally kept — it guards the shared-~/.docker login race and is still required under concurrency.

🤖 Generated with Claude Code

@scottbuckel scottbuckel marked this pull request as ready for review June 2, 2026 15:20
@scottbuckel scottbuckel requested a review from a team as a code owner June 2, 2026 15:20
@scottbuckel scottbuckel force-pushed the ci/local-env-per-runner-ports branch from bb69b58 to d760399 Compare June 2, 2026 15:23

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bb69b58bab


Comment thread .github/actions/local-environment-tests/action.yml
…ment Tests

The self-hosted CI host runs up to 8 runner slots against one shared Docker
daemon. The local-env stack published fixed host ports and used fixed container
names, so concurrent jobs collided ("port is already allocated" / name
conflict) and the job was serialized repo-wide with a fixed concurrency group,
throttling every PR.

Give each runner slot a disjoint host-port block, a unique compose project
name, and a container-name suffix, all derived from a single
LOCALENV_RUNNER_SLOT (parsed from $RUNNER_NAME). Slot 0 (the default, and any
non-self-hosted runner) reproduces the legacy single-tenant layout exactly, so
local developer workflows are unchanged.

- local-environment/src/lib/ports.ts: source-of-truth port/layout computation
- run.ts / stop.ts: inject the slot's project name + host ports into compose
- discoverValidators.ts: resolve ${VAR:-default} host ports from the env
- docker-compose.yml: parameterize every published host port; suffix every
  container_name (container-internal ports unchanged)
- tests/e2e/src/config.rs, check-health.sh, toolkit-multi-dest-e2e.sh: honor
  the slot's host ports (legacy defaults preserved)
- Earthfile: thread LOCALENV_RUNNER_SLOT into the LOCALLY bring-up and pass the
  node/ogmios host ports into the hermetic local-env-e2e target
- action.yml: compute the slot + ports (bash mirror of ports.ts); drop the now
  obsolete shared-host teardown workaround
- continuous-integration.yml: remove the repo-wide serialization gate

Refs shieldedtech/shielded-sre#281

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Scott Buckel <scott.buckel@shielded.io>
@scottbuckel scottbuckel force-pushed the ci/local-env-per-runner-ports branch from d760399 to 655e591 Compare June 2, 2026 15:26

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 655e5919c9


Comment thread .github/workflows/continuous-integration.yml
@justinfrevert justinfrevert added this pull request to the merge queue Jun 2, 2026
Merged via the queue into main with commit 9221040 Jun 2, 2026
60 checks passed
@justinfrevert justinfrevert deleted the ci/local-env-per-runner-ports branch June 2, 2026 23:35
gilescope added a commit that referenced this pull request Jun 3, 2026
Resolves the two CI-file conflicts by taking main's side (incl. #1629's
per-runner port-isolation); #1629 is reverted in the next commit and our
nested-dockerd approach re-overlaid. Earthfile keeps both sides (union); the
#1629 slot threading there is undone by the revert.
gilescope added a commit that referenced this pull request Jun 3, 2026
gilescope added a commit that referenced this pull request Jun 3, 2026
With #1629 (per-runner host-port isolation) reverted, re-apply our approach on
the clean base:
  - action.yml: run the whole premerge surface via earthly -P +local-env-ci
    (stack -> finality -> e2e -> toolkit) inside nested dockerd; keeps #1631's
    docker/login-action v4.2.0 bump.
  - continuous-integration.yml: drop the local-environment-tests-self-hosted
    concurrency group (netns isolation removes host-port collisions). Preserves
    #1631 (login-action) and #1604 (Ledger9 toolkit-js step comments).

Net vs main: the two approaches to shielded-sre#281 are mutually exclusive;
this branch chooses nested-dockerd netns isolation over per-runner port blocks.

Assisted-by: Claude:claude-opus-4-8
gilescope added a commit that referenced this pull request Jun 3, 2026
Replace the host-daemon local-env stack with one that runs entirely inside
earthly's nested dockerd (WITH DOCKER): stack bring-up -> verify-finality ->
e2e suite -> toolkit multi-dest E2E, all against one chain in one RUN. Each
invocation gets its own network namespace, so concurrent PRs no longer collide
on host ports and the repo-wide local-environment-tests-self-hosted
serialization is dropped.

New Earthfile targets:
  - +local-env-ci             registry --pull (CI; needs GHCR creds + tags)
  - +local-env-full-ci-localimg  docker save->load (permissionless, no registry)
  - +local-env-oneshot        zero-arg: builds node/toolkit/indexers, runs all
Each carries the host-parity the host LOCALLY path got from .envrc/worktree
(COPY res/ + the reserve-contracts submodule + scripts/ + static/,
MIDNIGHT_RESERVE_CONTRACTS_PATH, ARCHITECTURE=linux/$USERARCH) and kind
preconditions for missing image refs / unchecked-out submodules.

The composite action now just calls `earthly -P +local-env-ci`; the workflow
drops the concurrency group. This supersedes the per-runner host-port-isolation
approach (#1629), which is reverted here: ports.ts removed and run.ts/stop.ts/
discoverValidators.ts/config.rs/check-health.sh/toolkit-multi-dest-e2e.sh +
the compose host-port parameterisation restored to their pre-#1629 form.

Assisted-by: Claude:claude-opus-4-8
gilescope added a commit that referenced this pull request Jun 3, 2026
Replace the host-daemon local-env stack with one that runs entirely inside
earthly's nested dockerd (WITH DOCKER): stack bring-up -> verify-finality ->
e2e suite -> toolkit multi-dest E2E, all against one chain in one RUN. Each
invocation gets its own network namespace, so concurrent PRs no longer collide
on host ports and the repo-wide local-environment-tests-self-hosted
serialization is dropped.

New Earthfile targets:
  - +local-env-ci             registry --pull (CI; needs GHCR creds + tags)
  - +local-env-full-ci-localimg  docker save->load (permissionless, no registry)
  - +local-env-oneshot        zero-arg: builds node/toolkit/indexers, runs all
Each carries the host-parity the host LOCALLY path got from .envrc/worktree
(COPY res/ + the reserve-contracts submodule + scripts/ + static/,
MIDNIGHT_RESERVE_CONTRACTS_PATH, ARCHITECTURE=linux/$USERARCH) and kind
preconditions for missing image refs / unchecked-out submodules.

The composite action now just calls `earthly -P +local-env-ci`; the workflow
drops the concurrency group. This supersedes the per-runner host-port-isolation
approach (#1629), which is reverted here: ports.ts removed and run.ts/stop.ts/
discoverValidators.ts/config.rs/check-health.sh/toolkit-multi-dest-e2e.sh +
the compose host-port parameterisation restored to their pre-#1629 form.

Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Giles Cope <gilescope@gmail.com>