ci(local-env): per-runner port isolation for concurrent Local Environment Tests#1629
Conversation
bb69b58 to
d760399
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb69b58bab
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…ment Tests
The self-hosted CI host runs up to 8 runner slots against one shared Docker
daemon. The local-env stack published fixed host ports and used fixed container
names, so concurrent jobs collided ("port is already allocated" / name
conflict) and the job was serialized repo-wide with a fixed concurrency group,
throttling every PR.
Give each runner slot a disjoint host-port block, a unique compose project
name, and a container-name suffix, all derived from a single
LOCALENV_RUNNER_SLOT (parsed from $RUNNER_NAME). Slot 0 (the default, and any
non-self-hosted runner) reproduces the legacy single-tenant layout exactly, so
local developer workflows are unchanged.
- local-environment/src/lib/ports.ts: source-of-truth port/layout computation
- run.ts / stop.ts: inject the slot's project name + host ports into compose
- discoverValidators.ts: resolve ${VAR:-default} host ports from the env
- docker-compose.yml: parameterize every published host port; suffix every
container_name (container-internal ports unchanged)
- tests/e2e/src/config.rs, check-health.sh, toolkit-multi-dest-e2e.sh: honor
the slot's host ports (legacy defaults preserved)
- Earthfile: thread LOCALENV_RUNNER_SLOT into the LOCALLY bring-up and pass the
node/ogmios host ports into the hermetic local-env-e2e target
- action.yml: compute the slot + ports (bash mirror of ports.ts); drop the now
obsolete shared-host teardown workaround
- continuous-integration.yml: remove the repo-wide serialization gate
Refs shieldedtech/shielded-sre#281
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Scott Buckel <scott.buckel@shielded.io>
d760399 to
655e591
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 655e5919c9
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
With #1629 (per-runner host-port isolation) reverted, re-apply our approach on the clean base: - action.yml: run the whole premerge surface via earthly -P +local-env-ci (stack -> finality -> e2e -> toolkit) inside nested dockerd; keeps #1631's docker/login-action v4.2.0 bump. - continuous-integration.yml: drop the local-environment-tests-self-hosted concurrency group (netns isolation removes host-port collisions). Preserves #1631 (login-action) and #1604 (Ledger9 toolkit-js step comments). Net vs main: the two approaches to shielded-sre#281 are mutually exclusive; this branch chooses nested-dockerd netns isolation over per-runner port blocks. Assisted-by: Claude:claude-opus-4-8
Replace the host-daemon local-env stack with one that runs entirely inside earthly's nested dockerd (WITH DOCKER): stack bring-up -> verify-finality -> e2e suite -> toolkit multi-dest E2E, all against one chain in one RUN. Each invocation gets its own network namespace, so concurrent PRs no longer collide on host ports and the repo-wide local-environment-tests-self-hosted serialization is dropped. New Earthfile targets: - +local-env-ci registry --pull (CI; needs GHCR creds + tags) - +local-env-full-ci-localimg docker save->load (permissionless, no registry) - +local-env-oneshot zero-arg: builds node/toolkit/indexers, runs all Each carries the host-parity the host LOCALLY path got from .envrc/worktree (COPY res/ + the reserve-contracts submodule + scripts/ + static/, MIDNIGHT_RESERVE_CONTRACTS_PATH, ARCHITECTURE=linux/$USERARCH) and kind preconditions for missing image refs / unchecked-out submodules. The composite action now just calls `earthly -P +local-env-ci`; the workflow drops the concurrency group. This supersedes the per-runner host-port-isolation approach (#1629), which is reverted here: ports.ts removed and run.ts/stop.ts/ discoverValidators.ts/config.rs/check-health.sh/toolkit-multi-dest-e2e.sh + the compose host-port parameterisation restored to their pre-#1629 form. Assisted-by: Claude:claude-opus-4-8
Replace the host-daemon local-env stack with one that runs entirely inside earthly's nested dockerd (WITH DOCKER): stack bring-up -> verify-finality -> e2e suite -> toolkit multi-dest E2E, all against one chain in one RUN. Each invocation gets its own network namespace, so concurrent PRs no longer collide on host ports and the repo-wide local-environment-tests-self-hosted serialization is dropped. New Earthfile targets: - +local-env-ci registry --pull (CI; needs GHCR creds + tags) - +local-env-full-ci-localimg docker save->load (permissionless, no registry) - +local-env-oneshot zero-arg: builds node/toolkit/indexers, runs all Each carries the host-parity the host LOCALLY path got from .envrc/worktree (COPY res/ + the reserve-contracts submodule + scripts/ + static/, MIDNIGHT_RESERVE_CONTRACTS_PATH, ARCHITECTURE=linux/$USERARCH) and kind preconditions for missing image refs / unchecked-out submodules. The composite action now just calls `earthly -P +local-env-ci`; the workflow drops the concurrency group. This supersedes the per-runner host-port-isolation approach (#1629), which is reverted here: ports.ts removed and run.ts/stop.ts/ discoverValidators.ts/config.rs/check-health.sh/toolkit-multi-dest-e2e.sh + the compose host-port parameterisation restored to their pre-#1629 form. Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Giles Cope <gilescope@gmail.com>
Summary
Lets
Local Environment Testsrun concurrently on the shared self-hosted host instead of being serialized repo-wide. Refs shieldedtech/shielded-sre#281.The host runs up to 8 runner slots against one Docker daemon. The local-env stack published fixed host ports (9933–9944, 30333–30337, 1337/1442/5432/8088/4222/30000/32000) and used fixed
container_names, so two jobs on the same host collided. The job was therefore gated by a fixedconcurrencygroup, throttling every PR.This gives each runner slot a disjoint host-port block + unique compose project name + container-name suffix, all derived from a single
LOCALENV_RUNNER_SLOTparsed from$RUNNER_NAME(e.g.fsn1-runner-01-runner-3→ slot 3). Slot 0 (the default, and any non-self-hosted runner) reproduces the legacy single-tenant layout byte-for-byte, so local dev and existing tooling are unchanged.local-environment/src/lib/ports.tsis the source of truth (BASE_PORT=21000,BLOCK=64, 22-entry orderedPORT_SPEC). Container-internal ports never change — only host bindings shift — so intra-stack DNS/P2P is untouched. Compose project naming isolates networks and named volumes automatically.Changes
ports.ts— slot → {projectName, nameSuffix, hostPorts}run.ts/stop.ts— inject the slot layout into the compose env (bring-up + teardown target the same project)discoverValidators.ts— resolve${VAR:-default}host ports from the env (incl. fixing a:-in-${...}split bug)docker-compose.yml— every published host port parameterized; everycontainer_namesuffixedtests/e2e/src/config.rs,check-health.sh,toolkit-multi-dest-e2e.sh— honor the slot's ports (legacy fallbacks)Earthfile— threadLOCALENV_RUNNER_SLOTinto theLOCALLYbring-up; pass node/ogmios host ports into the hermeticlocal-env-e2etargetaction.yml— compute slot + ports (bash mirror ofports.ts); drop the obsolete shared-host teardown workaroundcontinuous-integration.yml— remove the repo-wide serialization gateTesting
Static validation only — full integration is CI-only (needs node images + the self-hosted host + earthly):
tsc --noEmit,eslint,prettier --check: cleandocker compose config: slot 0 → legacy names/ports (midnight-node-1, 9933); slot 3 → isolated (midnight-node-1-r3, 21136, project/volumes/networklocal-env-r3_*)ports.tsslots 1–8 disjoint + in-range;discoverValidatorsresolves env ports with legacy fallbackaction.ymlcross-checked identical toports.ts(slots 1/3/8)shellcheckon both scripts: cleancargo check -p midnight-node-e2e --features local-ci --tests: cleanNeeds CI to prove: two PRs actually running
Local Environment Testsconcurrently without collision, and the e2e step connecting to the right slot's ports.Notes / follow-ups
E2E_*_PORTas earthly build-args means thelocal-env-e2ecargo layer is cache-keyed per slot (up to 8 variants). Acceptable; can be optimized later if cache pressure shows.governance-runtime-upgradecommand's default--rpc-url(9944) is not slot-aware — out of scope (not in the tests job); pass--rpc-urlexplicitly if running it against a slotted stack.DOCKER_CONFIGisolation step is intentionally kept — it guards the shared-~/.dockerlogin race and is still required under concurrency.🤖 Generated with Claude Code