ci(construct): scope BUILDX_BUILDER to the build job, not workflow-wide#266
Merged
Merged
Conversation
The post-merge run of #256 broke the construct image's `publish-manifest` job: ERROR: no builder "jackin-construct" found error: Recipe `construct-publish-manifest` failed with exit code 1 Cause: #256 hoisted `BUILDX_BUILDER: jackin-construct` into the workflow-level `env:` block when it dropped the redundant `setup-buildx-action`/`Bootstrap buildx` pair from `publish-manifest`. That env var is honored by `docker buildx` as the default-builder selection, so on `publish-manifest` (which intentionally does not create a builder, since `docker buildx imagetools create`/`inspect` are registry-side ops) buildx looks up the named builder, fails to find it, and the recipe exits 1. Fix: move `BUILDX_BUILDER` out of the workflow-level `env:` block into the `build` job's job-level `env:` block, where it actually belongs. `publish-manifest` no longer sees the env var, falls back to the default builder, and `imagetools` runs as designed. Verified locally: yaml parses, the env var only flows into the build job. The `build` job's `uses: setup-buildx-action with: name: ${{ env.BUILDX_BUILDER }}` still resolves correctly because step-level `env` lookups walk the job-level `env` block first. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
donbeave
added a commit
that referenced
this pull request
May 9, 2026
… rules (#267) Codify two CI-merge-safety rules in `PULL_REQUESTS.md` plus a one-line cross-reference in `AGENTS.md`. The recent #256 → #266 incident exposed a class of break invisible to PR-time CI: a workflow-level env var (`BUILDX_BUILDER`) leaked into the `publish-manifest` job, where docker buildx read it as the default-builder selection and failed at runtime — but only on push to main, because that job is push-only and never runs on a `pull_request` event. Three rules now live in `PULL_REQUESTS.md`. First, third-party-CLI env vars (`BUILDX_BUILDER`, `DOCKER_BUILDKIT`, `GH_TOKEN`/`GITHUB_TOKEN`, `KUBECONFIG`, `AWS_PROFILE`, `RUSTUP_TOOLCHAIN`, `npm_config_*`) must be scoped to the consuming job; workflow-level `env:` is reserved for in-house naming/paths. Second, registry / production publishing must hard-gate on main — the `is_publish` flag pattern from `construct.yml` is the canonical shape, and PRs / feature-branch dispatches must build-and-verify only, never publish. Third, push-only / main-only / `workflow_run`-gated jobs (`build-validator`, `publish-manifest`, `deploy`, `publish-preview`) must be smoke-tested via `gh workflow run --ref <feature-branch>` before requesting merge, since PR-time CI structurally cannot exercise them. A companion terraform PR (jackin-project/jackin-github-terraform#14) raises the matching protection-rule changes — required-status-check list, single approving review, strict up-to-date branch policy. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
donbeave
added a commit
that referenced
this pull request
May 9, 2026
… env-var leaks (#273) Tier 3 follow-up to the post-#266 hardening. Add a PR-time `publish-manifest-rehearsal` job to `construct.yml` that runs the same docker buildx CLI plumbing `publish-manifest` depends on, without the side-effect steps (Docker Hub login, `imagetools create`, `imagetools inspect`). The previous hardening rounds were process- (#267 docs rules) and gate-side (#14 / #15 terraform branch protection). This is the workflow-side companion. The #266 break (`ERROR: no builder "jackin-construct" found`) was caused by hoisting `BUILDX_BUILDER` into workflow-level env, which docker buildx reads as the default-builder selection — surfacing only post-merge on main because `publish-manifest` is push-only and never runs on a `pull_request` event. `docker buildx ls` evaluates `BUILDX_BUILDER` at startup and exits non-zero on a missing-builder reference, regardless of whether the rest of the command would have hit the network. Running it on PR + feature-branch dispatch reproduces the failure shape without registry credentials. Two checks: `docker buildx ls` (catches workflow-level env-var leaks for any current or future buildx-controlling env var) and `docker buildx imagetools --help >/dev/null` (smoke-test that the buildx CLI plugin is bundled and loadable on the runner). The `construct-required` aggregator now lists `publish-manifest-rehearsal` in `needs:` so the rehearsal's result is rolled up into the same single check name branch protection requires. Equivalent rehearsals for `build-validator`, `deploy`, and `publish-preview` are deferred — those side-effect surfaces don't have the same network-free reproducibility shape `buildx ls` provides for docker, so the fidelity-vs-cost trade is worse. Reopen if a #266-class break surfaces in any of them. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
donbeave
added a commit
that referenced
this pull request
May 18, 2026
…de (#266) Hotfix the construct-image break introduced by #256. After that PR merged, the `Construct Image / publish manifest` job on `main` failed with `ERROR: no builder "jackin-construct" found` because the refactor hoisted `BUILDX_BUILDER: jackin-construct` into the workflow-level `env:` block when it dropped the redundant `setup-buildx-action` / `Bootstrap buildx` pair from `publish-manifest`. `BUILDX_BUILDER` is read by the `docker buildx` CLI as the default-builder selection, so on `publish-manifest` (which intentionally creates no builder, since `imagetools create` / `inspect` are registry-side operations) every `docker buildx` invocation looked up a builder by that name and exited. Move `BUILDX_BUILDER` out of workflow-level `env:` and into the `build` job's job-level `env:` block where the matching `setup-buildx-action` actually creates the builder. `publish-manifest` no longer sees the env var, falls back to the default builder, and `imagetools` runs as designed. The break did not surface in PR-time CI because `publish-manifest` is gated to `push to main || (workflow_dispatch && main)` — neither runs on a `pull_request` event, so the publish path was effectively untested before merge. Two follow-up PRs harden against the same class of bug: a documentation PR adding a workflow-env-scoping rule and a smoke-test-from-feature-branch rule to PULL_REQUESTS.md, and a terraform PR raising the required-status-check list for jackin to include the new aggregator jobs. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
donbeave
added a commit
that referenced
this pull request
May 18, 2026
… rules (#267) Codify two CI-merge-safety rules in `PULL_REQUESTS.md` plus a one-line cross-reference in `AGENTS.md`. The recent #256 → #266 incident exposed a class of break invisible to PR-time CI: a workflow-level env var (`BUILDX_BUILDER`) leaked into the `publish-manifest` job, where docker buildx read it as the default-builder selection and failed at runtime — but only on push to main, because that job is push-only and never runs on a `pull_request` event. Three rules now live in `PULL_REQUESTS.md`. First, third-party-CLI env vars (`BUILDX_BUILDER`, `DOCKER_BUILDKIT`, `GH_TOKEN`/`GITHUB_TOKEN`, `KUBECONFIG`, `AWS_PROFILE`, `RUSTUP_TOOLCHAIN`, `npm_config_*`) must be scoped to the consuming job; workflow-level `env:` is reserved for in-house naming/paths. Second, registry / production publishing must hard-gate on main — the `is_publish` flag pattern from `construct.yml` is the canonical shape, and PRs / feature-branch dispatches must build-and-verify only, never publish. Third, push-only / main-only / `workflow_run`-gated jobs (`build-validator`, `publish-manifest`, `deploy`, `publish-preview`) must be smoke-tested via `gh workflow run --ref <feature-branch>` before requesting merge, since PR-time CI structurally cannot exercise them. A companion terraform PR (jackin-project/jackin-github-terraform#14) raises the matching protection-rule changes — required-status-check list, single approving review, strict up-to-date branch policy. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
donbeave
added a commit
that referenced
this pull request
May 18, 2026
… env-var leaks (#273) Tier 3 follow-up to the post-#266 hardening. Add a PR-time `publish-manifest-rehearsal` job to `construct.yml` that runs the same docker buildx CLI plumbing `publish-manifest` depends on, without the side-effect steps (Docker Hub login, `imagetools create`, `imagetools inspect`). The previous hardening rounds were process- (#267 docs rules) and gate-side (#14 / #15 terraform branch protection). This is the workflow-side companion. The #266 break (`ERROR: no builder "jackin-construct" found`) was caused by hoisting `BUILDX_BUILDER` into workflow-level env, which docker buildx reads as the default-builder selection — surfacing only post-merge on main because `publish-manifest` is push-only and never runs on a `pull_request` event. `docker buildx ls` evaluates `BUILDX_BUILDER` at startup and exits non-zero on a missing-builder reference, regardless of whether the rest of the command would have hit the network. Running it on PR + feature-branch dispatch reproduces the failure shape without registry credentials. Two checks: `docker buildx ls` (catches workflow-level env-var leaks for any current or future buildx-controlling env var) and `docker buildx imagetools --help >/dev/null` (smoke-test that the buildx CLI plugin is bundled and loadable on the runner). The `construct-required` aggregator now lists `publish-manifest-rehearsal` in `needs:` so the rehearsal's result is rolled up into the same single check name branch protection requires. Equivalent rehearsals for `build-validator`, `deploy`, and `publish-preview` are deferred — those side-effect surfaces don't have the same network-free reproducibility shape `buildx ls` provides for docker, so the fidelity-vs-cost trade is worse. Reopen if a #266-class break surfaces in any of them. Signed-off-by: Alexey Zhokhov <alexey@zhokhov.com> Co-authored-by: Claude <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Hotfix the construct-image workflow break introduced by #256. After that PR merged, the
Construct Image / publish manifestjob onmainfailed with:Root cause
#256 hoisted
BUILDX_BUILDER: jackin-constructinto the workflow-levelenv:block when it dropped the redundantsetup-buildx-action/Bootstrap buildxpair from thepublish-manifestjob (correct on the merits —docker buildx imagetools create/inspectare registry-side ops that need no local builder). The unintended consequence:BUILDX_BUILDERis read by the docker buildx CLI as the default-builder selection. With the env var present and no builder of that name created in thepublish-manifestjob, everydocker buildxinvocation fails with "no builder found".The break did not surface in PR CI because
publish-manifestis gated topush to main || (workflow_dispatch && main)— neither runs on apull_requestevent, so the publish path was effectively untested before merge.Fix
Move
BUILDX_BUILDERout of the workflow-levelenv:block into thebuildjob's job-levelenv:block.publish-manifestno longer sees the env var, falls back to the default builder, andimagetoolsruns as designed. Thebuildjob'ssetup-buildx-action with: name: ${{ env.BUILDX_BUILDER }}resolves correctly because step-levelenvlookups walk the job-level env first.Verify locally
Checkout
export TIRITH=0Static checks
The push to
mainafter this PR merges will be the actual fidelity test —publish-manifestshould run and succeed.Hardening follow-ups
The class of break (env-var leak into a third-party CLI's default-selection semantics, surfacing only on push-to-main) was invisible to PR-time CI and to all three review passes. Two hardening PRs follow this one to make the same mistake catchable in PR next time:
jackinPR adding workflow-env-scoping guidance toAGENTS.mdand a "smoke-test push-only jobs viagh workflow runfrom the feature branch" rule toPULL_REQUESTS.md.jackin-github-terraformPR expanding the required-status-check list forjackinto include the newci-required,construct-required,docs-required, andvalidateaggregators, and turning onstrict_required_status_checks_policyplus a single required reviewer.