Skip to content

ci: gate base-action integration tests on secret + repo identity#3

Merged
nllptrx merged 14 commits into
giteafrom
ci/gate-integration-tests
Apr 22, 2026
Merged

ci: gate base-action integration tests on secret + repo identity#3
nllptrx merged 14 commits into
giteafrom
ci/gate-integration-tests

Conversation

@nllptrx

@nllptrx nllptrx commented Apr 22, 2026

Copy link
Copy Markdown
Owner

Summary

Follow-up to #2. The base-action integration workflows (test-*.yml) were inherited from upstream anthropics/claude-code-action and assume ANTHROPIC_API_KEY / CLAUDE_CREDENTIALS is always present. On this fork those secrets may be unset, so 31 CI jobs failed red on every PR with a pre-auth environment-validation error that had nothing to do with the PR's content.

Three layered cost/noise controls:

  1. Secret + repo-identity gate (job-level if:)

    • Skip on fork PRs (secrets unavailable anyway)
    • Skip cleanly when neither CLAUDE_CREDENTIALS nor ANTHROPIC_API_KEY is set
    • Outcome: forks without the secret show ~16 skipped checks instead of 31 failed ones
  2. paths: filter (workflow-level)

    • Only trigger when base-action/**, the workflow file, package.json, bun.lock, or tsconfig.json change
    • Docs-only, Gitea-action-only, or hooks-only PRs skip base-action tests entirely
    • workflow_dispatch remains available for manual runs
  3. concurrency: cancel-in-progress (workflow-level)

    • Group per workflow + branch
    • Rapid force-pushes on a PR cancel the in-flight run instead of running all N in parallel
    • Typical 5–10 force-push PR iteration drops billed runs by ~80%

Preserves if: always() on the structured-output summary job by composing: always() && <gate>.

Test plan

  • CI runs after merge → base-action tests skipped (no secret on fork) → checks show as skipped not failed
  • Push a docs-only PR → no base-action tests trigger at all (paths filter)
  • Force-push this PR rapidly → earlier run cancels (concurrency)
  • Set CLAUDE_CREDENTIALS on the fork → tests actually run on a subsequent push
  • Trigger via Actions UI (workflow_dispatch) → always runs regardless of paths

nllptrx added 2 commits April 22, 2026 22:14
…entity

Forks and PRs from forks can't access ANTHROPIC_API_KEY / CLAUDE_CREDENTIALS,
so these 31 jobs were failing red instead of skipping cleanly. Gate adds two
guards per job:

1. Same-repo PRs only — blocks fork PRs entirely (they can't access secrets
   anyway, but this prevents even runner-minute cost for the trivially-
   failing pre-auth check).
2. Secret presence — cleanly skip when neither CLAUDE_CREDENTIALS nor
   ANTHROPIC_API_KEY is set on the fork.

Gated jobs:
- test-base-action.yml: test-inline-prompt, test-prompt-file
- test-claude-env.yml: test-claude-env-with-comments
- test-custom-executables.yml: test-custom-executables
- test-mcp-servers.yml: test-mcp-integration, test-mcp-config-flag
- test-settings.yml: 4x settings jobs
- test-structured-output.yml: 5x type jobs + summary (preserves always()
  so summary still runs when tests fail, but still gated)

After this, forks without the secret show ~16 "skipped" checks instead of
31 red X's, and upstream anthropics's fork-PR cost vector is closed.
Two additional cost-prevention measures on top of the secret + repo-identity
gate from the previous commit:

1. paths: filter — only trigger integration tests when base-action/, the
   workflow file itself, or Bun/TS config changes. Non-base-action PRs (docs,
   Gitea action code, hooks, etc.) skip the tests entirely. Preserves manual
   coverage via workflow_dispatch.

2. concurrency: group + cancel-in-progress — rapid force-pushes on the same
   branch cancel in-flight runs instead of paying for all of them. Typical
   PR iteration (5-10 force-pushes during review) drops from ~31*N billed
   runs to ~31 billed runs.

Test-claude-env.yml also filters its push-to-main trigger to match (only run
when base-action-affecting files land on main).
Copilot AI review requested due to automatic review settings April 22, 2026 20:16

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates the inherited base-action integration test workflows to reduce CI noise/cost on forks and in repos without Anthropic/Claude credentials by adding conditional gating, path-based triggering, and run-cancellation concurrency.

Changes:

  • Add workflow-level paths: filters so base-action integration tests only run when relevant files change.
  • Add workflow-level concurrency: cancel-in-progress to cancel superseded runs per workflow+ref.
  • Add job-level if: gates to skip jobs on fork PRs and when required secrets are not present (including composing always() && <gate> for the summary job).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
.github/workflows/test-structured-output.yml Adds paths filter, concurrency, and secret/repo-identity job gating (incl. gated always() summary).
.github/workflows/test-settings.yml Adds paths filter, concurrency, and secret/repo-identity job gating.
.github/workflows/test-mcp-servers.yml Adds paths filter, concurrency, and secret/repo-identity job gating.
.github/workflows/test-custom-executables.yml Adds paths filter, concurrency, and secret/repo-identity job gating.
.github/workflows/test-claude-env.yml Adds push/pull_request paths filters, concurrency, and secret/repo-identity job gating.
.github/workflows/test-base-action.yml Adds paths filter, concurrency, and secret/repo-identity job gating.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/test-structured-output.yml Outdated
nllptrx added 10 commits April 22, 2026 22:20
ci-all.yml calls the test-* workflows via workflow_call. When every called
sub-workflow has all jobs skipped (fork, missing secret), ci-all itself
lands in a degenerate state: conclusion=failure, total_count=0 jobs.
That failure doesn't surface on the PR's statusCheckRollup (GitHub drops
zero-job runs from PR checks) but clutters the Actions tab with misleading
red entries.

Apply the same gate to each reusable-workflow job so the sub-call is
skipped at the ci-all level instead of at each grandchild job. ci-all
then only runs what actually runs (ci.yml → test/prettier/typecheck) and
reports honest pass/fail.

Also adds concurrency cancel-in-progress to ci-all so rapid force-pushes
don't pile up multiple orchestration runs.
Copilot review on PR #3 caught that this workflow was the only one passing
only anthropic_api_key without claude_code_oauth_token. With an OAuth-only
repo config (CLAUDE_CREDENTIALS set, ANTHROPIC_API_KEY unset) the new gate
lets the job run but base-action env validation fails at the pre-auth step.

Align with the other test-*.yml workflows: pass both, with ANTHROPIC_API_KEY
taking precedence when both are set and CLAUDE_CREDENTIALS feeding the
oauth token input.
Workflow dispatch revealed that every test-*.yml + ci-all.yml was rejected
at parse time:

  Unrecognized named-value: 'secrets'.
  (secrets.CLAUDE_CREDENTIALS != '' || secrets.ANTHROPIC_API_KEY != '')

GitHub Actions does NOT allow the `secrets` context in job-level `if:`
expressions. The earlier "0 jobs, conclusion=failure" runs I attributed
to degenerate state were actually parse failures — the workflows never
dispatched. The fork-check half of the gate was never evaluated.

Keep only the fork-check half, which uses `github.event.pull_request.*`
and is valid:

    if: >-
      github.event_name != 'pull_request' ||
      github.event.pull_request.head.repo.full_name == github.repository

This still closes the fork-PR cost vector (fork PRs skip). It no longer
skip-cleans when secrets are unset on the main fork — in that case base-
action's env validator fails red, same as before the gate. Acceptable
because `CLAUDE_CREDENTIALS` is now set on nllptrx/claude-code-action.

Also fix the test-structured-output summary `if:` where the bulk rewrite
produced wrong operator precedence: `always() && A || B` means `(always()
&& A) || B`. Add explicit parens.

If we later want skip-clean-on-missing-secret, the correct pattern is a
preflight job that outputs a boolean via GITHUB_OUTPUT, then `needs: +
if: needs.preflight.outputs.has_creds` on each downstream job. Not
needed for now.
Upstream pattern:
  anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY && secrets.ANTHROPIC_API_KEY || secrets.CLAUDE_CREDENTIALS }}
  claude_code_oauth_token: ${{ secrets.CLAUDE_CREDENTIALS }}

Assumed CLAUDE_CREDENTIALS is a backup **API key**. With an OAuth token in
CLAUDE_CREDENTIALS (current nllptrx config), both inputs received the
OAuth string — base-action's Anthropic client picked it up as x-api-key,
API returned "Invalid API key". All 6 dispatched runs failed on
SDK-execution.

Feed only ANTHROPIC_API_KEY into anthropic_api_key, OAuth stays in its
own channel. base-action's dual-path env setup handles the rest.
When base-action was synced from upstream, claude_env was dropped from
action.yml (upstream removed it in the SDK switchover). We kept the
parseClaudeEnv helper in run-claude.ts as a fork delta but never:

1. Re-declared claude_env in base-action/action.yml — GitHub Actions
   only emits INPUT_* env vars for declared inputs, so callers got a
   silent warning "Unexpected input(s) 'claude_env'" and the value
   was dropped before reaching our code.

2. Wired INPUT_CLAUDE_ENV into the runClaude() call in index.ts —
   the helper lived dead downstream.

Both halves restored. Fixes test-claude-env integration failure on
PR #3 (job 72583417897), where VAR1/VAR2 were never injected into
the Claude process and the echo assertion failed.

548 tests still pass.
Upstream dropped the allowed_tools / timeout_minutes inputs from
base-action; they get silently ignored as "Unexpected input(s)". The
Bash tool defaults to permission-denied without an explicit grant, so
the echo test never executed and VAR1/VAR2 weren't printed to the
transcript — grep found nothing and the assertion failed even though
claude_env injection itself was working.

Move the permission grant into claude_args, which IS a declared input.
Drop timeout_minutes entirely (SDK-level timeout isn't exposed as an
action input; the composite action's shell-level timeout was removed
with the CLI-subprocess switch).

Verified the claude_env fix separately: the warning output on the
previous run confirms claude_env was accepted and parsed ("Comments in
claude_env handled correctly" was echoed); only the Bash permission
blocked the actual variable expansion from landing in the log.
Composite-action run: steps don't auto-surface INPUT_* env vars from
declared inputs — each has to be mapped into the step's env: block.
I added the input declaration earlier (commit d96b464) but missed
wiring it here, so process.env.INPUT_CLAUDE_ENV was undefined,
parseClaudeEnv returned {}, and Claude's Bash tool saw empty
\$VAR1 / \$VAR2.

Diagnostic dump from run 24801530252 confirmed:

  tool_use:    "command": "echo \"VAR1: \$VAR1\" && echo \"VAR2: \$VAR2\""
  tool_result: "content": "VAR1: \nVAR2:"

With this wired, the test's grep for value1/value2 should finally hit.

Also drops the temporary diagnostic dump from test-claude-env.yml
now that it's served its purpose.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 11 comments.

Comments suppressed due to low confidence (1)

.github/workflows/test-custom-executables.yml:68

  • allowed_tools isn’t a declared input in base-action/action.yml, so this test currently isn’t actually constraining tools the way it intends. Move tool allowlisting into claude_args (e.g. --allowedTools ...) so the test validates the supported interface.
          path_to_claude_code_executable: /home/runner/.local/bin/claude
          path_to_bun_executable: /home/runner/.bun/bin/bun
          allowed_tools: "LS,Read"


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread .github/workflows/test-settings.yml
Comment thread .github/workflows/test-base-action.yml Outdated
Comment thread .github/workflows/test-mcp-servers.yml
Comment thread .github/workflows/test-structured-output.yml
Comment thread .github/workflows/test-claude-env.yml
Comment thread .github/workflows/ci-all.yml
Comment thread .github/workflows/test-settings.yml Outdated
Comment thread .github/workflows/test-base-action.yml
Comment thread .github/workflows/test-custom-executables.yml
Comment thread .github/workflows/test-mcp-servers.yml Outdated
nllptrx added 2 commits April 22, 2026 22:56
…ll paths

Copilot surfaced 12 comments on PR #3. Addressing the 5 that touch real
test integrity or real workflow behavior:

1. ci-all.yml — add pull_request paths filter at the orchestrator level.
   Individual test-*.yml already have paths filters, but workflow_call
   from ci-all bypasses the callee's filter (only the caller's triggers
   matter). Without this, docs-only PRs re-invoke every integration test
   via ci-all despite each callee filtering correctly when triggered
   directly.

2. test-mcp-servers.yml — port mcp_config input to claude_args.
   mcp_config is not a declared base-action input, so the test-mcp-config-flag
   job was silently dropping its entire reason for existing (testing the
   --mcp-config CLI flag). Now it actually passes the config.

3. test-base-action.yml (2 steps) — port allowed_tools and drop timeout_minutes.
   Same half-merge as mcp_config: neither is declared on base-action, so
   both were ignored. Tests passed by luck because LS/Read are default-
   permitted. Using claude_args makes the tool allowlist enforced and
   drops the noisy "Unexpected input" warning.

4. test-custom-executables.yml — same allowed_tools port.

5. test-settings.yml (4 steps) — drop timeout_minutes.
   The settings JSON handles all permission behavior in this workflow;
   timeout_minutes was pure dead-letter noise.

Not applied (7+ Copilot comments): all variants of "add
secrets.X != '' to job-level if:". GitHub Actions rejects the secrets
context in job-level if: at parse time — empirically verified earlier
in this PR (commit 5912558 message has the error text). A preflight
job emitting a $GITHUB_OUTPUT boolean is the syntactically correct
workaround but heavier than warranted here, since (a) fork PRs are
already repo-identity-gated, and (b) the main repo not having secrets
configured is a one-time setup issue that benefits more from a clear
red env-validation failure than a silent skip.
@nllptrx nllptrx merged commit 1ecbc1b into gitea Apr 22, 2026
25 checks passed
@nllptrx nllptrx deleted the ci/gate-integration-tests branch April 22, 2026 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants