perf: reduce smoke-copilot token usage with pre-steps and tool trimming #1613

Merged
lpcox merged 3 commits into main from opt/smoke-copilot-token-reduction on Apr 2, 2026

@lpcox lpcox commented Apr 2, 2026

Summary

Reduces smoke-copilot token usage by an estimated 60-70% per run through four optimizations identified in #1604.

Changes

1. Tool Surface Reduction (est. ~30K tokens saved)

Removed 5 unused tool groups, keeping only what the workflow actually uses:

  • Removed: agentic-workflows, cache-memory, edit, playwright, web-fetch
  • Kept: bash, github (restricted to [repos, pull_requests] toolsets)
  • Previously: 56 tools loaded (~651 tokens each for schemas)
  • Now: ~10 tools (bash + github repos/PRs + safeoutputs)
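The savings estimate can be cross-checked with simple arithmetic on the figures quoted above. This is a minimal sketch, assuming the ~651 tokens-per-tool average applies uniformly to every tool schema; the function name `schema_tokens` is illustrative and not part of the workflow.

```shell
#!/usr/bin/env bash
# Rough schema-overhead estimate from the figures in this section.
# Assumption: every tool schema costs about the same number of tokens.
TOKENS_PER_TOOL=651

# schema_tokens <tool_count>: prints the estimated schema tokens per request.
schema_tokens() {
  echo $(( $1 * TOKENS_PER_TOOL ))
}

before=$(schema_tokens 56)   # 36456, i.e. roughly 36K
after=$(schema_tokens 10)    # 6510, i.e. roughly 6.5K
echo "before=$before after=$after saved=$(( before - after ))"
```

The difference comes to 29946 tokens, which is in line with the ~30K estimate in the heading.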

2. Pre-Agent Steps (moves work out of LLM context)

Added a steps: block that runs before the agent, outside the firewall sandbox:

  • Fetches last 2 merged PRs via gh pr list (deterministic, no LLM needed)
  • Checks github.com connectivity via curl (replaces Playwright browser test)
  • Creates and reads back test file (deterministic file I/O test)
  • Results exported as step outputs for agent context

3. Simplified Agent Prompt

Agent now just needs to:

  • Verify one GitHub MCP call works (connectivity proof)
  • Read pre-computed results
  • Format a brief summary comment
  • Call safe-outputs

4. Network & Config Cleanup

  • Removed unused network groups: node, playwright
  • Removed sandbox.mcp.container (not allowed in strict mode)
  • Fixed safe-outputs YAML indentation (was incorrectly nested under sandbox)

Token Impact Analysis

| Metric | Before | After (est.) |
| --- | --- | --- |
| Tool schemas loaded | ~56 tools (~36K tokens) | ~10 tools (~6.5K tokens) |
| Tool probing turn | 75 tool calls (Turn 1 wasted) | Eliminated |
| Playwright overhead | 21 tools + network group | Removed (curl in pre-step) |
| Deterministic work | All done by LLM | Moved to pre-agent step |
| Total per-run tokens | ~210K | ~70-80K (est.) |
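The 60-70% figure in the summary can be checked against the per-run totals. This is a small sketch using integer arithmetic, so results are rounded down; `percent_reduction` is a hypothetical helper name, and the inputs are the estimates quoted here, not measurements.

```shell
#!/usr/bin/env bash
# percent_reduction <before> <after>: integer percentage saved (rounded down).
percent_reduction() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# Per-run totals in thousands of tokens: ~210K before, ~70-80K after.
echo "best case:  $(percent_reduction 210 70)%"   # 66%
echo "worst case: $(percent_reduction 210 80)%"   # 61%
```

Both ends of the estimated range fall inside the 60-70% reduction claimed in the summary.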

Ref: #1604

- Remove unused tools: agentic-workflows, cache-memory, edit, playwright, web-fetch
  (56 tools → ~10, saving ~30K tokens/request from schema overhead)
- Remove unused network groups: node, playwright
- Restrict github toolsets to [repos, pull_requests]
- Add pre-agent step to deterministically pre-compute:
  - Last 2 merged PRs via gh CLI
  - GitHub.com connectivity check via curl
  - File write/read test
- Simplify agent prompt to verify pre-computed results and format output
- Replace Playwright browser test with curl connectivity check in pre-step
- Fix safe-outputs YAML indentation (was over-indented)
- Remove sandbox.mcp.container (not allowed in strict mode)

Expected token reduction: ~60-70% per run by eliminating tool probing
turn, reducing tool schema overhead, and moving deterministic work
out of the LLM context.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@lpcox lpcox requested a review from Mossaka as a code owner April 2, 2026 18:00
Copilot AI review requested due to automatic review settings April 2, 2026 18:00

github-actions bot commented Apr 2, 2026

✅ Coverage Check Passed

Overall Coverage

| Metric | Base | PR | Delta |
| --- | --- | --- | --- |
| Lines | 85.81% | 85.91% | 📈 +0.10% |
| Statements | 85.69% | 85.79% | 📈 +0.10% |
| Functions | 86.71% | 86.71% | ➡️ +0.00% |
| Branches | 78.50% | 78.55% | 📈 +0.05% |

📁 Per-file Coverage Changes (1 file)

| File | Lines (Before → After) | Statements (Before → After) |
| --- | --- | --- |
| src/docker-manager.ts | 86.1% → 86.5% (+0.40%) | 85.6% → 86.0% (+0.39%) |

Coverage comparison generated by scripts/ci/compare-coverage.ts

github-actions bot commented Apr 2, 2026

Smoke Test Results

Test Status
GitHub MCP (last 2 merged PRs: "feat: add Daily Claude Token Usage Analyzer workflow", "fix: extract OpenAI/Copilot cached_tokens from prompt_tokens_details")
Playwright (github.com title contains "GitHub")
File write (smoke-test-claude-23914557569.txt)
Bash verify (cat confirmed content)

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1613

Copilot AI left a comment

Pull request overview

This PR updates the Smoke Copilot agentic workflow to significantly reduce token usage by shrinking the tool surface, moving deterministic checks into pre-agent steps, and simplifying the agent prompt.

Changes:

  • Reduced configured MCP/tool surface (removed unused tool groups; restrict GitHub MCP toolsets).
  • Added a pre-agent steps: block to pre-compute smoke-test data (merged PR list, github.com connectivity, file IO check) and reference results in the prompt.
  • Updated the compiled workflow lockfile to gh-aw v0.65.3, reflecting the new tool/network configuration and safe-outputs wiring.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| .github/workflows/smoke-copilot.md | Trims tools/network, adds pre-agent steps, and rewrites the prompt to use pre-computed results. |
| .github/workflows/smoke-copilot.lock.yml | Regenerates the compiled workflow with the new configuration and updated gh-aw version. |
Comments suppressed due to low confidence (1)

.github/workflows/smoke-copilot.lock.yml:345

  • The smoke-data step runs in the agent job and sets step outputs, but the prompt artifact is generated earlier in activation. As a result, the step outputs can’t influence the agent prompt (and other jobs can’t consume them unless written to a file/artifact). Prefer persisting the computed results to a known file under /tmp/gh-aw and referencing that file in the prompt, or relocate smoke-data into the activation job before prompt generation.
      - name: Configure gh CLI for GitHub Enterprise
        run: bash ${RUNNER_TEMP}/gh-aw/actions/configure_gh_for_ghe.sh
        env:
          GH_TOKEN: ${{ github.token }}
      - env:
          GH_TOKEN: ${{ github.token }}
        id: smoke-data
        name: Pre-compute smoke test data
        run: |
          echo "::group::Fetching last 2 merged PRs"
          PR_DATA=$(gh pr list --repo "$GITHUB_REPOSITORY" --state merged --limit 2 \
            --json number,title,author,mergedAt \
            --jq '.[] | "PR #\(.number): \(.title) (by @\(.author.login), merged \(.mergedAt))"')
          echo "$PR_DATA"
          echo "::endgroup::"

          echo "::group::GitHub.com connectivity check"
          HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 https://github.com)
          echo "github.com returned HTTP $HTTP_CODE"
          echo "::endgroup::"

          echo "::group::File write/read test"
          TEST_DIR="/tmp/gh-aw/agent"
          TEST_FILE="$TEST_DIR/smoke-test-copilot-${GITHUB_RUN_ID}.txt"
          mkdir -p "$TEST_DIR"
          echo "Smoke test passed for Copilot at $(date)" > "$TEST_FILE"
          FILE_CONTENT=$(cat "$TEST_FILE")
          echo "Wrote and read back: $FILE_CONTENT"
          echo "::endgroup::"

          # Export results for agent context
          {
            echo "SMOKE_PR_DATA<<SMOKE_EOF"
            echo "$PR_DATA"
            echo "SMOKE_EOF"
            echo "SMOKE_HTTP_CODE=$HTTP_CODE"
            echo "SMOKE_FILE_CONTENT=$FILE_CONTENT"
            echo "SMOKE_FILE_PATH=$TEST_FILE"
          } >> "$GITHUB_OUTPUT"


Comment on lines +101 to +120
+### 1. GitHub MCP Testing
+The last 2 merged pull requests have been fetched. Verify MCP connectivity by calling `github-list_pull_requests` for ${{ github.repository }} (limit 1, state merged) and confirm data is returned.
+
+### 2. GitHub.com Connectivity
+Pre-step result: HTTP ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }} from github.com.
+✅ if HTTP 200 or 301, ❌ otherwise.
+
+### 3. File Write/Read Test
+Pre-step wrote and read back: "${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}"
+File path: ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}
+Verify by running `cat` on the file path using bash to confirm it exists.
+
+### 4. Bash Tool Testing
+Run a simple bash command (e.g., `echo "bash works"`) to verify the bash tool is functional.
+
+## Pre-Fetched PR Data
+
-1. **GitHub MCP Testing**: Review the last 2 merged pull requests in ${{ github.repository }}
-2. **Playwright Testing**: Use playwright to navigate to https://github.com and verify the page title contains "GitHub"
-3. **File Writing Testing**: Create a test file `/tmp/gh-aw/agent/smoke-test-copilot-${{ github.run_id }}.txt` with content "Smoke test passed for Copilot at $(date)" (create the directory if it doesn't exist)
-4. **Bash Tool Testing**: Execute bash commands to verify file creation was successful (use `cat` to read the file back)
+```
+${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
+```

Copilot AI Apr 2, 2026

The prompt references ${{ steps.smoke-data.outputs.* }} (HTTP code, file content/path, PR data). In gh-aw compiled workflows the prompt is rendered during the activation job, before the agent job runs steps:; these steps.* outputs will be empty and the agent will see missing data. Consider writing the pre-computed results to a deterministic file path (e.g., /tmp/gh-aw/smoke-data.env) and instructing the agent to cat it, or move the pre-compute step into the same job that renders the prompt (activation) so the values can be injected.

See below for a potential fix:

A pre-step has stored the HTTP status code from github.com in `/tmp/gh-aw/smoke-data.env` under the key `SMOKE_HTTP_CODE`.
Use the bash tool to read this value (for example, by running `cat /tmp/gh-aw/smoke-data.env`) and interpret it as:
✅ if HTTP 200 or 301, ❌ otherwise.

### 3. File Write/Read Test
The same pre-step recorded `SMOKE_FILE_CONTENT` and `SMOKE_FILE_PATH` in `/tmp/gh-aw/smoke-data.env`.
Use bash to read these values from the env file, then run `cat` on the reported file path to confirm the file exists and contains the expected content.

### 4. Bash Tool Testing
Run a simple bash command (e.g., `echo "bash works"`) to verify the bash tool is functional.

## Pre-Fetched PR Data

A pre-step has written pre-fetched pull request data to `/tmp/gh-aw/smoke-pr-data.json`.
Use bash to inspect this file (for example, by running `cat /tmp/gh-aw/smoke-pr-data.json`) and use that data when summarizing PR titles and statuses.
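The file-based handoff suggested in this comment could look roughly like the sketch below. The helper names (`write_smoke_data`, `read_smoke_value`, `http_status_icon`) are hypothetical, and the demo writes to a temporary directory instead of `/tmp/gh-aw/smoke-data.env` so it can run anywhere. It assumes single-line values; multi-line PR data would go in a separate file, as the comment proposes.

```shell
#!/usr/bin/env bash
# write_smoke_data <env_file> <http_code> <file_content> <file_path>
# Persists pre-computed results as KEY=value lines so a later process
# (here, the agent) can read them without relying on step outputs.
write_smoke_data() {
  local env_file=$1
  mkdir -p "$(dirname "$env_file")"
  {
    echo "SMOKE_HTTP_CODE=$2"
    echo "SMOKE_FILE_CONTENT=$3"
    echo "SMOKE_FILE_PATH=$4"
  } > "$env_file"
}

# read_smoke_value <env_file> <key>: prints the value stored for <key>.
# Everything after the first "=" is treated as the value.
read_smoke_value() {
  grep "^$2=" "$1" | head -n 1 | cut -d= -f2-
}

# http_status_icon <code>: applies the "200 or 301 passes" rule from the prompt.
http_status_icon() {
  case "$1" in
    200|301) echo "✅" ;;
    *)       echo "❌" ;;
  esac
}

# Demo in a throwaway directory (the real path would be under /tmp/gh-aw).
demo_dir=$(mktemp -d)
write_smoke_data "$demo_dir/smoke-data.env" 200 "Smoke test passed" "/tmp/gh-aw/agent/example.txt"
code=$(read_smoke_value "$demo_dir/smoke-data.env" SMOKE_HTTP_CODE)
echo "github.com: HTTP $code $(http_status_icon "$code")"
rm -rf "$demo_dir"
```

Reading values with `grep` and `cut` instead of sourcing the file avoids executing its contents, which matters because the file content value contains spaces and a date string.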

Comment on lines +159 to +162
GH_AW_EXPR_2805DAC9: ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}
GH_AW_EXPR_7EA93000: ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}
GH_AW_EXPR_ABDF8D58: ${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
GH_AW_EXPR_EC16C26C: ${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}

Copilot AI Apr 2, 2026

In the activation job, Create prompt with built-in context exports GH_AW_EXPR_* values from ${{ steps.smoke-data.outputs.* }}, but there is no smoke-data step in the activation job. This will render the prompt with empty placeholders. To fix, run the smoke-data step in activation before prompt creation (or stop injecting step outputs into the prompt and have the agent read results from a file created in the agent job).

This issue also appears on line 336 of the same file.

Suggested change
-GH_AW_EXPR_2805DAC9: ${{ steps.smoke-data.outputs.SMOKE_FILE_PATH }}
-GH_AW_EXPR_7EA93000: ${{ steps.smoke-data.outputs.SMOKE_HTTP_CODE }}
-GH_AW_EXPR_ABDF8D58: ${{ steps.smoke-data.outputs.SMOKE_PR_DATA }}
-GH_AW_EXPR_EC16C26C: ${{ steps.smoke-data.outputs.SMOKE_FILE_CONTENT }}
+GH_AW_EXPR_2805DAC9: ""
+GH_AW_EXPR_7EA93000: ""
+GH_AW_EXPR_ABDF8D58: ""
+GH_AW_EXPR_EC16C26C: ""


The gh-aw v0.65+ compiler no longer sets GH_AW_SAFE_OUTPUTS as a
job-level env var (it uses step outputs instead). The post-step
validation was falling back to /opt/gh-aw/safeoutputs/outputs.jsonl
which doesn't exist. Use ${RUNNER_TEMP}/gh-aw/safeoutputs/outputs.jsonl
as the fallback instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
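The fallback described in this commit message can be expressed with nested shell parameter expansion. This is a sketch only: `resolve_safe_outputs_path` is a hypothetical name, and the `/tmp` default for `RUNNER_TEMP` is an assumption added so the snippet also runs outside GitHub Actions.

```shell
#!/usr/bin/env bash
# resolve_safe_outputs_path: prints the safe-outputs file to validate.
# Prefers GH_AW_SAFE_OUTPUTS when it is set and non-empty; otherwise falls
# back to the RUNNER_TEMP location named in the commit message.
# Assumption: RUNNER_TEMP defaults to /tmp when run outside Actions.
resolve_safe_outputs_path() {
  echo "${GH_AW_SAFE_OUTPUTS:-${RUNNER_TEMP:-/tmp}/gh-aw/safeoutputs/outputs.jsonl}"
}

outputs_file=$(resolve_safe_outputs_path)
if [ -f "$outputs_file" ]; then
  echo "validating $outputs_file"
else
  echo "no safe-outputs file at $outputs_file"
fi
```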

github-actions bot commented Apr 2, 2026

Smoke Test Results — PASS

GitHub MCP: PR #1517 "fix: prevent Squid config injection via --allow-domains and --allow-urls", PR #1446 "fix: add picomatch overrides for GHSA-c2c7-rcm5-vvqj"
Playwright: github.com title contains "GitHub"
File Write: /tmp/gh-aw/agent/smoke-test-claude-23915028354.txt created
Bash: File content verified via cat

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1613

github-actions bot commented Apr 2, 2026

🏗️ Build Test Suite Results

⚠️ ALL CLONES FAILED: gh CLI is not authenticated (no GH_TOKEN set). Unable to clone any test repositories.

Ecosystem Project Build/Install Tests Status
Bun elysia N/A ❌ CLONE_FAILED
Bun hono N/A ❌ CLONE_FAILED
C++ fmt N/A ❌ CLONE_FAILED
C++ json N/A ❌ CLONE_FAILED
Deno oak N/A ❌ CLONE_FAILED
Deno std N/A ❌ CLONE_FAILED
.NET hello-world N/A ❌ CLONE_FAILED
.NET json-parse N/A ❌ CLONE_FAILED
Go color N/A ❌ CLONE_FAILED
Go env N/A ❌ CLONE_FAILED
Go uuid N/A ❌ CLONE_FAILED
Java gson N/A ❌ CLONE_FAILED
Java caffeine N/A ❌ CLONE_FAILED
Node.js clsx N/A ❌ CLONE_FAILED
Node.js execa N/A ❌ CLONE_FAILED
Node.js p-limit N/A ❌ CLONE_FAILED
Rust fd N/A ❌ CLONE_FAILED
Rust zoxide N/A ❌ CLONE_FAILED

Overall: 0/8 ecosystems passed — ❌ FAIL

Error: gh: To use GitHub CLI in a GitHub Actions workflow, set the GH_TOKEN environment variable.

All repository clones failed because GH_TOKEN is not set in this workflow environment. To fix this, ensure the workflow passes GH_TOKEN (e.g., ${{ github.token }}) to the agent environment so gh repo clone can authenticate.

Generated by Build Test Suite for issue #1613
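A script that depends on `gh` could fail fast with a clear message before attempting any clone. The guard below is a minimal sketch; `require_gh_token` is a hypothetical helper, and it checks only the `GH_TOKEN` variable named in the error above.

```shell
#!/usr/bin/env bash
# require_gh_token: returns 0 when GH_TOKEN is set and non-empty;
# otherwise prints a message to stderr and returns 1.
require_gh_token() {
  if [ -z "${GH_TOKEN:-}" ]; then
    echo "GH_TOKEN is not set; pass it to this environment before calling gh" >&2
    return 1
  fi
}

if require_gh_token 2>/dev/null; then
  echo "token present"
else
  echo "token missing"
fi
```

In a workflow step, the variable would typically be supplied through an `env:` entry such as `GH_TOKEN: ${{ github.token }}`, as the comment above suggests.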

The post-processor's checkout injection logic was fooled by the
conditional 'Checkout repository for patch context' step in the
detection job. Since it has an 'if:' guard, it may be skipped,
leaving npm ci without a package-lock.json.

Fix the regex to skip conditional checkouts (those with an 'if:'
on the next line) so an unconditional checkout is properly injected.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
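The post-processor's regex is not shown in this thread, so the following illustrates the same idea with a different technique: an `awk` line scan. It assumes the `if:` guard, when present, sits on the line directly below a `- name:` line that mentions "Checkout". The function name and the sample `if:` expression are hypothetical.

```shell
#!/usr/bin/env bash
# has_unconditional_checkout <workflow_file>
# Exit status 0 if at least one step whose "- name:" line mentions "Checkout"
# is NOT followed by an "if:" line; exit status 1 otherwise.
has_unconditional_checkout() {
  awk '
    # The previous line named a checkout step; an "if:" here marks it conditional.
    pending && $0 !~ /^[ \t]*if:/ { found = 1 }
    # Remember whether this line names a checkout step, for the next line.
    { pending = ($0 ~ /^[ \t]*-[ \t]*name:.*[Cc]heckout/) }
    END { if (pending) found = 1; exit(found ? 0 : 1) }
  ' "$1"
}

# Demo on a throwaway file that contains only a conditional checkout.
demo=$(mktemp)
cat > "$demo" <<'YAML'
steps:
  - name: Checkout repository for patch context
    if: needs.example.outputs.has_patch == 'true'
    uses: actions/checkout@v4
  - name: Install dependencies
    run: npm ci
YAML

if has_unconditional_checkout "$demo"; then
  echo "unconditional checkout present"
else
  echo "only conditional checkouts found; inject one before npm ci"
fi
rm -f "$demo"
```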

github-actions bot commented Apr 2, 2026

Smoke Test Results

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude for issue #1613

github-actions bot commented Apr 2, 2026

🔥 Smoke Test Results

| Test | Result |
| --- | --- |
| GitHub MCP | ✅ PR list returned |
| GitHub.com HTTP | ✅ Connectivity confirmed |
| File write/read | smoke-test-copilot-23915369631.txt verified |
| Bash tool | ✅ Working |

Overall: PASS

PR by @lpcox · Reviewer: @Mossaka

📰 BREAKING: Report filed by Smoke Copilot

github-actions bot commented Apr 2, 2026

Smoke Test: GitHub Actions Services Connectivity ✅

| Service | Check | Result |
| --- | --- | --- |
| Redis | PING host.docker.internal:6379 | PONG |
| PostgreSQL | pg_isready host.docker.internal:5432 | ✅ accepting connections |
| PostgreSQL | SELECT 1 on smoketest db as postgres | ✅ returned 1 |

All checks passed. (redis-cli unavailable; Redis verified via raw TCP socket.)

🔌 Service connectivity validated by Smoke Services

github-actions bot commented Apr 2, 2026

Smoke Test Results (Codex)

  • PR titles: "fix: prevent Squid config injection via --allow-domains and --allow-urls"; "feat: add Daily Claude Token Usage Analyzer workflow"
  • GitHub MCP merged PR review: ✅
  • safeinputs-gh PR query: ❌ (tool unavailable)
  • Playwright github.com title check: ❌ (MCP EACCES)
  • Tavily web search: ❌ (tool unavailable)
  • File write/read + bash cat: ✅
  • Build (npm ci && npm run build): ✅
  • Discussion query/comment: ❌ (tool unavailable)

Overall status: FAIL

🔮 The oracle has spoken through Smoke Codex

@lpcox lpcox merged commit 9fc23b7 into main Apr 2, 2026
57 of 60 checks passed
@lpcox lpcox deleted the opt/smoke-copilot-token-reduction branch April 2, 2026 18:45