Target Workflow: pelis-agent-factory-advisor
Source report: #1676
Estimated cost per run: $2.02
Total tokens per run: ~1,399K (759K input · 13K output · 627K cache_read)
Cache hit rate: 45.2% (cold on turn 1; warms progressively within the run)
LLM turns: 11 requests
Model: claude-sonnet-4.6 via Copilot endpoint
Session duration: ~4 minutes
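The headline numbers are mutually consistent if the cache hit rate is read as cache reads divided by all prompt-side tokens (input plus cache_read). That definition is an assumption inferred from the figures, not something the report states. A quick check:

```python
# Token counts from the report header, in thousands of tokens.
INPUT_K = 759
OUTPUT_K = 13
CACHE_READ_K = 627


def total_tokens_k(input_k: int, output_k: int, cache_read_k: int) -> int:
    """Sum of all token categories for one run."""
    return input_k + output_k + cache_read_k


def cache_hit_rate(input_k: int, cache_read_k: int) -> float:
    """Assumed definition: cache reads over all prompt-side tokens."""
    return cache_read_k / (input_k + cache_read_k)


print(total_tokens_k(INPUT_K, OUTPUT_K, CACHE_READ_K))   # 1399
print(f"{cache_hit_rate(INPUT_K, CACHE_READ_K):.1%}")     # 45.2%
```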
Current Configuration
| Setting | Value |
| --- | --- |
| Tools loaded | 4: agentic-workflows, bash (wildcard "*"), web-fetch, cache-memory |
| Network allowed | github.github.io only |
| Pre-agent steps | Yes: fetches 4 doc pages from github.github.io/gh-aw |
| Prompt size | 8,576 bytes (~1,165 words) |
| Input growth | 39,950 tokens (turn 1) → 83,324 tokens (turn 11) |
| Turns with heavy output | Turns 8–10 (1,794 / 2,227 / 4,959 output tokens; 33–88 s latency) |
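The input-growth row implies an average per-turn increase, which is a useful yardstick for what each avoidable tool-call round trip adds to later turns. This is a rough sketch: it assumes growth is spread evenly across turns, which the report does not claim.

```python
def avg_growth_per_turn(first_turn: int, last_turn: int, turns: int) -> float:
    """Average input-token growth between consecutive turns."""
    if turns < 2:
        raise ValueError("need at least two turns")
    return (last_turn - first_turn) / (turns - 1)


growth = avg_growth_per_turn(39_950, 83_324, 11)
print(round(growth))  # 4337
```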
Recommendations
1. Fix or remove the blocked githubnext/agentics fetch (Step 1.2)
Estimated savings: ~100–150K tokens/run (~8–11%)
Step 1.2 instructs the agent to call web-fetch on https://github.com/githubnext/agentics — but the network allowlist only permits github.github.io. Every attempt is blocked by the AWF firewall, causing the agent to spend 1–2 extra turns handling the 403/connection error and falling back.
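The mismatch can be illustrated with a simple hostname check. This is only a sketch: `host_allowed` is a hypothetical helper, and it assumes the firewall matches on hostname (exact or subdomain), which may not be how AWF actually evaluates its rules.

```python
from urllib.parse import urlparse


def host_allowed(url: str, allowed_hosts: list[str]) -> bool:
    """True if the URL's host equals an allowed host or is a subdomain of one.

    Assumption: matching is by hostname only; the real firewall may differ.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == a or host.endswith("." + a) for a in allowed_hosts)


allowed = ["github.github.io"]
print(host_allowed("https://github.com/githubnext/agentics", allowed))  # False
print(host_allowed("https://github.github.io/gh-aw/", allowed))         # True
```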
Option A (recommended) — pre-fetch the content in a steps: entry:
steps:
  - name: Fetch Pelis Agent Factory Docs
    id: fetch-docs
    run: |
      set -o pipefail
      BASE="(github.github.io/redacted)"
      OUTFILE="${GITHUB_WORKSPACE}/.pelis-agent-factory-docs.txt"
      : > "$OUTFILE"
      for PATH_SUFFIX in \
        "/blog/2026-01-12-welcome-to-pelis-agent-factory/" \
        "/introduction/overview/" \
        "/guides/workflow-patterns/" \
        "/guides/best-practices/"; do
        echo "### ${BASE}${PATH_SUFFIX}" >> "$OUTFILE"
        # Strip tags, decode entities, keep the first 8,000 characters per page
        curl -sf "${BASE}${PATH_SUFFIX}" \
          | python3 -c "import sys,html,re;t=sys.stdin.read();print(html.unescape(re.sub('<[^>]+>','',t))[:8000])" \
          >> "$OUTFILE" 2>/dev/null \
          || echo "(not found)" >> "$OUTFILE"
        echo "" >> "$OUTFILE"
      done
  - name: Fetch Agentics Patterns # NEW
    id: fetch-agentics
    run: |
      OUT="${GITHUB_WORKSPACE}/.agentics-patterns.txt"
      # Download first, then truncate. Piping curl straight into head hides a
      # curl failure unless pipefail is set, so the fallback would never run.
      if curl -sf "https://raw.githubusercontent.com/githubnext/agentics/main/README.md" -o "${OUT}.full"; then
        head -c 8000 "${OUT}.full" > "$OUT"
      else
        echo "(not available)" > "$OUT"
      fi
      rm -f "${OUT}.full"
Then add raw.githubusercontent.com to network.allowed and update the prompt to read .agentics-patterns.txt instead of using web-fetch. This also lets you remove the web-fetch tool entirely (see Recommendation 3).
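For readers unpicking the inline `python3 -c` one-liner in the docs pre-step, here is the same transformation as a standalone function. `html_to_text` is a name chosen for illustration. The regex removes anything that looks like a tag, so it inserts no whitespace where tags were and leaves the contents of script or style elements in place.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")


def html_to_text(page: str, limit: int = 8000) -> str:
    """Strip tags, decode HTML entities, then truncate to `limit` characters."""
    return html.unescape(TAG_RE.sub("", page))[:limit]


print(html_to_text("<h1>Patterns &amp; Practices</h1><p>Use pre-steps.</p>"))
# Patterns & PracticesUse pre-steps.
```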
Option B (simpler) — remove Step 1.2: The agentics repo content is supplementary. Deleting the two web-fetch calls for it from the prompt removes the blocked-fetch retry loop with no loss of core functionality, since the cache-memory already persists discovered patterns.
2. Pre-compute repository structure in steps:
Estimated savings: ~100–130K tokens/run (~8–9%)
Phase 2.2 of the prompt instructs the agent to run a series of fully deterministic shell commands (ls -la, find .github/workflows -name "*.md", ls tests/, ls scripts/). These produce the same output every time and add 1–2 LLM turns of bash tool-call overhead (request + response context accumulation).
Move these to a pre-agent step and inject via $GITHUB_ENV or a file:
- name: Collect Repo Structure
  id: repo-structure
  run: |
    {
      echo "=== Root files ==="
      ls -la
      echo ""
      echo "=== Agentic workflows ==="
      find .github/workflows -name "*.md" -type f | sort
      echo ""
      echo "=== Tests ==="
      ls -la tests/ 2>/dev/null || echo "(no tests/)"
      echo ""
      echo "=== Scripts ==="
      ls -la scripts/ 2>/dev/null || echo "(no scripts/)"
    } > "${GITHUB_WORKSPACE}/.repo-structure.txt"
Update the prompt to read .repo-structure.txt instead of running bash commands, and remove or simplify Phase 2.2's bash instructions. This shifts ~2 agent tool-call round-trips into a pre-step, avoiding ~60–70K tokens of input context growth for those turns and their downstream history accumulation.
3. Remove the web-fetch tool
Estimated savings: ~6,000–8,000 tokens/run (~0.5%)
Once the documentation for Steps 1.1 and 1.2 is pre-fetched (Recommendation 1), the web-fetch tool is no longer needed. Removing it drops the tool schema from every turn's context (~600–700 tokens × 11 turns).
In the workflow frontmatter, delete:
# Remove this line:
web-fetch:
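The savings estimate for this recommendation is a straight multiplication, sketched here. The per-turn schema size is the report's own estimate. If the run drops to fewer turns, the saving shrinks proportionally.

```python
def schema_tokens_saved(per_turn_low: int, per_turn_high: int, turns: int) -> tuple[int, int]:
    """Tokens saved by dropping one tool schema from every turn's context."""
    return per_turn_low * turns, per_turn_high * turns


low, high = schema_tokens_saved(600, 700, 11)
print(low, high)  # 6600 7700
```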
4. Reduce prompt verbosity (~40% reduction)
Estimated savings: ~30–40K tokens/run (~3%)
The prompt is 8,576 bytes of highly detailed multi-phase instructions. At roughly 4 bytes per token, that is about 2,100 of the 39,950 input tokens on turn 1, and it is carried in context on every later turn. Much of it can be condensed:
Phase 1 cache check (currently ~600 bytes) can become:
## Phase 1: Learn Pelis Agent Factory Patterns
Check cache-memory for `pelis_docs_hash`. Hash `.pelis-agent-factory-docs.txt`
and `.agentics-patterns.txt`. If unchanged, skip to Phase 2 using cached knowledge.
Otherwise read both files and update the hash in cache-memory.
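The hash check described in that condensed prompt could look roughly like the sketch below. In the workflow the agent would do this through its tools. Here the cache-memory store is modelled as a local JSON file, and the `docs_changed` helper and the cache file path are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def docs_hash(paths: list[Path]) -> str:
    """SHA-256 over the contents of the pre-fetched files, in the given order."""
    digest = hashlib.sha256()
    for path in paths:
        digest.update(path.read_bytes())
    return digest.hexdigest()


def docs_changed(paths: list[Path], cache_file: Path, key: str = "pelis_docs_hash") -> bool:
    """Return False if the cached hash matches; otherwise store the new hash and return True."""
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    current = docs_hash(paths)
    if cache.get(key) == current:
        return False
    cache[key] = current
    cache_file.write_text(json.dumps(cache))
    return True
```

A `False` result corresponds to "skip to Phase 2 using cached knowledge"; `True` means both files should be re-read.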
Phase 2.2 repository analysis (currently ~400 bytes of redundant bash listings) can reference the pre-fetched file:
## Phase 2: Analyze Repository
Pre-computed structure is in `.repo-structure.txt`. Agentic workflow definitions
are in `.github/workflows/*.md`. Review them to understand current automation coverage.
Phase 3 opportunity categories (currently ~600 bytes of bullet lists) can be trimmed to the top 5 most relevant categories for this repo (security automation, test coverage, release, documentation, monitoring).
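Estimated prompt reduction: ~3,500–4,000 bytes (~875–1,000 tokens at ~4 bytes per token). Carried through 11 turns of context, the direct saving is roughly 10–11K tokens, which is smaller than the ~30–40K headline estimate, so treat that figure as optimistic.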
5. Restrict bash to specific commands
Estimated savings: ~2,000–4,000 tokens/run (tool schema reduction)
bash: ["*"] loads schema entries for every bash pattern. Restrict to what the workflow actually uses:
tools:
  agentic-workflows:
  bash:
    - "cat"
    - "find"
    - "ls"
    - "grep"
  cache-memory: true
This is also a defence-in-depth improvement: it reduces the agent's command surface.
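To make the effect of an allowlist concrete, here is a toy checker. It is not how gh-aw enforces the `bash:` list (the report does not describe the matching rules). `command_allowed` is a hypothetical helper that looks only at the first word and ignores pipes, subshells, and command chaining.

```python
import shlex

ALLOWED = frozenset({"cat", "find", "ls", "grep"})


def command_allowed(command_line: str, allowed: frozenset[str] = ALLOWED) -> bool:
    """True if the first word of the command line is in the allowlist."""
    try:
        words = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(words) and words[0] in allowed


print(command_allowed('find .github/workflows -name "*.md"'))  # True
print(command_allowed("curl https://example.com"))             # False
```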
Expected Impact
| Metric | Current | Projected | Savings |
| --- | --- | --- | --- |
| Total tokens/run | 1,399K | ~1,100K | ~-21% |
| Cost/run | $2.02 | ~$1.60 | ~-$0.42 |
| LLM turns | 11 | ~8–9 | -2 to -3 |
| Turn 1 input tokens | 39,950 | ~28,000 | ~-30% |
| Cache hit rate | 45.2% | ~50–55% | +5–10pp |
Savings are conservative estimates. The biggest wins come from Recommendations 1 and 2 (removing wasted turns from blocked fetches and pre-computing deterministic work).
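The projected figures can be sanity-checked against the per-recommendation ranges. The sketch below simply adds the stated ranges and computes percentage changes. It assumes the savings are additive, which may overstate the total where recommendations overlap.

```python
def pct_change(current: float, projected: float) -> float:
    """Percentage change from current to projected (negative means a reduction)."""
    return (projected - current) / current * 100


# Estimated savings per recommendation, in thousands of tokens (low, high).
savings_k = [(100, 150), (100, 130), (6, 8), (30, 40), (2, 4)]
low = sum(lo for lo, _ in savings_k)
high = sum(hi for _, hi in savings_k)
print(low, high)  # 238 332

print(f"{pct_change(1399, 1100):.1f}%")    # -21.4%
print(f"{pct_change(2.02, 1.60):.1f}%")    # -20.8%
print(f"{pct_change(39950, 28000):.1f}%")  # -29.9%
```

The projected drop of about 299K tokens falls inside the 238–332K range obtained by summing the individual estimates.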
Implementation Checklist
- Add fetch-agentics pre-step; add raw.githubusercontent.com to network.allowed; remove Step 1.2 web-fetch instructions from prompt
- Add collect-repo-structure pre-step; replace Phase 2.2 bash commands with file-read instruction
- Remove web-fetch: from tools: in frontmatter (after Recs 1 & 2 land)
- Replace bash: ["*"] with specific command allowlist
- Run gh-aw compile .github/workflows/pelis-agent-factory-advisor.md
- Run npx tsx scripts/ci/postprocess-smoke-workflows.ts
- Compare token-usage.jsonl on new run vs baseline (target: ≤ 1,100K tokens, ≤ $1.60)
Generated by Daily Copilot Token Optimization Advisor · ● 669.2K · ◷