refactor(prompt): compress codeSystemPrompt (-51%, ~3.1k tokens/request) by esengine · Pull Request #1323 · esengine/DeepSeek-Reasonix

esengine · 2026-05-19T10:38:26Z

Summary

The system prompt had grown to 24,387 bytes (≈ 6,100 tokens) — every byte paid on every request. Much of it overlapped with tool descriptions sitting right next to it in the cache prefix: "When to propose a plan", "When to ask the user to pick", and "When to track multi-step intent" each recited rules the tool's own description already carried.

Aggressive dedup pass:

- Drop the redundant "you have these filesystem tools" opening sentence; the API ships the tool list separately.
- Merge the three independent submit_plan / ask_choice / todo_write sections into one short "Picking the right tool" block.
- Fold "Exploration", "Trust what you already know", and "When the user wants to switch project" into shorter equivalents: same rules, no narrative.
- Collapse the foreground/background section. The full how-to lives in the run_command / run_background tool descriptions; the prompt only needs the picking rule.
- Compress the prose around the six audit-mode rails (#610, "prompt: harden audit-mode reasoning against known failure modes"). Every rail's load-bearing phrase is preserved verbatim, so tests/code-prompt.test.ts still asserts on them.

Result: 24,387 → 11,956 bytes (-51%), a saving of ≈ 3,100 tokens per request.
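
The headline figures can be re-derived from the two byte counts. This is a minimal sketch, assuming roughly 4 bytes per token (the ratio implied by 24,387 bytes ≈ 6,100 tokens); `promptSavings` is an illustrative name, not something in the repository.

```typescript
// Hypothetical helper: not part of the repository.
// Assumes ~4 bytes per token, the ratio implied by 24,387 bytes ≈ 6,100 tokens.
const BYTES_PER_TOKEN = 4;

interface Savings {
  bytesSaved: number;
  percentSaved: number; // rounded to the nearest whole percent
  tokensSaved: number; // rounded estimate, not a tokenizer count
}

function promptSavings(beforeBytes: number, afterBytes: number): Savings {
  const bytesSaved = beforeBytes - afterBytes;
  return {
    bytesSaved,
    percentSaved: Math.round((bytesSaved / beforeBytes) * 100),
    tokensSaved: Math.round(bytesSaved / BYTES_PER_TOKEN),
  };
}

const savings = promptSavings(24_387, 11_956);
console.log(savings);
// bytesSaved: 12431, percentSaved: 51, tokensSaved: 3108
```

The token figure is only an estimate; an exact count would require running the model's tokenizer over both versions of the prompt.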

Combined with the regression net (#1320) and the tool-description compression (#1321), the cache-prefix tax is now roughly 16k tokens per request instead of ~36k.

What is preserved

- Every test-asserted phrase:
  - Identity is fixed by this prompt, SOUL.md, not a sub-profile
  - Auto-preview is for locating, not auditing, range:"A-B"
  - runtime behavior, current architectural state, whether a plan doc is still accurate
  - Flag → consumer trace, parallelSafe?: boolean
  - For inventory claims, grep the flag — don't enumerate from memory
  - No fabricated percentages, 40-60% tokens
  - Schema cost is real, tighten prompt / existing tool
  - MEMORY.md is part of the design space
  - User-facing ≠ model-facing ≠ library-facing, library exports (src/index.ts), Treating a library export as "dead code"
- The semantic_search routing fragment and lifecycle contract are untouched.
- The # Project .gitignore and # User System Append blocks still compose after the base in the same order, so the cache prefix shape is unchanged.
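
Why the cache prefix shape stays stable can be illustrated with a small sketch. It assumes the final prompt is built by concatenating the base prompt with optional blocks in a fixed order; `composePrompt`, `PromptBlock`, and the sample bodies are hypothetical and not the project's actual API.

```typescript
// Hypothetical sketch: names and signature are assumptions, not the repo's API.
// Optional blocks are appended after the base in a fixed order, so shrinking
// the base changes its bytes but not the order of the sections that follow.
interface PromptBlock {
  heading: string; // e.g. "# User System Append"
  body: string;
}

function composePrompt(base: string, blocks: PromptBlock[]): string {
  const sections = blocks
    .filter((block) => block.body.trim().length > 0) // skip empty blocks
    .map((block) => `${block.heading}\n${block.body.trim()}`);
  return [base.trim(), ...sections].join("\n\n");
}

const composed = composePrompt("BASE PROMPT", [
  { heading: "# Project .gitignore", body: "node_modules/\ndist/" },
  { heading: "# User System Append", body: "Prefer small diffs." },
]);
console.log(composed.startsWith("BASE PROMPT")); // true
```

Under this assumption, compressing the base leaves the relative order of the appended blocks untouched, which is the property the bullet above describes.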

This is PR #3 of a four-PR token-optimization series.

Test plan

- npm run verify: all 230 test files / 3,237 tests pass
- tests/code-prompt.test.ts: 26 tests, all green (every load-bearing phrase still matches)
- npm run lint: clean
- npm run typecheck: clean
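
The phrase check that guards this refactor can be approximated as follows. This is a sketch of the idea only: `missingPhrases` is a hypothetical helper and the sample prompt is invented for illustration; the real assertions live in tests/code-prompt.test.ts and may be written differently.

```typescript
// Hypothetical helper illustrating a phrase-regression check; the real
// assertions live in tests/code-prompt.test.ts and may differ.
function missingPhrases(prompt: string, phrases: readonly string[]): string[] {
  return phrases.filter((phrase) => !prompt.includes(phrase));
}

// A few of the load-bearing phrases listed under "What is preserved".
const loadBearing = [
  "No fabricated percentages",
  "Schema cost is real",
  "MEMORY.md is part of the design space",
] as const;

// Invented stand-in for a compressed prompt; one phrase is dropped on purpose.
const samplePrompt = [
  "No fabricated percentages.",
  "Schema cost is real.",
].join("\n");

const missing = missingPhrases(samplePrompt, loadBearing);
console.log(missing); // ["MEMORY.md is part of the design space"]
```

A check of this kind fails as soon as a compression pass rewords a protected phrase, which is what lets the surrounding prose shrink safely.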


…st) (esengine#1323)

The system prompt was 24,387 bytes (≈ 6,100 tokens), much of it overlapping with the tool descriptions sitting right next to it in the cache prefix. Sections like "When to propose a plan", "When to ask the user to pick", and "When to track multi-step intent" each recited rules that the tool's own description already carried.

Aggressive dedup pass:
- Drop the redundant "you have these filesystem tools" opening sentence; the API ships the tool list separately.
- Merge the three independent submit_plan / ask_choice / todo_write sections into one short "Picking the right tool" block.
- Fold "Exploration", "Trust what you already know", and "When the user wants to switch project" into shorter equivalents: same rules, no narrative.
- Collapse the foreground/background section. The full how-to lives in the run_command / run_background tool descriptions; the prompt only needs the picking rule.
- Compress the audit-mode rails (esengine#610) prose around the six rails themselves. Every rail's load-bearing phrase is preserved verbatim so tests/code-prompt.test.ts still asserts on them.

Result: 24,387 → 11,956 bytes (-51%, ≈ 3,100 tokens per request).

Combined with PR esengine#1320 / esengine#1321 the cache-prefix tax per request is now ~16k tokens instead of ~36k.

Behaviour unchanged: every rail / gate / mode constraint is still asserted by the existing prompt tests.

Co-authored-by: reasonix <reasonix@deepseek.com>

This was referenced May 19, 2026

feat(read_file): lower outline threshold 512 KiB → 64 KiB (recover 0.46.0 token regression) #1324

Merged

feat: Add Java source code search and decompilation tools. #1319

Closed

esengine merged commit e7fb669 into main May 19, 2026
4 checks passed

esengine deleted the feat/compress-system-prompt branch May 19, 2026 11:10

esengine merged 1 commit into main from feat/compress-system-prompt
