perf: add contextInjection option to reduce workspace token waste by ~93% #28072
dsantoreis wants to merge 12 commits into
Greptile SummaryAdds Key changes:
Benefits:
Confidence Score: 5/5
Last reviewed commit: ed98b12 |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ed98b12d1e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 57e915c649
|
The token-savings idea here is good, but |
|
I’m not convinced |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fa9c563fcc
|
This is a great optimization, and there's a security angle worth considering too. With workspace files injected on every message, a long conversation means those files (which often contain credentials, API keys, and personal data from AGENTS.md/TOOLS.md/USER.md) are sent to the API 100+ times. Each transmission is an exposure surface: if the API provider is compromised, cached prompts leak. The

For anyone running sensitive workspaces: until this lands, you can also audit what's being sent using ClawMoat's secret scanner to catch credentials that shouldn't be in workspace files at all:

```js
const fs = require('fs');
const { scanSecrets } = require('clawmoat');

const result = scanSecrets(fs.readFileSync('TOOLS.md', 'utf8'));
// Catches API keys, tokens, passwords before they enter the context window
```

+1 on merging this.
This comment was marked as spam.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 096c15f188
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b05e8301e4
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a6aa1f9f43
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: be242ae6bc
|
+1 |
|
If we inject workspace context on message 1 and then remove it on message 2, the system prompt changes, so the cache key changes too. That usually means a cache miss right after the first turn. When the prompt stays the same, every later message gets cache hits, and those are very cheap.

Removing injection of bootstrap files can also weaken instruction consistency, because later turns no longer include agents/soul/user unless the model reads them again.

Do any providers you use not have caching? That's the only situation where you would save money, but you would also get a worse experience for lack of the agents/soul/user/core files.
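The caching argument above can be put into numbers with a small cost model. This is only a sketch: the `CostInputs` fields, both function names, and the idea of a single cache-read discount multiplier are assumptions made for illustration, not any provider's actual pricing. It counts only the context tokens themselves and ignores cache-write surcharges, cache expiry, and the cache miss on the rest of the prompt that a changed system prompt would cause.

```typescript
// Hypothetical inputs; none of these names come from the PR or a provider API.
interface CostInputs {
  contextTokens: number;       // size of the injected workspace context
  turns: number;               // messages in the conversation
  pricePerToken: number;       // uncached input price, in any currency unit
  cacheReadMultiplier: number; // fraction of full price paid on a cache hit (0..1)
}

// "always" mode with a prompt cache: full price once, discounted price afterwards.
function alwaysInjectContextCost(i: CostInputs): number {
  if (i.turns <= 0) return 0;
  const fullPrice = i.contextTokens * i.pricePerToken;
  const cachedPrice = fullPrice * i.cacheReadMultiplier;
  return fullPrice + cachedPrice * (i.turns - 1);
}

// "first-message-only" mode: the context tokens are paid for exactly once.
function firstMessageOnlyContextCost(i: CostInputs): number {
  return i.turns <= 0 ? 0 : i.contextTokens * i.pricePerToken;
}
```

Setting `cacheReadMultiplier` to 1 models a provider without caching, which is the case the comment singles out as the one where the savings are largest.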
|
Unless I'm missing something, I think injecting core files only on the first message does not really make sense. If that context is removed after turn 1, then later turns behave almost the same as if those core files were never there. In that case you do not need a new flag: you can just keep the core files empty and get the same result. But that also removes the whole point of having AGENTS/SOUL/IDENTITY/etc. as persistent guidance, and that is a big part of what makes openclaw different from a generic AI chatbot.
|
@dsantoreis The refinement makes a lot of sense: keeping core bootstrap (agents/soul/user) always injected while gating heavier workspace context is the right balance. You get instruction stability + cache friendliness on the core files, and still save the bulk of tokens on workspace payload.

From a security perspective this is actually better than the original approach too: the core bootstrap files (AGENTS.md, SOUL.md, USER.md) rarely contain secrets, but workspace files and daily memory notes often do. Gating those reduces exposure surface while keeping the agent's identity/instructions consistent.

One thought: it might be worth exposing which files fall into "core" vs "workspace" in the config schema, so operators can customize the boundary. Some setups have lightweight workspaces but heavy bootstrap files, and vice versa.

Really clean evolution of the PR 👏
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2e1c869732
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0c5c10a4d3
…157)
Add a new `agents.defaults.contextInjection` config option that controls
when workspace context files (AGENTS.md, SOUL.md, USER.md, etc.) are
injected into the system prompt:
- "always" (default): current behavior, injects on every message
- "first-message-only": injects only on the first message of a session
This reduces token usage by ~93% over long conversations (saving ~35,600
tokens per message after the first). The agent can still access workspace
files via the `read` tool when needed.
Configuration:
```json
{
"agents": {
"defaults": {
"contextInjection": "first-message-only"
}
}
}
```
No breaking changes — default behavior is unchanged ("always").
Closes #9157
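As an illustration of how the two modes could be resolved from a config object, here is a minimal sketch. The `AgentConfig` shape and the functions `resolveContextInjectionMode` and `shouldInjectContext` are hypothetical names for this example and are not taken from the PR's code; the only details drawn from the commit message are the config path, the two mode strings, and the "always" default.

```typescript
type ContextInjectionMode = "always" | "first-message-only";

// Hypothetical config shape mirroring the JSON path agents.defaults.contextInjection.
interface AgentConfig {
  agents?: { defaults?: { contextInjection?: string } };
}

// Missing or unrecognized values fall back to "always", so existing setups keep
// their current behavior.
function resolveContextInjectionMode(config: AgentConfig): ContextInjectionMode {
  const raw = config.agents?.defaults?.contextInjection;
  return raw === "first-message-only" ? "first-message-only" : "always";
}

// Inject on every message in "always" mode, and only on the first one otherwise.
function shouldInjectContext(mode: ContextInjectionMode, isFirstMessage: boolean): boolean {
  return mode === "always" || isFirstMessage;
}
```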
…xt skip
Addresses review feedback from @bmendonca3. The previous approach used sessionFileExists as a proxy for 'not the first message', which is unreliable for pre-created, aborted, or repaired sessions. Now scans the session transcript for actual user messages before deciding to skip context injection. Uses fast substring check before any JSON parsing for performance.
Address review feedback: instead of just checking for 'role:user' substring in the transcript, parse the JSON entry and verify it has actual content. This prevents false positives from header-only transcripts, orphan entries left by aborted attempts, or repaired sessions.
- Only scan session JSONL when contextInjection is 'first-message-only', avoiding unnecessary O(n) I/O in the default 'always' mode.
- Treat JSON parse failures as skip (continue) rather than positive signal, preventing transient file corruption from disabling context injection.
Session JSONL records use nested structure (message.role/message.content) in the current format. Check both patterns to correctly detect user messages for the first-message-only context injection mode.
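The checks described in the commits above (substring fast path, parse each candidate line, require real content, tolerate corrupt lines, accept both flat and nested records) can be sketched as follows. This is an illustrative reconstruction, not the PR's code: `transcriptHasUserMessage`, `isRecord`, and `hasContent` are hypothetical names, the function takes the transcript as a string rather than reading a file, and the exact record schema is assumed from the commit text.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Assumption: content is either a non-blank string or a non-empty array of parts.
function hasContent(content: unknown): boolean {
  if (typeof content === "string") return content.trim().length > 0;
  if (Array.isArray(content)) return content.length > 0;
  return false;
}

// Returns true only if some JSONL line is a user message with actual content.
function transcriptHasUserMessage(jsonl: string): boolean {
  for (const line of jsonl.split("\n")) {
    // Fast path: a line without the string "user" cannot be a user entry.
    if (!line.includes('"user"')) continue;
    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // a corrupt line is not evidence of a user turn
    }
    if (!isRecord(entry)) continue;
    // Check the flat shape (role/content) and the nested shape (message.role/message.content).
    for (const candidate of [entry, entry.message]) {
      if (isRecord(candidate) && candidate.role === "user" && hasContent(candidate.content)) {
        return true;
      }
    }
  }
  return false;
}
```

Treating a parse failure as "keep looking" rather than "user message found" means a damaged transcript errs on the side of injecting context again.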
… intentionally skipped
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e6adedaafb
```ts
  contextInjectionMode === "first-message-only"
    ? await sessionHasUserMessages(params.sessionFile)
    : false;
const skipContextInjection = contextInjectionMode === "first-message-only" && hasUserMessages;
```
Base first-turn context gate on active session state
This `skipContextInjection` decision is made from raw JSONL contents before session normalization, so stale user entries can disable context injection even when the active history for this run has no usable user turn. A common case is a failed/aborted first attempt that leaves a trailing user line: `sessionHasUserMessages` returns true here, but later in the same function the session is repaired/reset and orphan user turns are dropped before prompting, so the first effective prompt can run without AGENTS/SOUL/USER bootstrap context. Fresh evidence in this commit is the ordering: the new gate is evaluated here before `prepareSessionManagerForRun` and orphan-user cleanup execute.
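One way to read this suggestion is to run the gate on the history that survives normalization rather than on the raw file. The sketch below only illustrates that ordering: the `Turn` type, `dropOrphanUserTurns`, and `shouldSkipContextInjection` are hypothetical, and the normalization rule (drop trailing user turns that never received a reply) is an assumption standing in for whatever the real repair step does.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical normalization: trailing user turns without a reply are treated
// as orphans left behind by a failed or aborted attempt.
function dropOrphanUserTurns(history: Turn[]): Turn[] {
  let end = history.length;
  while (end > 0 && history[end - 1]?.role === "user") end--;
  return history.slice(0, end);
}

// Decide on the normalized history, so stale raw entries cannot suppress injection.
function shouldSkipContextInjection(
  mode: "always" | "first-message-only",
  rawHistory: Turn[],
): boolean {
  if (mode !== "first-message-only") return false;
  const active = dropOrphanUserTurns(rawHistory);
  return active.some((t) => t.role === "user" && t.content.trim().length > 0);
}
```

With this ordering, a transcript containing only an orphaned user line still counts as a first message and receives the bootstrap context.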
- Shorten instructions in task template (8 lines → 1 line)
- Ultra-short MESSAGE dispatch (auto-approve hint + /ptopr command only)
- Timeout 3600s → 900s (15 min instead of 60 min)
- Reduces context window pressure by ~80%

Refs: openclaw/openclaw#56029, openclaw/openclaw#28072
Summary
Adds a new `agents.defaults.contextInjection` config option that controls when workspace context files (AGENTS.md, SOUL.md, USER.md, etc.) are injected into the system prompt.

Closes #9157
Problem
OpenClaw injects workspace files (~35,600 tokens) into the system prompt on every single message. Over a 100-message conversation, this wastes ~3.4M tokens and ~$1.51 in unnecessary cost. These files are static context that rarely changes during a conversation.
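As a rough check on the scale of these figures, the repeated-injection overhead can be computed directly. The function name `tokensSavedByInjectingOnce` is hypothetical, and the sketch assumes the context is sent once instead of on every message, ignoring prompt caching and any on-demand `read` calls.

```typescript
// Tokens avoided when context is injected once instead of on every message.
function tokensSavedByInjectingOnce(contextTokens: number, messages: number): number {
  return contextTokens * Math.max(messages - 1, 0);
}

// With the numbers quoted above: 35,600 tokens of context, 100 messages.
const saved = tokensSavedByInjectingOnce(35600, 100);
console.log(saved); // 3524400
```

Under these assumptions the result is about 3.5M tokens, the same order as the ~3.4M quoted above.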
Solution
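New config option with two modes:
- `"always"` (default): current behavior, injects on every message
- `"first-message-only"`: injects only on the first message of a session

Configuration
```json
{
  "agents": {
    "defaults": {
      "contextInjection": "first-message-only"
    }
  }
}
```
Changes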
- `src/config/zod-schema.agent-defaults.ts`: … `contextInjection` to schema
- `src/agents/pi-embedded-helpers/bootstrap.ts`: … `resolveContextInjection()` resolver
- `src/agents/pi-embedded-runner/run/attempt.ts`
- `src/config/schema.labels.ts`
- `src/config/schema.help.ts`

Design Decisions
- … `"always"`: no breaking changes, existing behavior preserved
- … `workspaceNotes` detection and `systemPromptReport`
- (… `fs.stat` pattern already used elsewhere in `attempt.ts`)
- … `read` workspace files on demand when using `"first-message-only"` mode

Impact