perf: add contextInjection option to reduce workspace token waste by ~93% #28072

Closed
dsantoreis wants to merge 12 commits into openclaw:main from dsantoreis:perf/workspace-injection-optimization

Conversation

@dsantoreis

Contributor

Summary

Adds a new agents.defaults.contextInjection config option that controls when workspace context files (AGENTS.md, SOUL.md, USER.md, etc.) are injected into the system prompt.

Closes #9157

Problem

OpenClaw injects the workspace files (~35,600 tokens) into the system prompt on every message. Over a 100-message conversation this re-sends roughly 3.4M tokens, about $1.51 in avoidable cost, even though the files are static context that rarely changes during a conversation.
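
The arithmetic behind this claim can be sketched as follows. This is an illustrative calculation only: it counts injected workspace tokens and nothing else (no conversation history, no prompt caching), and `injectedTokens` is a hypothetical helper, not code from the PR. Under these assumptions the result is slightly higher than the ~3.4M quoted above, so the PR's figure appears to use a baseline that the description does not spell out.

```typescript
// Illustrative arithmetic; 35,600 tokens and 100 messages come from the PR description.
type ContextInjectionMode = "always" | "first-message-only";

// Tokens spent on injected workspace files across a session of `messages` turns.
function injectedTokens(
  mode: ContextInjectionMode,
  tokensPerInjection: number,
  messages: number,
): number {
  if (messages <= 0) return 0;
  return mode === "always" ? tokensPerInjection * messages : tokensPerInjection;
}

const always = injectedTokens("always", 35_600, 100); // 3,560,000
const firstOnly = injectedTokens("first-message-only", 35_600, 100); // 35,600
const saved = always - firstOnly; // 3,524,400

console.log({ always, firstOnly, saved });
```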

Solution

New config option with two modes:

| Mode | Behavior | Token Impact |
| --- | --- | --- |
| "always" (default) | Current behavior: inject on every message | ~35,600 tokens/message |
| "first-message-only" | Inject only on the first message of a session | ~35,600 tokens once, then 0 |

Configuration

{
  "agents": {
    "defaults": {
      "contextInjection": "first-message-only"
    }
  }
}

Changes

| File | Change |
| --- | --- |
| src/config/zod-schema.agent-defaults.ts | Add contextInjection to schema |
| src/agents/pi-embedded-helpers/bootstrap.ts | Add resolveContextInjection() resolver |
| src/agents/pi-embedded-runner/run/attempt.ts | Conditionally skip context file injection |
| src/config/schema.labels.ts | Add UI label |
| src/config/schema.help.ts | Add help text |
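
As a rough illustration of what the resolver in the table above might do, here is a minimal sketch. It assumes the option lives at `agents.defaults.contextInjection` (as in the Configuration section) and falls back to "always". The `AgentConfigSketch` shape and the function names are hypothetical; the PR's actual schema uses zod and may differ.

```typescript
type ContextInjectionMode = "always" | "first-message-only";

// Hypothetical config shape; only the contextInjection key is taken from the PR.
interface AgentConfigSketch {
  agents?: { defaults?: { contextInjection?: unknown } };
}

function isContextInjectionMode(value: unknown): value is ContextInjectionMode {
  return value === "always" || value === "first-message-only";
}

// Falls back to "always" when the option is unset or unrecognized,
// which keeps the existing behavior as the default.
function resolveContextInjectionSketch(
  config: AgentConfigSketch | undefined,
): ContextInjectionMode {
  const raw = config?.agents?.defaults?.contextInjection;
  return isContextInjectionMode(raw) ? raw : "always";
}

console.log(resolveContextInjectionSketch({}));
console.log(
  resolveContextInjectionSketch({
    agents: { defaults: { contextInjection: "first-message-only" } },
  }),
);
```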

Design Decisions

  • Default is "always" — no breaking changes, existing behavior preserved
  • Bootstrap metadata still loaded even when skipping injection — needed for workspaceNotes detection and systemPromptReport
  • Session file existence used as the signal (same fs.stat pattern already used elsewhere in attempt.ts)
  • Agent can still read workspace files on demand when using "first-message-only" mode
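
The session-file signal described in the list above could look roughly like the sketch below. It is synchronous for brevity and takes the stat function as a parameter so that it has no runtime dependency; the PR itself refers to an `fs.stat` pattern in attempt.ts, and the names `StatLike`, `sessionFileExists`, and `shouldSkipInjection` are assumptions made for illustration.

```typescript
type ContextInjectionMode = "always" | "first-message-only";

// A stat-like function that throws when the path does not exist
// (for example Node's fs.statSync). Injected to keep the sketch dependency-free.
type StatLike = (path: string) => unknown;

function sessionFileExists(path: string, stat: StatLike): boolean {
  try {
    stat(path);
    return true;
  } catch {
    // Any stat failure is treated as "no session yet", so context is still injected.
    return false;
  }
}

// Skip injection only when the mode opts in AND a session file is already present.
function shouldSkipInjection(mode: ContextInjectionMode, fileExists: boolean): boolean {
  return mode === "first-message-only" && fileExists;
}
```

Treating every stat failure as "file missing" errs on the side of injecting context, which is the safer failure mode for this option.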

Impact

  • 93.5% token reduction over conversations when enabled
  • ~$1.51 savings per 100-message session
  • Zero risk — opt-in only, default unchanged
  • 5 files changed, 41 insertions, 1 deletion

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: XS labels Feb 27, 2026
@greptile-apps

greptile-apps Bot commented Feb 27, 2026

Contributor

Greptile Summary

Adds agents.defaults.contextInjection config option to control when workspace context files (AGENTS.md, SOUL.md, USER.md, etc.) are injected into the system prompt. The feature uses session file existence as a signal to determine first vs subsequent messages.

Key changes:

  • New config option with two modes: "always" (default, preserves current behavior) and "first-message-only" (inject only on first message)
  • Implementation in src/agents/pi-embedded-runner/run/attempt.ts checks if session file exists before resolving bootstrap context, then conditionally clears contextFiles array when skipping injection
  • Bootstrap metadata is preserved even when skipping injection (needed for workspaceNotes detection and systemPromptReport)
  • Clean separation of concerns: schema validation, resolver function, and conditional logic are all properly isolated

Benefits:

  • 93% token reduction over long conversations when using "first-message-only" mode
  • Zero risk to existing deployments (opt-in, safe default)
  • Agent can still read workspace files on demand using the read tool

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • Clean implementation of an opt-in performance optimization feature. Uses safe default ("always") that preserves existing behavior. The logic is straightforward: checks session file existence to determine first vs subsequent messages. Properly handles edge cases with error handling in the fs.stat check. No breaking changes, and the feature is well-documented.
  • No files require special attention

Last reviewed commit: ed98b12

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ed98b12d1e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 57e915c649

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@bmendonca3

Contributor

The token-savings idea here is good, but sessionFileExists is a pretty rough proxy for “not the first message.” A resumed/pre-created/aborted session can already have a session file before the first meaningful turn, which would skip context injection too early. I’d consider keying this off transcript contents or a real “has completed first turn” marker instead of file existence alone.

@bmendonca3

Contributor

I’m not convinced sessionFileExists is a safe proxy for “this is not the first message”. Pre-created session files, aborted runs, or empty repaired sessions can all make the file exist before any real assistant turn was written, which would suppress context injection too early. I’d prefer a regression that proves this only disables injection once the session actually contains prior conversational state, not just a file on disk.
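
One way to frame the regression being asked for here is to list session states and compare the two candidate signals. The sketch below is hypothetical: `SessionSnapshot` and `completedUserTurns` are invented for illustration and do not correspond to types in the repository.

```typescript
interface SessionSnapshot {
  label: string;
  fileExists: boolean;
  completedUserTurns: number; // hypothetical counter derived from the transcript
}

// The original proxy: any file on disk counts as "not the first message".
const skipByFileExistence = (s: SessionSnapshot): boolean => s.fileExists;

// The stricter signal: skip only once prior conversational state exists.
const skipByPriorTurns = (s: SessionSnapshot): boolean => s.completedUserTurns > 0;

const cases: SessionSnapshot[] = [
  { label: "brand-new session", fileExists: false, completedUserTurns: 0 },
  { label: "pre-created empty file", fileExists: true, completedUserTurns: 0 },
  { label: "aborted first run", fileExists: true, completedUserTurns: 0 },
  { label: "ongoing conversation", fileExists: true, completedUserTurns: 3 },
];

// The cases where the two signals disagree are the ones a regression test should pin down.
const disagreements = cases
  .filter((s) => skipByFileExistence(s) !== skipByPriorTurns(s))
  .map((s) => s.label);

console.log(disagreements); // ["pre-created empty file", "aborted first run"]
```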

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fa9c563fcc

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
@darfaz

darfaz commented Feb 27, 2026

This is a great optimization, and there's a security angle worth considering too.

With workspace files injected on every message, a long conversation means those files (which often contain credentials, API keys, and personal data from AGENTS.md/TOOLS.md/USER.md) are sent to the API 100+ times. Each transmission is an exposure surface — if the API provider is compromised, cached prompts leak.

The first-message-only approach doesn't just save tokens — it reduces the number of times sensitive workspace context crosses the wire by 99%.

For anyone running sensitive workspaces: until this lands, you can also audit what's being sent using ClawMoat's secret scanner to catch credentials that shouldn't be in workspace files at all:

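const fs = require('fs');
const { scanSecrets } = require('clawmoat');
// Scan the workspace file before it is ever injected into the context window
const result = scanSecrets(fs.readFileSync('TOOLS.md', 'utf8'));
// Catches API keys, tokens, passwords before they enter the context window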

+1 on merging this.

@dsantoreis

This comment was marked as spam.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 096c15f188

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b05e8301e4

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated
Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a6aa1f9f43

Comment thread src/agents/pi-embedded-runner/run/attempt.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: be242ae6bc

Comment thread src/agents/pi-embedded-runner/run/attempt.ts
@dsantoreis

This comment was marked as spam.

@Svarto

Svarto commented Feb 28, 2026

+1

@dsantoreis

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@gumadeiras

Member

if we inject workspace context on message 1 and then remove it on message 2, the system prompt changes, so the cache key changes too. that usually means a cache miss right after the first turn. whenever you send more messages you will be getting cache hits and those are very cheap $

removing injection of bootstrap files can also weaken instruction consistency, because later turns no longer include agents/soul/user unless the model reads them again

do any providers you use not have caching? that's the only situation where you would save $, but then you would also get a worse experience for lack of agents/soul/user/core files
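
The caching argument above can be made concrete with a toy cost model. Everything here is an assumption for illustration: `cacheReadMultiplier` is a made-up discount (not any provider's actual rate), cache-write premiums and conversation-history tokens are ignored, and costs are expressed as "full-price token equivalents" rather than dollars. It also says nothing about the instruction-consistency cost raised in the same comment.

```typescript
interface CostModel {
  tokensPerInjection: number; // workspace context size, e.g. ~35,600 from the PR
  messages: number;
  cacheReadMultiplier: number; // assumed fraction of full price paid on a cache hit
}

// "always" mode with no prompt caching: every turn pays full price.
function alwaysUncached(m: CostModel): number {
  return m.tokensPerInjection * m.messages;
}

// "always" mode with caching: first turn at full price, later turns at the cache-read rate.
function alwaysCached(m: CostModel): number {
  if (m.messages <= 0) return 0;
  const hits = m.messages - 1;
  return Math.round(m.tokensPerInjection * (1 + hits * m.cacheReadMultiplier));
}

// "first-message-only" mode: the context is paid for exactly once.
function firstMessageOnly(m: CostModel): number {
  return m.messages > 0 ? m.tokensPerInjection : 0;
}

const model: CostModel = { tokensPerInjection: 35_600, messages: 100, cacheReadMultiplier: 0.1 };
console.log({
  alwaysUncached: alwaysUncached(model), // 3,560,000
  alwaysCached: alwaysCached(model), // 388,040
  firstMessageOnly: firstMessageOnly(model), // 35,600
});
```

Under these assumptions most of the saving relative to the uncached baseline already comes from caching, which is the point being made: the remaining gap is what would have to justify dropping the persistent guidance.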

@dsantoreis

This comment was marked as spam.

@gumadeiras

Member

unless i'm missing something, i think injecting core files only on the first message does not really make sense

if that context is removed after turn 1, then later turns behave almost the same as if those core files were never there. in that case, you do not need a new flag, you can just keep core files empty and get the same result; but that also removes the whole point of having AGENTS/SOUL/IDENTITY/etc as persistent guidance, and that is a big part of what makes openclaw different from a generic ai chatbot

@darfaz

darfaz commented Mar 4, 2026

@dsantoreis The refinement makes a lot of sense — keeping core bootstrap (agents/soul/user) always injected while gating heavier workspace context is the right balance. You get instruction stability + cache friendliness on the core files, and still save the bulk of tokens on workspace payload.

From a security perspective this is actually better than the original approach too: the core bootstrap files (AGENTS.md, SOUL.md, USER.md) rarely contain secrets, but workspace files and daily memory notes often do. Gating those reduces exposure surface while keeping the agent's identity/instructions consistent.

One thought: it might be worth exposing which files fall into "core" vs "workspace" in the config schema, so operators can customize the boundary. Some setups have lightweight workspaces but heavy bootstrap files, and vice versa.

Really clean evolution of the PR 👏
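
A configurable core/workspace boundary like the one suggested above might be sketched as follows. The `InjectionBoundary` shape, the `selectFilesForTurn` helper, and the example file list are hypothetical; nothing in the PR defines such a setting.

```typescript
// Hypothetical config: files named in `core` stay in every prompt; all others are gated.
interface InjectionBoundary {
  core: string[];
}

interface ContextFile {
  name: string;
  content: string;
}

function selectFilesForTurn(
  files: ContextFile[],
  boundary: InjectionBoundary,
  isFirstTurn: boolean,
): ContextFile[] {
  // First turn: inject everything, as in the current behavior.
  if (isFirstTurn) return files;
  // Later turns: keep only the operator-defined core set (case-insensitive match).
  const core = new Set(boundary.core.map((n) => n.toLowerCase()));
  return files.filter((f) => core.has(f.name.toLowerCase()));
}

const files: ContextFile[] = [
  { name: "AGENTS.md", content: "..." },
  { name: "SOUL.md", content: "..." },
  { name: "USER.md", content: "..." },
  { name: "TOOLS.md", content: "..." },
];
const boundary: InjectionBoundary = { core: ["AGENTS.md", "SOUL.md", "USER.md"] };

console.log(selectFilesForTurn(files, boundary, false).map((f) => f.name));
```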

@dsantoreis

This comment was marked as spam.

@cgdusek

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2e1c869732

Comment thread src/agents/pi-embedded-runner/run/attempt.ts
@dsantoreis

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@dsantoreis

This comment was marked as spam.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0c5c10a4d3

Comment thread package.json Outdated
@dsantoreis

This comment was marked as spam.

Daniel Reis and others added 11 commits March 5, 2026 08:20
…157)

Add a new `agents.defaults.contextInjection` config option that controls
when workspace context files (AGENTS.md, SOUL.md, USER.md, etc.) are
injected into the system prompt:

- "always" (default): current behavior, injects on every message
- "first-message-only": injects only on the first message of a session

This reduces token usage by ~93% over long conversations (saving ~35,600
tokens per message after the first). The agent can still access workspace
files via the `read` tool when needed.

Configuration:
```json
{
  "agents": {
    "defaults": {
      "contextInjection": "first-message-only"
    }
  }
}
```

No breaking changes — default behavior is unchanged ("always").

Closes #9157
…xt skip

Addresses review feedback from @bmendonca3. The previous approach used
sessionFileExists as a proxy for 'not the first message', which is
unreliable for pre-created, aborted, or repaired sessions.

Now scans the session transcript for actual user messages before
deciding to skip context injection. Uses fast substring check
before any JSON parsing for performance.

Address review feedback: instead of just checking for 'role:user' substring
in the transcript, parse the JSON entry and verify it has actual content.
This prevents false positives from header-only transcripts, orphan entries
left by aborted attempts, or repaired sessions.

- Only scan session JSONL when contextInjection is 'first-message-only',
  avoiding unnecessary O(n) I/O in the default 'always' mode.
- Treat JSON parse failures as skip (continue) rather than positive signal,
  preventing transient file corruption from disabling context injection.

Session JSONL records use nested structure (message.role/message.content)
in the current format. Check both patterns to correctly detect user messages
for the first-message-only context injection mode.
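
Taken together, the commit messages above describe a transcript scan along the following lines. This is a sketch under stated assumptions, not the PR's code: it works on the JSONL text directly instead of reading a file, `transcriptHasUserMessage` is a hypothetical name (the PR calls its function `sessionHasUserMessages`), and the exact record shapes are inferred from the messages rather than confirmed.

```typescript
function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Content counts only if it is a non-blank string or a non-empty array of parts.
function hasContent(content: unknown): boolean {
  if (typeof content === "string") return content.trim().length > 0;
  return Array.isArray(content) && content.length > 0;
}

function isUserMessage(v: unknown): boolean {
  return isRecord(v) && v.role === "user" && hasContent(v.content);
}

// Returns true when the JSONL transcript holds at least one user message with content.
function transcriptHasUserMessage(jsonl: string): boolean {
  for (const line of jsonl.split("\n")) {
    // Cheap substring pre-filter before paying for JSON.parse.
    if (!line.includes('"user"')) continue;
    let entry: unknown;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // a corrupt line is not evidence of a prior turn
    }
    // Accept both the flat shape (role/content) and the nested one (message.role/message.content).
    if (isUserMessage(entry)) return true;
    if (isRecord(entry) && isUserMessage(entry.message)) return true;
  }
  return false;
}
```

A header-only transcript, an entry with empty content, or a truncated line all return false here, so context would still be injected on the next turn.
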
@dsantoreis

This comment was marked as spam.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e6adedaafb

Comment on lines +722 to +725
contextInjectionMode === "first-message-only"
? await sessionHasUserMessages(params.sessionFile)
: false;
const skipContextInjection = contextInjectionMode === "first-message-only" && hasUserMessages;

P2 Badge Base first-turn context gate on active session state

This skipContextInjection decision is made from raw JSONL contents before session normalization, so stale user entries can disable context injection even when the active history for this run has no usable user turn. A common case is a failed/aborted first attempt that leaves a trailing user line: sessionHasUserMessages returns true here, but later in the same function the session is repaired/reset and orphan user turns are dropped before prompting, so the first effective prompt can run without AGENTS/SOUL/USER bootstrap context. Fresh evidence in this commit is the ordering: the new gate is evaluated here before prepareSessionManagerForRun and orphan-user cleanup execute.
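
The ordering problem described in this review comment can be illustrated with a small sketch that normalizes the history first and only then evaluates the gate. `Turn`, `dropOrphanUserTurns`, and `shouldSkipInjection` are hypothetical stand-ins; the real repair logic (`prepareSessionManagerForRun` and the orphan-user cleanup) is not shown in this thread and likely does more.

```typescript
type ContextInjectionMode = "always" | "first-message-only";

interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical stand-in for session repair: drop trailing user turns that never got a reply.
function dropOrphanUserTurns(history: Turn[]): Turn[] {
  let end = history.length;
  while (end > 0 && history[end - 1]?.role === "user") end--;
  return history.slice(0, end);
}

// Evaluate the gate on the normalized history rather than on the raw transcript.
function shouldSkipInjection(mode: ContextInjectionMode, rawHistory: Turn[]): boolean {
  if (mode !== "first-message-only") return false;
  const active = dropOrphanUserTurns(rawHistory);
  return active.some((t) => t.role === "user");
}

// An aborted first attempt leaves one orphan user line: context must still be injected.
const aborted: Turn[] = [{ role: "user", content: "hello" }];
console.log(shouldSkipInjection("first-message-only", aborted)); // false
```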


This comment was marked as spam.

asistent-alex added a commit to asistent-alex/openclaw-night-shift that referenced this pull request Apr 24, 2026
- Shorten instructions in task template (8 lines → 1 line)
- Ultra-short MESSAGE dispatch (auto-approve hint + /ptopr command only)
- Timeout 3600s → 900s (15 min instead of 60 min)
- Reduces context window pressure by ~80%

Refs: openclaw/openclaw#56029, openclaw/openclaw#28072

Labels

agents Agent runtime and tooling size: S

Development

Successfully merging this pull request may close these issues.

Performance: Workspace file injection wastes 93.5% of token budget

6 participants