Codex context-engine projection caps LCM output to 24k chars, hiding full-fit context

## Summary

Native Codex app-server runs can silently expose only a small slice of context-engine output because `projectContextEngineAssemblyForCodex()` renders assembled context into a quoted prompt block capped at 24,000 chars.

This is a real context-delivery regression for context engines such as Lossless Claw/LCM after switching an agent from the Pi embedded route to the native Codex route. LCM can assemble a large, full-fit frontier, but Codex receives only the capped projection. The visible `/status` or UI percent then reflects the smaller Codex runtime prompt usage, which makes healthy LCM state look like it lost context or overcompacted.

## Confirmed Evidence

Environment:

- OpenClaw `2026.5.10-beta.5`
- runtime: native `codex` harness / `openai-codex/gpt-5.5`
- session key: `agent:main:main`
- context window denominator: `258000`

Locally observed run after the beta switch:

- LCM active conversation remained healthy:
  - `conversation_id=1872`
  - `session_key=agent:main:main`
  - `27811` messages
  - `14241378` raw message tokens
  - frontier: `247` context items, about `186764` tokens
- LCM assembled a full-fit frontier before the turn:
  - `contextItems=215`
  - `selectionMode=full-fit`
  - `estimatedTokens=179134`
  - `rawMessageCount=93`
  - `summaryCount=122`
- The same Codex runtime turn reported only:
  - `input=590`
  - `output=6`
  - `cacheRead=76288`
  - `totalTokens=76884`
  - visible status about `30%`

There was no LCM compaction at that turn:

- `shouldCompact=false`
- reason: `below-context-threshold-floor`

So the observed drop was neither LCM DB loss nor overcompaction. It occurs at the native Codex projection/accounting boundary.
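To make the gap concrete, the figures above can be turned into percentages of the context window. This is only arithmetic on the logged values; the helper name `percentOfWindow` is made up for illustration and is not part of OpenClaw.

```typescript
// Values copied from the observed run above.
const contextWindow = 258_000;
const lcmEstimatedTokens = 179_134; // LCM full-fit assembly
const codexTotalTokens = 76_884; // Codex runtime totalTokens

// Hypothetical helper: share of the window, rounded to one decimal place.
function percentOfWindow(tokens: number, window: number): number {
  return Math.round((tokens / window) * 1000) / 10;
}

const lcmPercent = percentOfWindow(lcmEstimatedTokens, contextWindow);
const codexPercent = percentOfWindow(codexTotalTokens, contextWindow);
// lcmPercent is 69.4, codexPercent is 29.8
```

The assembled context alone accounts for roughly 69% of the window, while the Codex-reported total matches the visible status of about 30%. Anything else that contributed to an earlier, higher status figure is not captured in these numbers.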

## Code Path

The native Codex route assembles context-engine output here:

- `extensions/codex/src/app-server/run-attempt.ts`
  - calls `assembleHarnessContextEngine(...)`
  - then calls `projectContextEngineAssemblyForCodex(...)`

The projection cap is here:

- `extensions/codex/src/app-server/context-engine-projection.ts`
  - `MAX_RENDERED_CONTEXT_CHARS = 24_000`
  - `MAX_TEXT_PART_CHARS = 6_000`
  - `truncateText(renderedContext, MAX_RENDERED_CONTEXT_CHARS)`

This means a context engine may select well over a hundred thousand tokens (about 179k in the run above), but native Codex only sees a rendered block of at most 24,000 characters.
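A rough sketch of the effect, assuming the cap behaves like a plain prefix cut. `truncateTextSketch` is a hypothetical stand-in (the real `truncateText` may append a marker or cut differently), and the 4 characters per token ratio is a common rule of thumb, not a measured value.

```typescript
const MAX_RENDERED_CONTEXT_CHARS = 24_000; // constant named in the issue

// Hypothetical stand-in for the real truncateText helper.
function truncateTextSketch(text: string, maxChars: number): string {
  return text.length <= maxChars ? text : text.slice(0, maxChars);
}

// Assumption: ~4 characters per token, used only to size the example input.
const ASSUMED_CHARS_PER_TOKEN = 4;
const estimatedTokens = 179_134; // LCM estimatedTokens from the observed run

// Simulate a rendered context of the estimated size.
const rendered = "x".repeat(estimatedTokens * ASSUMED_CHARS_PER_TOKEN);
const projected = truncateTextSketch(rendered, MAX_RENDERED_CONTEXT_CHARS);

// Fraction of the rendered context that survives the cap.
const keptFraction = projected.length / rendered.length;
```

Under these assumptions only a few percent of the assembled context would reach the model through this projection, and nothing in the result tells the caller that a cut happened.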

By contrast, the Pi embedded route feeds assembled context-engine messages through the Pi session flow rather than this small Codex text projection:

- `src/agents/pi-embedded-runner/run/attempt.ts`
  - calls `assembleAttemptContextEngine(...)`
  - applies assembled messages to the active Pi session

## Reproduction Shape

1. Use a context-engine plugin with a large existing frontier, such as Lossless Claw.
2. Run the same long-lived session through the native Codex app-server runtime.
3. Observe LCM/context-engine assemble logs reporting a large full-fit selection.
4. Observe Codex runtime usage/status reporting a much smaller prompt.
5. Inspect `extensions/codex/src/app-server/context-engine-projection.ts` and confirm the 24k-character projection cap.

## Expected Behavior

When a context engine assembles context within the model budget, native Codex should either:

- pass a model-visible projection sized from the actual model/context budget, or
- explicitly report that the host projected/truncated the context-engine output and how much was dropped.

The model-visible context should not be silently capped to 24k chars while the context engine believes it delivered a full-fit frontier.
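For the second option, a status report could show the three figures side by side instead of a single percentage. The following is a minimal sketch; the `ContextStatusFigures` shape and `formatContextStatus` are hypothetical and do not correspond to any existing OpenClaw API.

```typescript
// Hypothetical shape; field names are illustrative only.
interface ContextStatusFigures {
  frontierTokens: number; // what the context engine assembled
  projectedTokens: number; // what the host rendered for the model
  providerPromptTokens: number; // provider-reported prompt/cache usage
  contextWindowTokens: number;
}

function formatContextStatus(f: ContextStatusFigures): string {
  const pct = (n: number): string =>
    `${Math.round((n / f.contextWindowTokens) * 100)}%`;
  const dropped = Math.max(0, f.frontierTokens - f.projectedTokens);
  const lines = [
    `context-engine frontier: ${f.frontierTokens} tokens (${pct(f.frontierTokens)})`,
    `projected to model: ${f.projectedTokens} tokens (${pct(f.projectedTokens)})`,
    `provider-reported usage: ${f.providerPromptTokens} tokens (${pct(f.providerPromptTokens)})`,
  ];
  // Only mention a drop when the host actually projected less than was assembled.
  if (dropped > 0) {
    lines.push(`host projection dropped ~${dropped} tokens`);
  }
  return lines.join("\n");
}

// Example with the observed run; projectedTokens is an assumed value
// (24,000 chars at ~4 chars per token), not a logged one.
const statusText = formatContextStatus({
  frontierTokens: 179_134,
  projectedTokens: 6_000,
  providerPromptTokens: 76_884,
  contextWindowTokens: 258_000,
});
```

With a report like this, a healthy frontier and a small projection would be visibly different things rather than one ambiguous percentage.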

## Actual Behavior

The context engine assembles a large full-fit frontier, but native Codex receives a small rendered quoted-context block capped at 24k chars. Status then reports the smaller Codex runtime usage, making it appear as if context dropped from about 80% to about 30%.

## Fix Direction

Recommended upstream fix:

1. Replace the fixed `24_000` rendered-context cap with a budget derived from the active model/context budget.
2. Reserve space for:
   - developer instructions / workspace bootstrap
   - current user prompt
   - tool schemas
   - model output margin
3. Return and record projection stats:
   - assembled message count
   - rendered chars before cap
   - rendered chars after cap
   - truncation boolean
   - approximate dropped chars/tokens
4. Include those stats in trajectory/status diagnostics so `/status` can distinguish:
   - context-engine frontier available
   - context-engine projected to model
   - provider-reported prompt/cache usage

Even a conservative first patch that makes the cap configurable and records truncation metadata would prevent this from being misdiagnosed as LCM data loss.
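Steps 1 to 3 could look roughly like the sketch below. Every name here (`ProjectionBudgetInput`, `deriveMaxRenderedChars`, `projectWithStats`, and the reserve fields) is hypothetical, and the chars-per-token ratio is an assumed heuristic rather than the tokenizer's real behavior.

```typescript
// Hypothetical inputs for sizing the projection from the model budget.
interface ProjectionBudgetInput {
  contextWindowTokens: number;
  reservedInstructionTokens: number; // developer instructions / workspace bootstrap
  reservedPromptTokens: number; // current user prompt
  reservedToolSchemaTokens: number;
  reservedOutputTokens: number; // model output margin
  charsPerToken: number; // assumed heuristic, e.g. 4
}

interface ProjectionStats {
  assembledMessageCount: number;
  renderedCharsBeforeCap: number;
  renderedCharsAfterCap: number;
  truncated: boolean;
  droppedChars: number;
  approxDroppedTokens: number;
}

interface ProjectionResult {
  text: string;
  stats: ProjectionStats;
}

// Derive the character cap from what is left after the reserves.
function deriveMaxRenderedChars(input: ProjectionBudgetInput): number {
  const reserved =
    input.reservedInstructionTokens +
    input.reservedPromptTokens +
    input.reservedToolSchemaTokens +
    input.reservedOutputTokens;
  const availableTokens = Math.max(0, input.contextWindowTokens - reserved);
  return Math.floor(availableTokens * input.charsPerToken);
}

// Apply the cap and record what was dropped instead of cutting silently.
function projectWithStats(
  renderedContext: string,
  assembledMessageCount: number,
  maxChars: number,
  charsPerToken: number,
): ProjectionResult {
  const text =
    renderedContext.length > maxChars
      ? renderedContext.slice(0, maxChars)
      : renderedContext;
  const droppedChars = renderedContext.length - text.length;
  return {
    text,
    stats: {
      assembledMessageCount,
      renderedCharsBeforeCap: renderedContext.length,
      renderedCharsAfterCap: text.length,
      truncated: droppedChars > 0,
      droppedChars,
      approxDroppedTokens: Math.ceil(droppedChars / charsPerToken),
    },
  };
}
```

Returning the stats alongside the projected text keeps the decision about where to surface them (trajectory, `/status`, logs) separate from the projection itself. The reserve sizes would need to come from real measurements; the sketch only shows where they plug in.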


