[Bug]: Codex native compaction can leave post-compaction context usage stale or unknown #66263

@cyrusaf

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

After native Codex app-server compaction, OpenClaw can lack a reliable post-compaction token count. When that happens, status/context reporting should explicitly treat usage as unknown or bounded, not revive stale pre-compaction usage.

Parent tracker: #66251

Steps to reproduce

  1. Start a session using the native Codex harness.
  2. Grow the session context enough that context usage is visible.
  3. Trigger native Codex compaction.
  4. Inspect session status/context usage after compaction.
  5. Observe whether OpenClaw reports old/pre-compaction usage as if it were the current fresh context usage.

Expected behavior

After Codex native compaction:

  • if the app-server reports a reliable post-compaction token count, OpenClaw records and displays it
  • if the app-server does not report tokensAfter, OpenClaw marks context usage as unknown/stale rather than pretending the old count is fresh
  • transcript fallback logic respects a compaction boundary and does not resurrect pre-compaction usage
  • status output avoids misleading values such as extreme or stale context percentages after compaction
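A minimal sketch of the first two bullets, assuming the compaction result exposes an optional tokensAfter field. Only the tokensAfter name comes from this issue; the CompactionResult and ContextUsage types and the usageAfterCompaction helper are hypothetical and are not OpenClaw's actual API.

```typescript
// Hypothetical shapes: only the `tokensAfter` field name comes from this issue.
interface CompactionResult {
  tokensAfter?: number | null;
}

// Usage is either a known count or an explicit unknown state.
type ContextUsage =
  | { kind: "known"; tokens: number; source: "compaction" }
  | { kind: "unknown"; reason: "compaction-without-token-count" };

// Map a native compaction result to the usage state stored on the session.
// A missing, non-finite, or negative count is treated as unknown instead of
// falling back to whatever was recorded before compaction.
function usageAfterCompaction(result: CompactionResult): ContextUsage {
  const t = result.tokensAfter;
  if (typeof t === "number" && Number.isFinite(t) && t >= 0) {
    return { kind: "known", tokens: t, source: "compaction" };
  }
  return { kind: "unknown", reason: "compaction-without-token-count" };
}
```

Modelling the unknown case as its own variant, rather than as 0 or the previous count, forces every consumer to handle it explicitly.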

Actual behavior

Native Codex compaction may complete without a reliable post-compaction token snapshot.

In that case, OpenClaw can fall back to stale transcript-derived or cumulative usage data and display it as current context usage. This makes compaction appear ineffective or can create confusing context reports after a successful compaction.

OpenClaw version

2026.4.10 / 2026.4.11 reports, plus current main investigation

Operating system

macOS and Linux reports

Install method

npm/global CLI and local source investigation

Model

codex/gpt-5.4

Provider / routing chain

OpenClaw -> bundled codex plugin -> Codex app-server

Additional provider/model setup details

NOT_ENOUGH_INFO

Logs, screenshots, and evidence

Scope boundary

This issue is specifically about native Codex app-server compaction and post-compaction context accounting.

Cumulative app-server token projection is related but partly tracked separately in #64669. This issue should focus on compaction boundaries, tokensAfter, stale/unknown state, and status reporting after native compaction.

Acceptance criteria

  • Codex native compaction captures tokensAfter when the app-server provides it.
  • If tokensAfter is unavailable, OpenClaw records post-compaction context usage as unknown/stale rather than fresh.
  • A compaction marker or equivalent boundary prevents transcript fallback from using pre-compaction usage as current usage.
  • Status/session reporting preserves the unknown state instead of fabricating a precise current percentage.
  • Tests cover both known and unknown post-compaction token counts.
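The compaction-boundary criterion could look roughly like the sketch below. The transcript entry shapes and the usageFromTranscript helper are assumptions made for illustration; OpenClaw's real transcript format may differ.

```typescript
// Hypothetical transcript entry shapes, for illustration only.
type TranscriptEntry =
  | { type: "usage"; contextTokens: number }
  | { type: "compaction"; tokensAfter?: number }
  | { type: "message"; text: string };

type DerivedUsage =
  | { kind: "known"; tokens: number }
  | { kind: "unknown" };

// Scan from the newest entry to the oldest. The first usage entry found is the
// freshest one. Reaching a compaction marker first means every older usage
// entry describes the pre-compaction context, so the scan stops there.
function usageFromTranscript(entries: readonly TranscriptEntry[]): DerivedUsage {
  for (const entry of [...entries].reverse()) {
    if (entry.type === "usage") {
      return { kind: "known", tokens: entry.contextTokens };
    }
    if (entry.type === "compaction") {
      return typeof entry.tokensAfter === "number"
        ? { kind: "known", tokens: entry.tokensAfter }
        : { kind: "unknown" };
    }
  }
  return { kind: "unknown" };
}
```

With this shape, a transcript ending in a compaction marker that has no tokensAfter yields an unknown state until a new usage entry is appended after the marker.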

Suggested test coverage

  • Codex app-server compaction returns tokensAfter when available.
  • Codex app-server compaction records an unknown context state when tokensAfter is absent.
  • Session/status reporting does not display stale pre-compaction usage as fresh.
  • Transcript-derived usage ignores data before the native compaction boundary.
  • Existing cumulative token handling from #64669 (fix: [codex-harness] avoid treating cumulative app-server usage as current context) remains compatible.
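For the status-reporting case, a test could target a formatter along these lines. This is a sketch under stated assumptions: formatContextStatus, the ReportedUsage type, and the output strings are all hypothetical, and clamping the percentage to 100 is one possible way to avoid extreme values, not a confirmed design.

```typescript
// Hypothetical usage state passed to status reporting.
type ReportedUsage =
  | { kind: "known"; tokens: number }
  | { kind: "unknown" };

// Render a status line. The unknown state is preserved as text and never
// converted into a percentage. Known values are clamped to 100%.
function formatContextStatus(usage: ReportedUsage, windowTokens: number): string {
  if (usage.kind === "unknown") {
    return "context: unknown (no usage reported since compaction)";
  }
  if (!(windowTokens > 0)) {
    // Without a valid window size a percentage would be meaningless.
    return `context: ${usage.tokens} tokens`;
  }
  const pct = Math.min(100, Math.round((usage.tokens / windowTokens) * 100));
  return `context: ${usage.tokens}/${windowTokens} tokens (${pct}%)`;
}
```

A test for the unknown case would then assert that the status line contains no percentage at all, rather than comparing against a specific stale number.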

Related

Impact and severity

Affected: Users running OpenClaw with the bundled Codex harness.

Severity: High for long-running Codex sessions that rely on compaction/status reporting.

Frequency: Observed during Codex harness stabilization investigation.

Consequence: Users can see misleading context usage after compaction, making it difficult to tell whether compaction succeeded or whether a session is safe to continue.

Additional information

The local investigation has in-progress changes in this area. This issue is intended to capture the observed failure mode and the acceptance criteria for a focused fix.

Metadata

Labels

  • P2 - Normal backlog priority with limited blast radius.
  • clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
