Codex app-server rotates context-engine bootstrap threads after large first turns

## Summary

On the current `main` branch and the latest stable release I verified (`v2026.5.22`, published 2026-05-24), Codex app-server sessions can repeatedly lose their warmed native thread after a large context-engine bootstrap turn. The log line observed on the release build is:

```text
codex app-server native transcript exceeded active token limit; starting a fresh thread
```

This is neither Discord dropping messages nor the model context limit. OpenClaw is clearing the saved Codex app-server native thread binding before the context-engine compatibility path can decide whether that thread is still valid.

## Impact

For long-running Codex-backed agents with `contextEngine` projection mode `thread_bootstrap`, a large bootstrap (first) native turn can exceed the local native active-token guard of 70,000 tokens. Once that happens, each later turn can cold-start the native Codex thread instead of using `thread/resume`, which repeats the bootstrap/projection work and loses the warmed app-server fast path.

That matches the token/latency symptom we are seeing in long Discord sessions: the Gateway still routes the turn, but the Codex-side native wrapper repeatedly starts fresh threads and burns tokens/CPU.

## Root Cause

`extensions/codex/src/app-server/run-attempt.ts` calls `rotateOversizedCodexAppServerStartupBinding(...)` immediately after reading the startup binding. That helper reads the native Codex rollout/session token stats and clears the binding when the latest usage is at or above `CODEX_APP_SERVER_NATIVE_THREAD_MAX_TOKENS` (`70_000`).

For context-engine `thread_bootstrap`, that ordering is wrong: the bootstrap turn is expected to be large, and later turns should be able to reuse the same native thread as long as the stored context-engine projection metadata still matches the current engine/policy/epoch. The later context-engine reuse logic already knows how to decide whether the binding is compatible, but it never gets the chance because the startup guard deletes the binding first.
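
The ordering problem can be illustrated with a simplified model. This is only a sketch, not the code in `run-attempt.ts`: the `StartupBinding` shape and the `applyStartupGuard` function are hypothetical stand-ins, and only the constant name and its value come from the description above.

```typescript
// Simplified model of the startup guard described above.
// Hypothetical: StartupBinding and applyStartupGuard are illustration names.
const CODEX_APP_SERVER_NATIVE_THREAD_MAX_TOKENS = 70_000;

interface StartupBinding {
  threadId: string;
  // Assumed shape: present only for bindings created through a context engine.
  contextEngine?: { projection: { mode: string } };
}

// Returns the binding to resume, or undefined when the guard clears it.
function applyStartupGuard(
  binding: StartupBinding | undefined,
  latestUsageTokens: number,
): StartupBinding | undefined {
  if (!binding) return undefined;
  // Size is checked first, so context-engine compatibility logic
  // never sees an oversized binding.
  if (latestUsageTokens >= CODEX_APP_SERVER_NATIVE_THREAD_MAX_TOKENS) {
    return undefined;
  }
  return binding;
}

const bootstrapBinding: StartupBinding = {
  threadId: "thread-1",
  contextEngine: { projection: { mode: "thread_bootstrap" } },
};

// An 86k-token bootstrap rollout is cleared even though its projection
// metadata might still be perfectly valid.
const afterGuard = applyStartupGuard(bootstrapBinding, 86_000);
```

In this model `afterGuard` is `undefined`, so any later compatibility check has nothing left to evaluate and the turn starts a fresh thread.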

## Expected Behavior

A saved Codex native thread binding with `contextEngine.projection.mode === "thread_bootstrap"` should survive the startup native transcript size guard. Compatibility should then be decided by the context-engine projection/epoch checks and the existing per-turn overflow recovery path. If the epoch or policy changes, OpenClaw should still rotate and reproject.
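
As a rough sketch of what "decided by the context-engine projection/epoch checks" means, the comparison might look like the following. The `ProjectionMetadata` field names (`engineId`, `policyFingerprint`, `epoch`) and the `isProjectionCompatible` helper are assumptions for illustration; the actual stored metadata may be shaped differently.

```typescript
// Hypothetical metadata shape; field names are assumptions, not the real schema.
interface ProjectionMetadata {
  mode: string;
  engineId: string;
  policyFingerprint: string;
  epoch: number;
}

// A stored bootstrap binding is reusable only if every field still matches
// the current engine state. Any mismatch means rotate and reproject.
function isProjectionCompatible(
  stored: ProjectionMetadata,
  current: ProjectionMetadata,
): boolean {
  return (
    stored.mode === "thread_bootstrap" &&
    stored.mode === current.mode &&
    stored.engineId === current.engineId &&
    stored.policyFingerprint === current.policyFingerprint &&
    stored.epoch === current.epoch
  );
}

const stored: ProjectionMetadata = {
  mode: "thread_bootstrap",
  engineId: "engine-a",
  policyFingerprint: "policy-1",
  epoch: 3,
};

// Same engine, policy, and epoch: the native thread can be resumed.
const sameEpoch = isProjectionCompatible(stored, { ...stored });
// Epoch advanced: the binding must be rotated and the context reprojected.
const newEpoch = isProjectionCompatible(stored, { ...stored, epoch: 4 });
```

Note that transcript size does not appear in this check; size-related failures are left to the existing per-turn overflow recovery path.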

## Proposed Fix

Defer the startup native token/byte guard for context-engine `thread_bootstrap` bindings. Keep the existing guard behavior for non-context-engine and non-bootstrap native sessions.
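
A minimal sketch of the deferral, assuming a binding shape like the one below. `NativeBinding`, `shouldDeferStartupSizeGuard`, and `resolveStartupBinding` are hypothetical names used for illustration, and the sketch models only the token check, not the byte limit; the real patch may be structured differently.

```typescript
const CODEX_APP_SERVER_NATIVE_THREAD_MAX_TOKENS = 70_000;

// Hypothetical binding shape for illustration only.
interface NativeBinding {
  threadId: string;
  contextEngine?: { projection: { mode: string } };
}

// Only context-engine thread_bootstrap bindings skip the startup size guard.
function shouldDeferStartupSizeGuard(binding: NativeBinding): boolean {
  return binding.contextEngine?.projection.mode === "thread_bootstrap";
}

function resolveStartupBinding(
  binding: NativeBinding | undefined,
  latestUsageTokens: number,
): NativeBinding | undefined {
  if (!binding) return undefined;
  // Deferred: later projection/epoch checks decide whether to reuse the thread.
  if (shouldDeferStartupSizeGuard(binding)) return binding;
  // Unchanged behavior for non-context-engine and non-bootstrap sessions.
  return latestUsageTokens >= CODEX_APP_SERVER_NATIVE_THREAD_MAX_TOKENS
    ? undefined
    : binding;
}

const bootstrap: NativeBinding = {
  threadId: "thread-bootstrap",
  contextEngine: { projection: { mode: "thread_bootstrap" } },
};
const plain: NativeBinding = { threadId: "thread-plain" };

const keptBootstrap = resolveStartupBinding(bootstrap, 86_000); // binding survives
const rotatedPlain = resolveStartupBinding(plain, 86_000); // undefined, rotated as before
```

Keeping the predicate separate from the guard makes the exemption easy to test on its own and leaves the existing over-budget path untouched for every other binding.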

I have a focused regression test and patch in progress that verifies:

- an 86k-token bootstrap rollout still resumes with `thread/resume`
- the following turn sends only the current user prompt, not the assembled bootstrap context again
- existing non-bootstrap native over-budget rotation tests still pass

## Validation So Far

Local validation ran from the Lexar-backed worktree:

```text
/Volumes/LEXAR/repos/worktrees/openclaw-codex-native-thread-reuse
```

Focused checks:

```sh
OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.context-engine.test.ts --run
OPENCLAW_VITEST_MAX_WORKERS=1 node scripts/run-vitest.mjs extensions/codex/src/app-server/run-attempt.test.ts --run -t "starts a fresh Codex thread before resume when the native rollout is over budget|uses current rollout token usage before cumulative usage|clears native rollouts at the configured byte limit"
pnpm exec oxfmt --check --threads=1 extensions/codex/src/app-server/run-attempt.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts
git diff --check
```

Parallel review also checked Pi runtime risk. The proposed change is limited to the Codex app-server startup binding guard and should not change Pi embedded-runner compaction semantics.

