Codex long-running sessions should use semantic thread/bootstrap cache ownership

# Codex long-running sessions should use semantic thread/bootstrap cache ownership instead of hard native-token rotation

## Problem

Long-running Discord/Codex sessions can still become slow after #85978 when they do not enter or retain the context-engine `thread_bootstrap` path. The local Codex app-server startup guard can still log:

```text
codex app-server native transcript exceeded active token limit; starting a fresh thread
nativeTokens=116268, max 70000
```

That means OpenClaw clears the saved native Codex thread and starts cold, even though the selected model may have much more context headroom. The 70k number is an OpenClaw native-thread reuse guard, not the model context window.

## Current Understanding

- The Pi embedded runner normally reloads and injects bootstrap files on every turn, then relies on stable prompt-prefix/provider-cache behavior to amortize the repeated bytes.
- Codex app-server has a different efficient path: persistent native thread reuse. Context engines can return `contextProjection.mode = "thread_bootstrap"` so OpenClaw injects assembled history once for a stable epoch and then resumes the same native thread.
- Current lossless-claw `main` appears to be designed for this path: it returns `thread_bootstrap` with an epoch derived from summary context state rather than from ordinary fresh-tail growth.
- #85978 fixes one bug in this path by preventing the startup size guard from deleting a still-valid bootstrapped binding before compatibility is checked. It now also keeps stale/no-active-engine bindings safe.
- The broader architecture gap remains for legacy/non-bootstrap sessions, old lossless-claw builds, session-file rollover, blunt compaction invalidation, workspace bootstrap surfaces outside the projection contract, and insufficient rotation diagnostics.
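
The `thread_bootstrap` behavior described in this list (inject assembled history once per stable epoch, then resume the same native thread) can be sketched roughly as below. This is a minimal illustration, not OpenClaw's implementation: `planTurn`, `NativeBinding`, `TurnPlan`, the `per_turn` mode name, and the `startThread` callback are hypothetical. Only the `thread_bootstrap` mode name and the notion of an epoch come from this issue.

```typescript
// Hypothetical projection shape; "per_turn" is an invented placeholder for
// any mode that does not provide a stable epoch.
type ContextProjection =
  | { mode: "thread_bootstrap"; epoch: string }
  | { mode: "per_turn" };

// Hypothetical saved binding between a session and a native Codex thread.
interface NativeBinding {
  threadId: string;
  epoch: string;
}

interface TurnPlan {
  threadId: string;
  injectHistory: boolean; // true when assembled history must be injected
  binding: NativeBinding | undefined; // binding to persist for the next turn
}

function planTurn(
  projection: ContextProjection,
  saved: NativeBinding | undefined,
  startThread: () => string,
): TurnPlan {
  if (projection.mode !== "thread_bootstrap") {
    // No stable epoch: start cold and inject history for this turn.
    return { threadId: startThread(), injectHistory: true, binding: undefined };
  }
  if (saved !== undefined && saved.epoch === projection.epoch) {
    // Same epoch: resume the native thread without re-injecting history.
    return { threadId: saved.threadId, injectHistory: false, binding: saved };
  }
  // New or changed epoch: bootstrap a fresh thread once.
  const binding: NativeBinding = {
    threadId: startThread(),
    epoch: projection.epoch,
  };
  return { threadId: binding.threadId, injectHistory: true, binding };
}
```

Under this sketch, fresh-tail growth that leaves the epoch unchanged never triggers a re-injection; only an epoch change does.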

## Desired Architecture

For Codex app-server, native-thread rotation should be primarily semantic:

- rotate on `/new` or `/reset`;
- rotate on model/provider/auth/tool/MCP/app/environment incompatibility;
- rotate on context-engine policy or projection epoch/fingerprint change;
- rotate when a saved context-engine binding has no current active context engine;
- rotate when the app-server reports the native thread is gone or actually overflows;
- do not rotate solely because a compatible `thread_bootstrap` native rollout is above a hard-coded 70k guard.
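
The rotation rules above could be expressed as a single decision function. The following is a sketch under stated assumptions: every type, field name, and reason string here (`SavedBinding`, `TurnState`, `rotationReason`, `legacy_size_guard`, and so on) is hypothetical, and the many compatibility dimensions (model/provider/auth/tool/MCP/app/environment) are collapsed into one opaque `runtimeFingerprint` for brevity. The 70000 value is the guard reported in the log under Problem.

```typescript
type RotationReason =
  | "user_reset"
  | "runtime_incompatible"
  | "projection_changed"
  | "engine_missing"
  | "thread_gone"
  | "native_overflow"
  | "legacy_size_guard";

interface EngineIdentity {
  id: string;
  policyFingerprint: string;
  projectionEpoch: string;
}

interface SavedBinding {
  runtimeFingerprint: string; // model/provider/auth/tool/MCP/app/environment
  engine: EngineIdentity | null; // null for legacy or ownerless sessions
}

interface TurnState {
  userReset: boolean; // `/new` or `/reset`
  runtimeFingerprint: string;
  activeEngine: EngineIdentity | null;
  threadGone: boolean; // app-server reports the native thread is gone
  overflowed: boolean; // app-server reports an actual overflow
  nativeTokens: number;
}

const LEGACY_GUARD_TOKENS = 70000;

// Returns the reason to rotate, or null when the native thread can be reused.
function rotationReason(saved: SavedBinding, now: TurnState): RotationReason | null {
  if (now.userReset) return "user_reset";
  if (saved.runtimeFingerprint !== now.runtimeFingerprint) return "runtime_incompatible";
  if (saved.engine !== null) {
    const active = now.activeEngine;
    if (active === null) return "engine_missing";
    if (
      active.id !== saved.engine.id ||
      active.policyFingerprint !== saved.engine.policyFingerprint ||
      active.projectionEpoch !== saved.engine.projectionEpoch
    ) {
      return "projection_changed";
    }
  }
  if (now.threadGone) return "thread_gone";
  if (now.overflowed) return "native_overflow";
  // The size guard applies only to bindings without a context-engine owner.
  if (saved.engine === null && now.nativeTokens > LEGACY_GUARD_TOKENS) {
    return "legacy_size_guard";
  }
  return null;
}
```

The key design choice in this sketch is that the token count is consulted only for ownerless bindings, so a compatible engine-owned binding is never rotated on size alone.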

The native thread should be treated as a projection cache keyed by stable session/channel identity plus context-engine conversation/projection identity, not only by `sessionFile + ".codex-app-server.json"`.
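
As a rough illustration of such a key, the sketch below derives it from identity fields rather than from the session file path. All field names (`channelId`, `sessionKey`, `engineId`, `conversationId`, `projectionEpoch`) and the `projectionCacheKey` helper are hypothetical; the real identity fields would need to be chosen from what OpenClaw and the context engine actually expose.

```typescript
// Hypothetical identity for a native-thread projection cache entry.
interface ProjectionCacheIdentity {
  channelId: string;
  sessionKey: string;
  engineId: string;
  conversationId: string;
  projectionEpoch: string;
}

// Serialize as a JSON array so that delimiter characters inside a field
// cannot make two different identities collide.
function projectionCacheKey(id: ProjectionCacheIdentity): string {
  return JSON.stringify([
    id.channelId,
    id.sessionKey,
    id.engineId,
    id.conversationId,
    id.projectionEpoch,
  ]);
}
```

Because the session file path is not part of this key, a session-file rollover that keeps the same conversation identity and epoch would map to the same cache entry.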

## Proposed Follow-Ups

1. Add a Codex native-thread rotation reason enum and diagnostics block.
   Log current/saved engine id, policy fingerprint, epoch/fingerprint, dynamic tools, MCP/app/environment/auth/model fingerprints, token source, native/session tokens, and whether mirrored history was projected.

2. Make the native reuse guard model/config/context-owner aware.
   Keep strict clearing for legacy or ownerless sessions, but treat compatible context-engine `thread_bootstrap` sessions as semantically owned by the context engine unless the app-server actually rejects the turn.

3. Preserve or migrate Codex bindings across LCM/session-file rollover when conversation identity and projection epoch remain compatible.

4. Add explicit workspace bootstrap fingerprints to Codex thread binding/diagnostics.
   Track stable inherited developer instructions, turn-scoped collaboration instructions, prompt context contributors, and native project-doc loading separately.

5. Revisit compaction invalidation.
   Successful context-engine-owned compaction currently clears Codex bindings. If compaction does not change projection epoch/fingerprint, native reuse may be preservable.
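
For follow-up 1, a diagnostics block might look something like the sketch below. It is only an assumption about shape: the reason names, the `RotationDiagnostics` fields, and the log line format are hypothetical, and a string-literal union stands in for the proposed enum. Only a subset of the fields listed in follow-up 1 is shown.

```typescript
const ROTATION_REASONS = [
  "user_reset",
  "runtime_incompatible",
  "projection_changed",
  "engine_missing",
  "thread_gone",
  "native_overflow",
  "legacy_size_guard",
] as const;

type RotationReason = (typeof ROTATION_REASONS)[number];

// Hypothetical diagnostics record emitted whenever a native thread rotates.
interface RotationDiagnostics {
  reason: RotationReason;
  savedEngineId: string | null;
  currentEngineId: string | null;
  savedEpoch: string | null;
  currentEpoch: string | null;
  tokenSource: string;
  nativeTokens: number;
  sessionTokens: number;
  mirroredHistoryProjected: boolean;
}

// Render the record as one key=value log line with a fixed field order.
function formatRotationLog(d: RotationDiagnostics): string {
  const show = (value: string | null): string => (value === null ? "none" : value);
  const fields = [
    `reason=${d.reason}`,
    `savedEngineId=${show(d.savedEngineId)}`,
    `currentEngineId=${show(d.currentEngineId)}`,
    `savedEpoch=${show(d.savedEpoch)}`,
    `currentEpoch=${show(d.currentEpoch)}`,
    `tokenSource=${d.tokenSource}`,
    `nativeTokens=${d.nativeTokens}`,
    `sessionTokens=${d.sessionTokens}`,
    `mirroredHistoryProjected=${d.mirroredHistoryProjected}`,
  ];
  return `codex app-server native thread rotated: ${fields.join(" ")}`;
}
```

Logging saved and current values side by side is what would let a reader tell an epoch change apart from a size-guard rotation without reading the code.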

## Acceptance Criteria

- A long-running single-agent Codex/Discord session with stable lossless-claw `thread_bootstrap` epoch can exceed 70k native rollout tokens without cold-starting every turn.
- When a turn cold-starts, the logs state exactly which semantic or runtime compatibility dimension forced it.
- If LCM compacts, rotates, or rewrites the transcript, OpenClaw either preserves the compatible Codex binding or logs the exact epoch/policy/session identity reason it could not.
- `/doctor` or equivalent status output distinguishes model/provider context overflow from OpenClaw native-thread reuse guard rotation.

## Related

- #85975
- #85978
- Architecture note: `/Volumes/LEXAR/Codex/openclaw-codex-long-session-architecture-20260524.md`


Codex long-running sessions should use semantic thread/bootstrap cache ownership #86023
