Skip to content

feat(codex): run context-engine lifecycle in app-server harness#70809

Merged
steipete merged 9 commits into
openclaw:mainfrom
jalehman:lizzo-09598fd5-codex-context-engine-port
Apr 24, 2026
Merged

feat(codex): run context-engine lifecycle in app-server harness#70809
steipete merged 9 commits into
openclaw:mainfrom
jalehman:lizzo-09598fd5-codex-context-engine-port

Conversation

@jalehman

Copy link
Copy Markdown
Contributor

Summary

  • Problem: the bundled non-ACP Codex app-server harness accepted contextEngine on EmbeddedRunAttemptParams, but did not actually run the context-engine lifecycle around Codex turns or compactions.
  • Why it matters: context-engine plugins such as lossless-claw only behaved correctly on the embedded PI path; Codex harness sessions missed bootstrap, assembly, after-turn ingestion, maintenance, and engine-owned compaction semantics.
  • What changed: adds a harness-neutral context-engine lifecycle module, wires Codex run attempts through bootstrap/assemble/finalize/maintenance, projects assembled context into Codex developer instructions + prompt text, and lets engine-owned compaction be primary while preserving Codex native compaction as best-effort audit detail.
  • What did NOT change (scope boundary): this does not replace Codex's native thread model, does not change ACP behavior, and does not remove the embedded PI context-engine path.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

N/A — this is feature parity / lifecycle plumbing for the bundled Codex harness.

  • Root cause: Codex app-server harness integration predated the generalized context-engine lifecycle extraction.
  • Missing detection / guardrail: there was no Codex-specific regression coverage asserting context-engine bootstrap, assembly, after-turn ingestion, or engine-owned compaction.
  • Contributing context (if known): embedded PI owned the original lifecycle implementation, so the behavior was coupled to that runner path.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • extensions/codex/src/app-server/context-engine-projection.test.ts
    • extensions/codex/src/app-server/run-attempt.context-engine.test.ts
    • extensions/codex/src/app-server/compact.test.ts
    • src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts
    • src/agents/pi-embedded-runner/compact.hooks.test.ts
  • Scenario the test should lock in: Codex app-server turns call context-engine bootstrap/assemble/finalize/maintenance; engine-owned compaction is honored; PI lifecycle behavior remains intact through the shared helpers.
  • Why this is the smallest reliable guardrail: the bug/gap is lifecycle wiring and prompt projection, so mocked app-server/context-engine tests can assert exact calls without requiring a live Codex session.
  • Existing test that already covers this (if any): existing PI compaction hook coverage covered the embedded path, but not Codex harness lifecycle integration.
  • If no new test is added, why not: N/A — new tests are included.

User-visible / Behavior Changes

Codex app-server harness sessions can now honor configured context-engine plugins for prompt assembly, post-turn ingestion/maintenance, and engine-owned compaction. Users running context-engine plugins should see Codex behave more like the embedded PI path.

Diagram (if applicable)

Before:
Codex turn -> native thread prompt/history only -> mirror transcript

After:
Codex turn
  -> contextEngine.bootstrap
  -> contextEngine.assemble
  -> project assembled context into developer instructions + prompt
  -> Codex native thread turn
  -> mirror transcript
  -> contextEngine.afterTurn / ingestBatch / ingest
  -> contextEngine.maintain

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (Yes)
  • If any Yes, explain risk + mitigation: Codex harness now passes mirrored session history and runtime context through the already-configured context-engine plugin lifecycle, matching the existing embedded PI behavior. This is gated by the configured context-engine slot and excludes the legacy engine via isActiveHarnessContextEngine.

Repro + Verification

Environment

  • OS: macOS / Darwin
  • Runtime/container: local pnpm workspace
  • Model/provider: Codex app-server harness path, mocked in tests
  • Integration/channel (if any): N/A
  • Relevant config (redacted): context-engine plugin slot enabled in test harnesses

Steps

  1. Configure a context-engine plugin while using the bundled non-ACP Codex app-server harness.
  2. Start a Codex harness turn on an existing mirrored session.
  3. Observe context-engine lifecycle calls around prompt assembly, transcript finalization, and compaction.

Expected

  • Codex harness runs context-engine bootstrap/assemble before the turn.
  • Assembled context is included in Codex developer instructions / prompt text.
  • After the turn, mirrored transcript data is sent to afterTurn or ingest fallback and maintenance runs for successful turns.
  • Context-engine-owned compaction is primary when info.ownsCompaction === true.

Actual

  • Prior behavior skipped the context-engine lifecycle in the Codex harness path.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Validation run locally:

  • pnpm test -- --run extensions/codex/src/app-server/context-engine-projection.test.ts extensions/codex/src/app-server/run-attempt.context-engine.test.ts extensions/codex/src/app-server/compact.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.context-engine.test.ts
  • pnpm test -- --run src/agents/pi-embedded-runner/compact.hooks.test.ts src/agents/harness/selection.test.ts
  • pnpm -s tsgo:test:extensions
  • pnpm -s tsgo:test:src
  • pnpm -s lint:core
  • pnpm -s lint:extensions
  • pnpm -s lint:plugins:no-extension-src-imports
  • pnpm -s lint:extensions:no-src-outside-plugin-sdk
  • pnpm -s lint:extensions:no-relative-outside-package
  • pnpm -s lint:docs

Human Verification (required)

What I personally verified (not just CI), and how:

  • Verified scenarios: reviewed the branch diff against main; ran focused Codex context-engine tests, PI compaction/harness tests, TypeScript checks, extension/core lints, plugin boundary lints, and docs lint.
  • Edge cases checked: duplicate trailing prompt projection, empty assembled history fallback, engine-owned compaction with native Codex compaction audit details, afterTurn fallback to ingestBatch, skipped maintenance on failed prompts, and PI compaction hook behavior after helper extraction.
  • What I did not verify: I have not manually tested this in a live Codex app-server session.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: projecting assembled history into a single Codex prompt could duplicate the current user prompt.
    • Mitigation: projection drops an exact duplicate trailing current user prompt and has regression coverage.
  • Risk: context-engine failures could break Codex turns.
    • Mitigation: bootstrap/assemble failures warn and fall back; post-turn failures mark finalization unsuccessful without retrying as a fresh prompt.
  • Risk: compaction ownership between Codex native compaction and context-engine compaction could be ambiguous.
    • Mitigation: ownsCompaction makes context-engine compaction primary; native Codex compaction remains best-effort and is recorded separately in result details.

@aisle-research-bot

aisle-research-bot Bot commented Apr 23, 2026

Copy link
Copy Markdown

🔒 Aisle Security Analysis

We found 5 potential security issue(s) in this PR:

# Severity Title
1 🟠 High Context-engine systemPromptAddition is appended to Codex developer instructions (prompt-injection privilege escalation)
2 🟠 High Tool result text blocks can leak sensitive tool output into Codex prompt context
3 🟡 Medium Potential DoS via unbounded context rendering before truncation in Codex context projection
4 🟡 Medium Sensitive data exposure via unredacted error object logging in context-engine assembly failure path
5 🟡 Medium Sensitive data exposure via unredacted String(error) logging in harness context-engine lifecycle
1. 🟠 Context-engine `systemPromptAddition` is appended to Codex developer instructions (prompt-injection privilege escalation)
Property Value
Severity High
CWE CWE-269
Location extensions/codex/src/app-server/run-attempt.ts:206-216

Description

The Codex harness projects ContextEngine.assemble() output into the Codex request. While the assembled message history is framed as quoted reference data in the user prompt, the context-engine-provided systemPromptAddition is instead promoted into developer instructions.

  • systemPromptAddition is produced by the active ContextEngine implementation.
  • A context engine can derive systemPromptAddition from user-controlled data (e.g., prior user messages, retrieved documents, tool outputs), intentionally or accidentally.
  • The code appends this string into developerInstructions, which are higher-privilege than user content for most LLM harnesses.
  • No trust boundary, allowlist, or sanitization is enforced before promotion.

This enables instruction-confusion / prompt-injection style privilege escalation: an attacker who can influence context-engine outputs (directly via untrusted engine selection, or indirectly via a trusted engine that incorporates user content) can inject directives that override safety policies, tool-use constraints, or operational guidance.

Vulnerable code:

const projection = projectContextEngineAssemblyForCodex({
  assembledMessages: assembled.messages,
  originalHistoryMessages: historyMessages,
  prompt: params.prompt,
  systemPromptAddition: assembled.systemPromptAddition,
});
...
developerInstructions = joinPresentSections(
  baseDeveloperInstructions,
  projection.developerInstructionAddition,
);

And the projection promotes it:

...(params.systemPromptAddition?.trim()
  ? { developerInstructionAddition: params.systemPromptAddition.trim() }
  : {}),

Recommendation

Establish and enforce a trust boundary for systemPromptAddition before promoting it into developer instructions.

Options (pick one consistent policy):

  1. Do not promote context-engine output to developer instructions. Instead, include it in the user prompt in a clearly delimited quoted block (like the rest of the context), or map it to a lower-privilege channel.

  2. Allowlist trusted engines only (e.g., built-in engines) to provide systemPromptAddition, and ignore it for third-party/untrusted engines.

  3. Constrain and sanitize: only permit a structured, machine-validated format (e.g., JSON with known keys), reject anything containing instruction-like patterns, and cap length.

Example mitigation (allowlist by engine id):

const TRUSTED_SYSTEM_PROMPT_ENGINES = new Set(["legacy", "built_in_memory"]);
const trustedAddition =
  activeContextEngine && TRUSTED_SYSTEM_PROMPT_ENGINES.has(activeContextEngine.info.id)
    ? assembled.systemPromptAddition
    : undefined;

const projection = projectContextEngineAssemblyForCodex({
  assembledMessages: assembled.messages,
  originalHistoryMessages: historyMessages,
  prompt: params.prompt,
  systemPromptAddition: trustedAddition,
});

If you must support third-party engines, consider adding a separate field like developerInstructionAdditionTrusted: boolean in the assemble result and enforce it in the host, rather than trusting engine-provided strings implicitly.

2. 🟠 Tool result text blocks can leak sensitive tool output into Codex prompt context
Property Value
Severity High
CWE CWE-200
Location extensions/codex/src/app-server/context-engine-projection.ts:76-91

Description

projectContextEngineAssemblyForCodex attempts to omit tool payloads when projecting context into a Codex prompt, but only redacts content parts with type: "toolCall"/"tool_use" and type: "toolResult"/"tool_result".

However, elsewhere in the codebase tool results are represented as role: "toolResult" messages whose content is an array of normal text blocks (not type: "toolResult"). For example, transformTransportMessages() synthesizes missing tool results with content: [{ type: "text", text: "No result provided" }].

As a result:

  • Any real tool output that is stored as role: "toolResult" + type: "text" (or content as a raw string) will be included verbatim in the assembled Codex prompt context.
  • This can disclose secrets or sensitive data returned by tools (e.g., environment variables, API keys, file contents) to the model despite the intent to omit tool payloads.

Vulnerable code (redaction only happens per-part, not per-message role):

if (typeof message.content === "string") {
  return truncateText(message.content.trim(), MAX_TEXT_PART_CHARS);
}
...
return message.content
  .map((part: unknown) => renderMessagePart(part))

Recommendation

Redact tool payloads based on message role as well as per-part type.

For example, treat any role: "toolResult" (and potentially role: "tool" / role: "function" depending on your upstream schemas) as sensitive and replace its body with a placeholder regardless of how content is encoded.

function renderMessageBody(message: AgentMessage): string {// Omit all tool results by role to prevent schema-dependent leaks
  if (message.role === "toolResult" || message.role === "tool") {
    return "[tool result omitted]";
  }

  if (!hasMessageContent(message)) return "";

  if (typeof message.content === "string") {
    return truncateText(message.content.trim(), MAX_TEXT_PART_CHARS);
  }
  if (!Array.isArray(message.content)) return "[non-text content omitted]";

  return message.content
    .map((part) => renderMessagePart(part))
    .filter((value): value is string => value.length > 0)
    .join("\n")
    .trim();
}

Additionally:

  • Consider reusing existing helpers (e.g., src/chat/tool-content.ts) to detect tool blocks beyond just toolCall/toolResult.
  • Add tests ensuring tool result messages with type:"text" and content: string are redacted.
3. 🟡 Potential DoS via unbounded context rendering before truncation in Codex context projection
Property Value
Severity Medium
CWE CWE-400
Location extensions/codex/src/app-server/context-engine-projection.ts:30-76

Description

projectContextEngineAssemblyForCodex renders all context-engine assembled messages into a single string and only afterwards truncates it to MAX_RENDERED_CONTEXT_CHARS.

  • Inputs: assembledMessages comes from contextEngine.assemble(...) and is derived from session history (user-controlled over time) and/or context-engine output (plugin-controlled).
  • The code builds a potentially very large intermediate string via .map(...).join("\n\n") in renderMessagesForCodexContext(...).
  • Truncation (truncateText(renderedContext, MAX_RENDERED_CONTEXT_CHARS)) happens after that large string is fully allocated, so a malicious/buggy context engine or extremely large session history can cause high CPU/memory pressure (OOM) before truncation is applied.

Vulnerable code path:

const renderedContext = renderMessagesForCodexContext(contextMessages);
...
truncateText(renderedContext, MAX_RENDERED_CONTEXT_CHARS)

Even though each text part is truncated to MAX_TEXT_PART_CHARS, the total number of messages/parts is not bounded here, so total rendered size can still grow without bound.

Recommendation

Apply global size limits during rendering to avoid building large intermediate strings.

Options:

  • Enforce a maximum number of messages/parts rendered.
  • Track total characters while appending and stop once the budget is hit.

Example (incremental render with early cutoff):

function renderMessagesForCodexContextBounded(messages: AgentMessage[], maxChars: number): string {
  const chunks: string[] = [];
  let used = 0;

  for (const message of messages) {
    const text = renderMessageBody(message);
    if (!text) continue;
    const chunk = `[${message.role}]\n${text}`;
    const separator = chunks.length ? "\n\n" : "";

    const remaining = maxChars - used;
    if (remaining <= 0) break;

    const toAdd = (separator + chunk).slice(0, remaining);
    chunks.push(toAdd);
    used += toAdd.length;

    if (used >= maxChars) break;
  }

  const rendered = chunks.join("");
  return rendered.length >= maxChars ? `${rendered}\n[truncated]` : rendered;
}

Additionally, consider rejecting/ignoring context-engine assemblies whose total message count or total text size exceed a safe threshold, and ensure contextEngine.assemble output is bounded by token/size limits.

4. 🟡 Sensitive data exposure via unredacted error object logging in context-engine assembly failure path
Property Value
Severity Medium
CWE CWE-532
Location extensions/codex/src/app-server/run-attempt.ts:218-221

Description

The Codex app-server attempt runner logs a raw error object when context-engine assembly fails:

  • assembleErr is logged as structured metadata ({ error: assembleErr }).
  • The logging backend writes JSON logs to disk and may render rich objects (including nested properties) to console/file logs.
  • Elsewhere in the codebase, formatErrorMessage() is used specifically to redact sensitive tokens (redactSensitiveText) before logging.

If a thrown error includes sensitive data in its message or properties (e.g., API keys, Authorization headers, prompt/transcript fragments, tool payloads), this path may persist it in logs without redaction.

Vulnerable code:

} catch (assembleErr) {
  embeddedAgentLog.warn("context engine assemble failed; using Codex baseline prompt", {
    error: assembleErr,
  });
}

Recommendation

Avoid logging raw error objects on paths that may include user/tool data. Log a redacted, message-only representation using formatErrorMessage() (which applies redactSensitiveText).

Example fix:

} catch (assembleErr) {
  embeddedAgentLog.warn("context engine assemble failed; using Codex baseline prompt", {
    error: formatErrorMessage(assembleErr),
  });
}

If stack traces are required for debugging, ensure they are redacted before logging (e.g., redactSensitiveText(err.stack ?? "")) and consider gating behind a debug-only flag.

5. 🟡 Sensitive data exposure via unredacted String(error) logging in harness context-engine lifecycle
Property Value
Severity Medium
CWE CWE-532
Location src/agents/harness/context-engine-lifecycle.ts:124-127

Description

The new harness context-engine lifecycle helpers log errors using String(err):

  • String(error) includes the full error message for Error objects (often containing request context, IDs, tool parameters, or embedded secrets).
  • Unlike formatErrorMessage() from src/infra/errors.ts, String(err) does not apply token redaction (redactSensitiveText).

This can leak credentials or private prompt/transcript content into logs when a context engine implementation throws errors that embed those values.

Vulnerable code:

} catch (bootstrapErr) {
  params.warn(`context engine bootstrap failed: ${String(bootstrapErr)}`);
}
...
} catch (afterTurnErr) {
  params.warn(`context engine afterTurn failed: ${String(afterTurnErr)}`);
}
...
} catch (ingestErr) {
  params.warn(`context engine ingest failed: ${String(ingestErr)}`);
}

Recommendation

Use the shared formatErrorMessage() helper to ensure best-effort secret redaction before logging.

Example fix:

import { formatErrorMessage } from "../../infra/errors.js";// ...
} catch (afterTurnErr) {
  postTurnFinalizationSucceeded = false;
  params.warn(`context engine afterTurn failed: ${formatErrorMessage(afterTurnErr)}`);
}

Apply the same change to bootstrapErr and ingestErr logging sites.


Analyzed PR: #70809 at commit 3e2f9f3

Last updated on: 2026-04-24T03:57:00Z

@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation agents Agent runtime and tooling extensions: codex size: XL maintainer Maintainer-authored PR labels Apr 23, 2026
@greptile-apps

greptile-apps Bot commented Apr 23, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR wires the OpenClaw context-engine lifecycle (bootstrap, assemble, finalize, maintenance) into the bundled non-ACP Codex app-server harness, bringing it to feature parity with the embedded PI path. It extracts shared helpers into a harness-neutral context-engine-lifecycle.ts module, adds a Codex-specific context projection layer, and routes engine-owned compaction as primary while preserving native Codex compaction as a best-effort audit step.

  • P1 (compact.ts): runHarnessContextEngineMaintenance is not guarded by a try-catch before compactCodexNativeThread. A maintenance failure propagates and skips native Codex compaction entirely, contrary to the "best-effort" intent stated in the plan doc and PR description.

Confidence Score: 4/5

Safe to merge after addressing the maintenance error-handling gap in compact.ts; all other paths are well-implemented.

One P1 defect: a maintenance failure in the engine-owns-compaction path prevents native Codex compaction from running, leaving the Codex thread without compaction in an edge case. Everything else — lifecycle ordering, projection logic, bootstrap gating, post-turn finalization, and re-export chain — looks correct and consistent with the PI embedded path.

extensions/codex/src/app-server/compact.ts — maintenance error handling before compactCodexNativeThread

Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/codex/src/app-server/compact.ts
Line: 38-58

Comment:
**Maintenance failure silently skips native Codex compaction**

If `runHarnessContextEngineMaintenance` throws, the `await compactCodexNativeThread(...)` call on line 58 is never reached. The design intent (and the plan doc) is that native Codex compaction is "best-effort" and should run regardless of whether context-engine maintenance succeeded — skipping it on a maintenance failure can let the internal Codex thread grow unbounded. Wrapping maintenance in a try-catch here would preserve the best-effort semantic.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/codex/src/app-server/compact.ts
Line: 59-72

Comment:
**Native compaction diagnostics lost when primary result is absent**

When `primary.result` is `undefined` (e.g. compaction ran but returned no result object, or `primary.compacted` is false), `nativeResult` is discarded and the caller has no visibility into whether native Codex compaction ran. Consider including the native result in the return even when `primary.result` is absent.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "feat: port codex harness context engine" | Re-trigger Greptile

Comment thread extensions/codex/src/app-server/compact.ts
Comment thread extensions/codex/src/app-server/compact.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6b9903355d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread extensions/codex/src/app-server/run-attempt.ts Outdated
Comment thread extensions/codex/src/app-server/compact.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 53ddfba8d9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

agentDir,
contextTokenBudget,
});
const harnessResult = await maybeCompactAgentHarnessSession({

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Dispose resolved context engine on harness compaction errors

compactEmbeddedPiSession now resolves a context engine before delegating to maybeCompactAgentHarnessSession, but this call is not protected by a try/finally. If harness selection or harness compact throws (for example, a Codex harness failure before it returns an EmbeddedPiCompactResult), execution exits before any contextEngine.dispose?.() call, leaking engine-owned resources across repeated compactions. Please ensure the resolved engine is disposed on both success and throw paths.

Useful? React with 👍 / 👎.

prompt: string;
systemPromptAddition?: string;
}): CodexContextProjection {
const prompt = params.prompt.trim();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve user prompt bytes when projecting assembled context

projectContextEngineAssemblyForCodex trims params.prompt before both deduplication and final prompt construction, which mutates user input whenever the context-engine path is active. This can change whitespace-sensitive prompts (e.g., indented code, deliberate trailing spaces, or newline-only continuation prompts) and can also trigger false duplicate removal in dropDuplicateTrailingPrompt. Keep the original prompt text for output and only normalize separately when needed for comparisons.

Useful? React with 👍 / 👎.

@steipete steipete force-pushed the lizzo-09598fd5-codex-context-engine-port branch 2 times, most recently from cc510f6 to 8dc77e7 Compare April 24, 2026 03:14
@openclaw-barnacle openclaw-barnacle Bot added the gateway Gateway runtime label Apr 24, 2026
@steipete steipete force-pushed the lizzo-09598fd5-codex-context-engine-port branch from 8dc77e7 to dd38014 Compare April 24, 2026 03:16

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8dc77e77d6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +543 to +545
runMaintenance: runHarnessContextEngineMaintenance,
sessionManager,
warn: (message) => embeddedAgentLog.warn(message),

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reopen session manager before turn maintenance

runCodexAppServerAttempt passes the sessionManager instance that was opened before the turn into finalizeHarnessContextEngineTurn, but transcript mirroring just before this step writes through a different SessionManager instance. If a context-engine maintenance pass uses runtimeContext.rewriteTranscriptEntries, it can rewrite from that stale pre-turn branch and move the active leaf away from the newly mirrored turn messages. Reopen the session file (or avoid passing sessionManager here) after mirroring so maintenance operates on fresh transcript state.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: da51769870

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@@ -123,7 +137,10 @@ export async function runCodexAppServerAttempt(
runId: params.runId,
},
});
const historyMessages = readMirroredSessionHistoryMessages(params.sessionFile);
const hadSessionFile = await fileExists(params.sessionFile);
const sessionManager = SessionManager.open(params.sessionFile);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Guard SessionManager opening before reading history

runCodexAppServerAttempt now opens SessionManager unconditionally before any context-engine checks. If sessionFile is malformed or otherwise unreadable, SessionManager.open(...) can throw here and abort the entire Codex turn, whereas the previous flow tolerated history-read failures and continued with empty history. This regresses resilience for corrupted session transcripts and can block users from sending any further turns until manual file repair.

Useful? React with 👍 / 👎.

Comment on lines 99 to 100
return enqueueCommandInLane(sessionLane, () =>
enqueueGlobal(async () => {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Dispose context engine when queue enqueue fails

The context engine is resolved before lane enqueue, but disposal is only handled in the harness-result branch or inside the queued callback. If enqueueCommandInLane(...) rejects before executing the task (for example while gateway draining rejects new work), this function exits without running the inner finally, leaking the resolved context engine instance. Repeated compaction attempts during restart/drain windows can accumulate leaked engine resources.

Useful? React with 👍 / 👎.

@steipete steipete force-pushed the lizzo-09598fd5-codex-context-engine-port branch from ef3b9b3 to 3e2f9f3 Compare April 24, 2026 03:54
@steipete steipete requested a review from a team as a code owner April 24, 2026 03:54
@openclaw-barnacle openclaw-barnacle Bot removed the gateway Gateway runtime label Apr 24, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3e2f9f34fd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +568 to +570
const finalMessages =
readMirroredSessionHistoryMessages(params.sessionFile) ??
historyMessages.concat(result.messagesSnapshot);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Fall back to in-memory turn messages when mirror write fails

mirrorTranscriptBestEffort intentionally swallows transcript write errors, but finalization immediately prefers readMirroredSessionHistoryMessages(...) and only falls back to historyMessages.concat(result.messagesSnapshot) when the read itself fails. If the file is still readable after a failed mirror append, finalizeHarnessContextEngineTurn receives pre-turn history only, so afterTurn/ingestBatch can miss the current user+assistant turn and silently desynchronize context-engine memory from the actual run.

Useful? React with 👍 / 👎.

Comment on lines +64 to +68
const { model: ceModel } = await resolveModelAsync(
ceProvider,
ceModelId,
agentDir,
params.config,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer model resolution until after harness compaction short-circuit

This resolveModelAsync(...) call now runs before maybeCompactAgentHarnessSession(...), so compaction can fail on model-resolution errors before a non-PI harness gets to handle compaction. In cases like Codex harness/native thread compaction, an unknown or stale model id becomes a hard blocker even though the harness path itself does not require this lookup, which is a regression from the previous short-circuit ordering.

Useful? React with 👍 / 👎.

@steipete steipete merged commit 51186d2 into openclaw:main Apr 24, 2026
65 checks passed
@steipete

Copy link
Copy Markdown
Contributor

Landed via squash as 51186d2. Thanks @jalehman.\n\nValidation:\n- pnpm check:changed\n- pnpm check:docs\n- focused Codex/context-engine tests\n- CI green

steipete added a commit that referenced this pull request Apr 24, 2026
…@jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
globalcaos pushed a commit to globalcaos/tinkerclaw that referenced this pull request May 13, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
globalcaos pushed a commit to globalcaos/tinkerclaw that referenced this pull request May 13, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
…claw#70809)

Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.

Thanks @jalehman.
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
… (thanks @jalehman)

Co-authored-by: Josh Lehman <josh@martian.engineering>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation extensions: codex maintainer Maintainer-authored PR size: XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants