Isolate Codex app-server state per agent#74556

Merged
pashpashpash merged 12 commits into main from
codex/agent-codex-home-isolation
Apr 30, 2026
@pashpashpash
Contributor

@pashpashpash pashpashpash commented Apr 29, 2026

Codex app-server mode was still inheriting the gateway process's Codex state unless an operator manually set environment overrides. That meant an OpenClaw agent could start a native Codex thread with the maintainer's personal Codex skills, plugins, account, config, and thread state. This also explains how Student found Pash-only Codex skills: Codex's own loader reads both $CODEX_HOME/skills and $HOME/.agents/skills.

This PR makes the local stdio app-server path set both CODEX_HOME and HOME to agent-owned directories under the OpenClaw agent state. WebSocket app-server connections are still untouched because OpenClaw does not own that external process. Explicit programmatic overrides still work, but appServer.clearEnv cannot erase the managed isolation variables during local launches. The OpenClaw auth bridge still applies Codex auth profiles into the app-server, so auth remains OpenClaw-owned without copying personal ~/.codex state.
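
For readers skimming the thread, the env layering described above can be sketched roughly as follows. This is a minimal illustration and not the code in auth-bridge.ts: the names `buildIsolatedChildEnv`, `StdioStartOptions`, and `MANAGED_KEYS` are hypothetical, and the `codex-home` and nested `home` directory layout is an assumption based on the review notes.

```typescript
// Hypothetical shapes; the real OpenClaw types and names may differ.
type Env = Record<string, string | undefined>;

interface StdioStartOptions {
  env?: Env; // explicit programmatic overrides
  clearEnv?: string[]; // variable names to drop from the child environment
}

// Variables that clearEnv is not allowed to remove for local stdio launches.
const MANAGED_KEYS = new Set<string>(["CODEX_HOME", "HOME"]);

function buildIsolatedChildEnv(
  parentEnv: Env,
  agentStateDir: string,
  options: StdioStartOptions = {},
): Env {
  // Assumed layout: <agentStateDir>/codex-home with a nested home directory.
  const codexHome = `${agentStateDir}/codex-home`;
  const managed: Env = { CODEX_HOME: codexHome, HOME: `${codexHome}/home` };

  // Inherited env first, then managed isolation values, then explicit overrides.
  const env: Env = { ...parentEnv, ...managed, ...options.env };

  for (const name of options.clearEnv ?? []) {
    if (MANAGED_KEYS.has(name)) continue; // isolation variables survive clearEnv
    delete env[name];
  }
  return env;
}
```

With this ordering, a gateway-level HOME never reaches the child process, an explicit `env.CODEX_HOME` override still wins, and listing `HOME` in `clearEnv` is a no-op.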

The PR also adds the production migration path for users who intentionally want to bring personal Codex CLI assets into an OpenClaw agent. openclaw migrate codex --dry-run inventories Codex CLI skills, native plugins, config, and hooks. Applying the migration copies skills into the current OpenClaw agent workspace. Codex native plugins, hooks, and config are kept manual-review/report-only because they can execute commands, expose MCP servers, or carry credentials.
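
As a rough illustration of the copy-versus-report split described above, the sketch below classifies inventoried assets. It assumes a simplified asset model; `CodexAsset`, `PlanItem`, and `planCodexMigration` are hypothetical names and do not mirror extensions/codex/src/migration/plan.ts.

```typescript
// Hypothetical asset model for illustration only.
type CodexAssetKind = "skill" | "plugin" | "hook" | "config";

interface CodexAsset {
  kind: CodexAssetKind;
  path: string;
}

interface PlanItem extends CodexAsset {
  action: "copy" | "manual-review";
  reason: string;
}

function planCodexMigration(assets: CodexAsset[]): PlanItem[] {
  return assets.map((asset): PlanItem => {
    if (asset.kind === "skill") {
      // Skills are the only asset type copied into the agent workspace.
      return { ...asset, action: "copy", reason: "skill directory" };
    }
    // Plugins, hooks, and config can execute commands, expose MCP servers,
    // or carry credentials, so they are only reported for manual review.
    return { ...asset, action: "manual-review", reason: `${asset.kind} requires review` };
  });
}
```

A dry run would print the resulting plan without touching the workspace; applying it would act only on the items marked `copy`.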

Doctor now warns when Codex-mode setups have personal Codex CLI assets that will not load implicitly under the isolated app-server home. The Codex harness, migrate, doctor, and skills docs now describe the boundary clearly: OpenClaw plugins and OpenClaw skill snapshots flow through OpenClaw; native Codex home state is per-agent unless deliberately promoted.

While babysitting CI for the production-ready bundle, OpenGrep also caught two external channel system events that needed explicit trust boundaries. The branch now marks the affected Signal and Slack inbound system events as untrusted and adds targeted coverage so channel-originated text cannot be treated as trusted host context.
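
The trust boundary can be pictured with a small sketch. The `SystemEvent` shape and both helper functions are hypothetical and only illustrate the idea; the actual Signal and Slack handlers in this branch may model trust differently.

```typescript
// Hypothetical event shape; not the real OpenClaw system-event type.
interface SystemEvent {
  source: "host" | "signal" | "slack";
  text: string;
  trusted: boolean;
}

// Channel-originated text is always tagged untrusted at the point of entry.
function inboundChannelEvent(source: "signal" | "slack", text: string): SystemEvent {
  return { source, text, trusted: false };
}

// Only explicitly trusted events may be treated as host context.
function trustedHostContext(events: SystemEvent[]): string[] {
  return events.filter((event) => event.trusted).map((event) => event.text);
}
```

Tagging at ingestion means downstream code never has to guess whether a string came from an external channel.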

@pashpashpash pashpashpash requested a review from a team as a code owner April 29, 2026 18:56
@openclaw-barnacle openclaw-barnacle Bot added the docs (Improvements or additions to documentation), gateway (Gateway runtime), extensions: codex, size: S, and maintainer (Maintainer-authored PR) labels Apr 29, 2026
@clawsweeper
Contributor

clawsweeper Bot commented Apr 29, 2026

Codex review: needs maintainer review before merge.

What this changes:

The PR isolates local Codex stdio app-server CODEX_HOME and HOME per agent, adds Codex CLI migration, doctor, docs, and auth-profile forwarding coverage, and adds Slack/Signal untrusted system-event tests.

Maintainer follow-up before merge:

Protected maintainer label, XL scope, and security-sensitive Codex auth/state isolation make this a maintainer/security review path; no narrow autonomous repair is appropriate from this review.

Security review:

Security review cleared: No concrete security or supply-chain regression was found; the diff narrows inherited Codex state and keeps executable native Codex plugins, hooks, and config out of automatic activation.

Review details

Best possible solution:

Land the stdio-only Codex isolation and deliberate migration path after explicit maintainer/security review, keeping WebSocket app-server ownership external and keeping executable native Codex plugins, hooks, and config manual-review-only.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection on current main gives a high-confidence path: local stdio Codex app-server launches do not set CODEX_HOME or HOME, and the stdio transport copies process.env into the child environment.

Is this the best way to solve the issue?

Yes. The proposed direction is the narrow maintainable fix for OpenClaw-owned stdio launches while leaving external WebSocket processes alone; the remaining decision is maintainer/security acceptance of the XL implementation, not a smaller automated repair.

Acceptance criteria:

  • pnpm test extensions/codex/src/app-server/auth-bridge.test.ts extensions/codex/src/migration/provider.test.ts src/commands/doctor/shared/codex-native-assets.test.ts src/commands/migrate.test.ts src/commands/migrate/selection.test.ts extensions/slack/src/monitor/message-handler/prepare.test.ts extensions/signal/src/monitor.tool-result.sends-tool-summaries-responseprefix.test.ts
  • pnpm test src/agents/command/attempt-execution.cli.test.ts src/agents/pi-embedded-runner/run.overflow-compaction.test.ts
  • pnpm check:changed in Testbox before handoff if maintainers proceed with the PR

What I checked:

  • Protected label: The supplied PR context lists the protected maintainer label, plus size: XL and security-adjacent Codex, gateway, agents, Slack, Signal, and migration labels, so repository policy requires explicit maintainer handling rather than automated closure or repair. (f1b3886c9f2b)
  • Current main still inherits Codex app-server state: On current main, bridgeCodexAppServerStartOptions only clears inherited OpenAI API key variables for stdio launches; it does not assign CODEX_HOME or HOME before returning start options. (extensions/codex/src/app-server/auth-bridge.ts:24, 581fbea1d653)
  • Stdio transport copies gateway environment: The stdio transport builds the child environment by copying process.env before applying startOptions.env and clearEnv, which is the concrete inheritance path this PR targets. (extensions/codex/src/app-server/transport-stdio.ts:46, 581fbea1d653)
  • Current docs describe inherited stdio environment: Current main documentation still says stdio app-server launches inherit OpenClaw's process environment by default and can use the app-server's existing local Codex CLI sign-in. Public docs: docs/plugins/codex-harness.md. (docs/plugins/codex-harness.md:513, 581fbea1d653)
  • PR isolates local Codex homes: The PR head adds agent-owned codex-home and nested home directories, injects CODEX_HOME and HOME for stdio launches, creates those directories, and prevents clearEnv from removing those managed isolation variables. (extensions/codex/src/app-server/auth-bridge.ts:36, f1b3886c9f2b)
  • Migration keeps executable Codex state explicit: The PR's Codex migration plan copies skill directories but reports native Codex plugins as manual items and archives config/hooks for review instead of auto-activating them. (extensions/codex/src/migration/plan.ts:70, f1b3886c9f2b)

Likely related people:

  • pashpashpash: The supplied prior ClawSweeper review ties this contributor to earlier merged Codex app-server auth bridge, stdio env handling, API-key clearing, and auth handoff work beyond this PR. (role: introduced behavior / adjacent owner; confidence: high; commits: a412603bad53, 401ae38f13a3, 20ff49f7c82d; files: extensions/codex/src/app-server/auth-bridge.ts)
  • steipete: Local history and supplied review context tie this maintainer to recent channel trust-boundary work and broader Codex/runtime security-adjacent maintenance touched by this PR. (role: recent maintainer / security-adjacent owner; confidence: high; commits: b743506549d6, e6cd90e3fd9c, 470098bd26f3; files: extensions/signal/src/monitor/event-handler.ts, extensions/slack/src/monitor/message-handler/prepare.ts, extensions/codex/src/app-server/auth-bridge.ts)
  • keshavbotagent: Current-main history includes a recent OpenAI Codex OAuth transport fix, which is adjacent to the Codex auth/profile forwarding surface this PR adjusts. (role: recent adjacent owner; confidence: medium; commits: 388019f5b693; files: extensions/openai/openai-codex-auth-identity.ts, extensions/openai/openai-codex-provider.ts, src/agents/openai-transport-stream.ts)

Remaining risk / open question:

  • The branch is XL and security-sensitive, so maintainer/secops review should verify the Codex migration, doctor warning behavior, auth-profile forwarding, and final CI after rebase.
  • Current main already contains adjacent channel trust and Codex OAuth changes, so merge review should check for drift against 581fbea before landing.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 581fbea1d653.

@pashpashpash pashpashpash force-pushed the codex/agent-codex-home-isolation branch 4 times, most recently from 113e9dc to 234d530, April 29, 2026 21:32
@pashpashpash pashpashpash changed the title from "Isolate Codex app-server homes per agent" to "Isolate Codex app-server state per agent" Apr 29, 2026
@openclaw-barnacle openclaw-barnacle Bot added the channel: signal (Channel integration: signal) and channel: slack (Channel integration: slack) labels Apr 29, 2026
@pashpashpash pashpashpash force-pushed the codex/agent-codex-home-isolation branch 2 times, most recently from 7c6a5cd to 4ebdf56, April 30, 2026 15:17
@pashpashpash
Contributor Author

Addressed the ClawSweeper P2 in c539aac: doctor now scans HOME/.agents/skills alongside CODEX_HOME/skills, dedupes discovered hits, and the doctor test covers that personal AgentSkills root. Targeted validation passed with the requested Codex app-server, migration, doctor, Slack, and Signal tests.
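
A minimal sketch of the two-root scan with deduplication, under stated assumptions: `findPersonalCodexSkills` and the injected `listDir` callback are hypothetical stand-ins, and the real doctor check in c539aac reads the filesystem and may dedupe on different keys.

```typescript
// Hypothetical helper; listDir is injected so the sketch stays filesystem-free.
type ListDir = (dir: string) => string[];

function findPersonalCodexSkills(
  env: { CODEX_HOME?: string; HOME?: string },
  listDir: ListDir,
): string[] {
  // The two roots Codex's loader reads, per the PR description.
  const candidates = [
    env.CODEX_HOME ? `${env.CODEX_HOME}/skills` : undefined,
    env.HOME ? `${env.HOME}/.agents/skills` : undefined,
  ];
  const roots = new Set<string>(
    candidates.filter((root): root is string => root !== undefined),
  );

  // A Set keeps each discovered skill path once, in first-seen order.
  const hits = new Set<string>();
  roots.forEach((root) => {
    for (const name of listDir(root)) {
      hits.add(`${root}/${name}`);
    }
  });
  return Array.from(hits);
}
```

Deduplicating on the full path covers the case where both variables resolve to the same skills directory.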

@openclaw-barnacle openclaw-barnacle Bot added the agents (Agent runtime and tooling) and cli (CLI command changes) labels Apr 30, 2026
@pashpashpash pashpashpash force-pushed the codex/agent-codex-home-isolation branch from dfa61e4 to 92fc188, April 30, 2026 19:48
@pashpashpash pashpashpash merged commit 027ea5f into main Apr 30, 2026
90 checks passed
@pashpashpash pashpashpash deleted the codex/agent-codex-home-isolation branch April 30, 2026 19:49
lxe pushed a commit to lxe/openclaw that referenced this pull request May 6, 2026
* fix(codex): isolate app-server home per agent

* fix(codex): isolate native Codex assets per agent

* fix(channels): mark inbound system events untrusted

* fix(doctor): warn on personal Codex agent skills

* test(doctor): cover personal Codex agent skills warning

* fix(codex): forward auth profiles to harness runs

* fix(codex): preserve auto auth for harness runs

* fix(codex): auto-select harness auth profiles

* test(codex): type harness auth mock

* feat(codex): select migrated skills

* fix(codex): satisfy migration selection lint

* docs: add codex isolation changelog
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
Labels

agents (Agent runtime and tooling), channel: signal (Channel integration: signal), channel: slack (Channel integration: slack), cli (CLI command changes), commands (Command implementations), docs (Improvements or additions to documentation), extensions: codex, gateway (Gateway runtime), maintainer (Maintainer-authored PR), plugin: migrate-claude, plugin: migrate-hermes, size: XL
