Isolate Codex app-server state per agent #74556
Codex review: needs maintainer review before merge.
What this changes: The PR isolates local Codex stdio app-server CODEX_HOME and HOME per agent, adds Codex CLI migration, doctor, docs, and auth-profile forwarding coverage, and adds Slack/Signal untrusted system-event tests.
Maintainer follow-up before merge: Protected maintainer label, XL scope, and security-sensitive Codex auth/state isolation make this a maintainer/security review path; no narrow autonomous repair is appropriate from this review.
Security review: Cleared. No concrete security or supply-chain regression was found; the diff narrows inherited Codex state and keeps executable native Codex plugins, hooks, and config out of automatic activation.
Review details
Best possible solution: Land the stdio-only Codex isolation and deliberate migration path after explicit maintainer/security review, keeping WebSocket app-server ownership external and keeping executable native Codex plugins, hooks, and config manual-review-only.
Do we have a high-confidence way to reproduce the issue? Yes. Source inspection on current main gives a high-confidence path: local stdio Codex app-server launches do not set CODEX_HOME or HOME, and the stdio transport copies process.env into the child environment.
Is this the best way to solve the issue? Yes. The proposed direction is the narrow maintainable fix for OpenClaw-owned stdio launches while leaving external WebSocket processes alone; the remaining decision is maintainer/security acceptance of the XL implementation, not a smaller automated repair.
Acceptance criteria:
What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 581fbea1d653.
Addressed the ClawSweeper P2 in c539aac: doctor now scans HOME/.agents/skills alongside CODEX_HOME/skills, dedupes discovered hits, and the doctor test covers that personal AgentSkills root. Targeted validation passed with the requested Codex app-server, migration, doctor, Slack, and Signal tests.
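For readers following the doctor change, here is a minimal sketch of the scan-and-dedupe idea. It is not the code from c539aac: the helper names (`personalSkillRoots`, `dedupeHits`, `HomeEnv`) are hypothetical, and the fallback of CODEX_HOME to `HOME/.codex` when unset is an assumption based on the `~/.codex` state mentioned in the PR description.

```typescript
// Hypothetical helpers for illustration; not OpenClaw's actual doctor code.
interface HomeEnv {
  CODEX_HOME?: string;
  HOME?: string;
}

// Drop trailing slashes so "/a/skills/" and "/a/skills" compare equal.
function stripTrailingSlash(p: string): string {
  const trimmed = p.replace(/\/+$/, "");
  return trimmed.length > 0 ? trimmed : p;
}

// Collect the two personal skill roots named in the comment above:
// CODEX_HOME/skills and HOME/.agents/skills.
function personalSkillRoots(env: HomeEnv): string[] {
  const roots: string[] = [];
  // Assumption: an unset CODEX_HOME falls back to HOME/.codex.
  const codexHome =
    env.CODEX_HOME || (env.HOME ? `${stripTrailingSlash(env.HOME)}/.codex` : undefined);
  if (codexHome) roots.push(`${stripTrailingSlash(codexHome)}/skills`);
  if (env.HOME) roots.push(`${stripTrailingSlash(env.HOME)}/.agents/skills`);
  return Array.from(new Set(roots));
}

// Dedupe discovered skill paths while preserving first-seen order.
function dedupeHits(hits: string[]): string[] {
  return Array.from(new Set(hits.map(stripTrailingSlash)));
}
```

A real scan would also need to read the filesystem and might resolve symlinks before deduping; both are left out of this sketch.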
* fix(codex): isolate app-server home per agent
* fix(codex): isolate native Codex assets per agent
* fix(channels): mark inbound system events untrusted
* fix(doctor): warn on personal Codex agent skills
* test(doctor): cover personal Codex agent skills warning
* fix(codex): forward auth profiles to harness runs
* fix(codex): preserve auto auth for harness runs
* fix(codex): auto-select harness auth profiles
* test(codex): type harness auth mock
* feat(codex): select migrated skills
* fix(codex): satisfy migration selection lint
* docs: add codex isolation changelog
Codex app-server mode was still inheriting the gateway process' Codex state unless an operator manually set environment overrides. That meant an OpenClaw agent could start a native Codex thread with the maintainer's personal Codex skills, plugins, account, config, and thread state. It also explains how Student found Pash-only Codex skills: Codex's own loader reads both `$CODEX_HOME/skills` and `$HOME/.agents/skills`.

This makes the local stdio app-server path set both `CODEX_HOME` and `HOME` to agent-owned directories under the OpenClaw agent state. WebSocket app-server connections are still untouched because OpenClaw does not own that external process. Explicit programmatic overrides still work, but `appServer.clearEnv` cannot erase the managed isolation variables during local launches. The OpenClaw auth bridge still applies Codex auth profiles into the app-server, so auth remains OpenClaw-owned without copying personal `~/.codex` state.

The PR also adds the production migration path for users who intentionally want to bring personal Codex CLI assets into an OpenClaw agent. `openclaw migrate codex --dry-run` inventories Codex CLI skills, native plugins, config, and hooks. Applying the migration copies skills into the current OpenClaw agent workspace. Codex native plugins, hooks, and config are kept manual-review/report-only because they can execute commands, expose MCP servers, or carry credentials.

Doctor now warns when Codex-mode setups have personal Codex CLI assets that will not load implicitly under the isolated app-server home. The Codex harness, migrate, doctor, and skills docs now describe the boundary clearly: OpenClaw plugins and OpenClaw skill snapshots flow through OpenClaw; native Codex home state is per-agent unless deliberately promoted.
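The environment handling for local stdio launches could look roughly like the sketch below. It is illustrative only, not the code in this PR: `buildStdioChildEnv`, `StdioLaunchOptions`, `agentStateDir`, `envOverrides`, and the `codex-home` and `home` directory names are all hypothetical, and the ordering (inherit, clear, set managed variables, apply explicit overrides) is one assumed way to get the behavior described above.

```typescript
// Sketch only: names, directory layout, and precedence are assumptions,
// not OpenClaw's real API.
type EnvMap = Record<string, string | undefined>;

interface StdioLaunchOptions {
  agentStateDir: string;                 // hypothetical per-agent state root
  clearEnv?: string[];                   // keys to drop from the inherited env
  envOverrides?: Record<string, string>; // explicit programmatic overrides
}

function buildStdioChildEnv(
  parentEnv: EnvMap,
  opts: StdioLaunchOptions,
): Record<string, string> {
  const env: Record<string, string> = {};

  // 1. Start from the parent environment (e.g. process.env), skipping unset keys.
  for (const key of Object.keys(parentEnv)) {
    const value = parentEnv[key];
    if (value !== undefined) env[key] = value;
  }

  // 2. Apply clearEnv to the inherited values.
  for (const key of opts.clearEnv ?? []) {
    delete env[key];
  }

  // 3. Set the managed isolation variables after clearing, so clearEnv
  //    cannot erase them.
  env["CODEX_HOME"] = `${opts.agentStateDir}/codex-home`;
  env["HOME"] = `${opts.agentStateDir}/home`;

  // 4. Explicit overrides are applied last and therefore still win.
  const overrides = opts.envOverrides ?? {};
  for (const key of Object.keys(overrides)) {
    const value = overrides[key];
    if (value !== undefined) env[key] = value;
  }

  return env;
}
```

In this sketch, listing `HOME` or `CODEX_HOME` in `clearEnv` only removes the inherited personal value; the child process still receives the agent-owned paths. WebSocket connections would skip this function entirely, since OpenClaw does not spawn that process.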
While babysitting CI for the production-ready bundle, OpenGrep also caught two external channel system events that needed explicit trust boundaries. The branch now marks the affected Signal and Slack inbound system events as untrusted and adds targeted coverage so channel-originated text cannot be treated as trusted host context.