This RFC tracks a contract-first path for stabilizing the Pi/Codex agent runtime boundary.
Important scope calibration: this issue is not a commitment to do a large rewrite immediately. It is an umbrella that explains why the Phase 1 contract-test suite exists, what those tests protect, and what the smallest safe next step would be if maintainers agree to continue.
The core finding is narrow:
The AgentHarness registry/SPI is already useful and should not be redesigned casually.
The missing boundary is runtime-policy ownership: tools, auth/profile resolution, prompt overlays, schema normalization, transcript repair, delivery, fallback classification, transport params, and observability are still split across Pi runner code, Codex app-server glue, transports, tools, auth, and channels.
As Codex takes over more execution paths, it can bypass or reassemble policy that Pi previously owned implicitly.
The goal is risk control: lock behavior first, then decide whether to introduce a shared prepared-turn plan. No production runtime refactor should be required just because this issue exists.
What This Issue Is / Is Not
This issue is:
A maintainer-facing explanation for the Phase 1 contract PRs.
A map of OpenClaw-owned runtime policy that Pi and Codex should preserve consistently.
A place to record known red rows that should not be forgotten if/when runtime-plan work starts.
Recent evidence:
"Preserve dynamic tool hooks in Codex mode" (#70965) showed the failure mode clearly: Codex mode could execute OpenClaw dynamic tools without preserving the existing before_tool_call contract because dynamic-tool ownership was implicit. The fix was correct, but the root cause was architectural.
"[codex] Harden GPT-5.4 runtime paths" (#70743) and "[codex] Add Pi/Codex harness extension seams" (#70772) showed the broader GPT-5.4 pattern: empty, planning-only, and reasoning-only terminal outcomes, tool params, schema normalization, auth profile aliases, orphan turn repair, and follow-up delivery each needed separate fixes because policy was scattered across the runner, transport, channel, plugin, and auth layers.
Adapter lifecycle concern for any Codex runtime-plan consumption
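As an illustration of the kind of behavior a tool-hook contract row could lock, here is a minimal sketch. Every name in it (`ToolHooks`, `withHooks`, the `hookWrapped` marker) is hypothetical and not taken from the OpenClaw codebase. It also assumes that a blocked call skips both execution and the after hook, which the real contract may define differently.

```typescript
// Hypothetical types for illustration only; not the real OpenClaw API.
type ToolCall = { name: string; args: Record<string, unknown> };
type ToolResult = { ok: boolean; output: string };
type ToolFn = (call: ToolCall) => ToolResult;

interface ToolHooks {
  // Returning a string blocks the call with that reason.
  beforeToolCall(call: ToolCall): string | undefined;
  afterToolCall(call: ToolCall, result: ToolResult): void;
}

// Marker property used to detect an already-wrapped tool.
type WrappedToolFn = ToolFn & { hookWrapped?: true };

// Wraps a tool so hooks run around it, regardless of which runtime executes it.
function withHooks(tool: ToolFn, hooks: ToolHooks): ToolFn {
  // No double wrapping: an already-wrapped tool is returned unchanged.
  if ((tool as WrappedToolFn).hookWrapped) {
    return tool;
  }
  const wrapped: WrappedToolFn = (call: ToolCall): ToolResult => {
    const blockReason = hooks.beforeToolCall(call);
    if (blockReason !== undefined) {
      // Assumption: a blocked call neither executes nor fires the after hook.
      return { ok: false, output: "blocked: " + blockReason };
    }
    const result = tool(call);
    hooks.afterToolCall(call, result);
    return result;
  };
  wrapped.hookWrapped = true;
  return wrapped;
}
```

A contract row built on a wrapper like this would run the same scenario through both the Pi path and the Codex path, then assert the same hook order and the same single wrapping in each.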
Minimal Next Decision
After the Phase 1 test-only PRs merge, maintainers can choose one of these paths:
Stop there for now.
The contracts still add value by preventing accidental Pi/Codex divergence.
Add a small AgentRuntimePlan shape/prototype PR.
This should be additive, internal, and mostly inert: define the prepared-turn object and producer tests, but do not force Pi or Codex to consume it yet.
Migrate one domain through the plan as a proof point.
Pick the highest-confidence domain, likely tools or auth/profile, and require the existing contract rows to stay green. Do not combine this with file moves or renames.
Anything beyond that should be a separate maintainer decision, not assumed by this RFC.
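To make the second option concrete, the sketch below shows what an additive, inert AgentRuntimePlan producer might look like. Only the name AgentRuntimePlan comes from this RFC. The fields, `TurnRequest`, and `prepareTurnPlan` are assumptions chosen for illustration, and a real shape would need to cover the other policy domains listed in the summary.

```typescript
// Hypothetical shape: field names are illustrative, not an agreed design.
type RuntimeId = "pi" | "codex";

interface AgentRuntimePlan {
  readonly runtime: RuntimeId;
  readonly tools: readonly string[]; // tool names after policy filtering
  readonly authProfileId: string | null; // resolved auth profile, if any
  readonly promptOverlays: readonly string[];
}

// Hypothetical input describing one turn before policy is applied.
interface TurnRequest {
  runtime: RuntimeId;
  requestedTools: string[];
  deniedTools?: string[];
  authProfileId?: string;
}

// Producer only: builds the plan, but neither Pi nor Codex consumes it yet.
function prepareTurnPlan(req: TurnRequest): AgentRuntimePlan {
  const denied = req.deniedTools ?? [];
  // Deduplicate, drop denied tools, and sort so the plan is deterministic.
  const tools = req.requestedTools
    .filter((t, i, all) => all.indexOf(t) === i && denied.indexOf(t) === -1)
    .sort();
  const plan: AgentRuntimePlan = {
    runtime: req.runtime,
    tools: Object.freeze(tools),
    authProfileId: req.authProfileId ?? null,
    promptOverlays: Object.freeze([] as string[]),
  };
  return Object.freeze(plan);
}

// Usage: the same request yields the same policy for either runtime.
const piPlan = prepareTurnPlan({ runtime: "pi", requestedTools: ["read", "exec", "read"], deniedTools: ["exec"] });
const codexPlan = prepareTurnPlan({ runtime: "codex", requestedTools: ["read", "exec", "read"], deniedTools: ["exec"] });
```

Because the plan is frozen and nothing reads it at runtime, a PR of this kind could be reviewed through producer tests alone, which matches the "mostly inert" constraint above.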
Candidate Follow-Ups, Not Current Commitments
These are possible later steps if the contract-first approach proves useful:
Shared AgentRuntimePlan consumption by Pi.
Shared AgentRuntimePlan consumption by Codex for OpenClaw-owned policy.
Optional internal Harness V2 adapter layer.
Runner split by ownership boundary.
Naming/observability cleanup for pi-embedded-runner / runEmbeddedPiAgent.
Optional WS session pooling/latency work.
They should land only as small reversible PRs with contract tests already in place.
Safety Rules
Do not ship hook surfaces before contract tests prove default behavior.
Do not split files before parity behavior is locked.
Do not let Codex own OpenClaw runtime policy.
Do not remove user-visible functionality during any migration.
Keep structural and behavioral changes in separate PRs.
This issue is not:
pi-embedded-runner now.
Current Status
Phase 1 contract-test PRs are open, test-only, and intentionally avoid production behavior changes.
before_tool_call, execution, result middleware, after_tool_call, blocks/errors, telemetry, no double wrapping
openai/*, openai-codex/*, codex-cli/*, app-server startup/resume profile forwarding, no cross-provider leakage
codex/* harness startup
NO_REPLY, side-effect and block suppression, Codex terminal signal preservation
NO_REPLY TODO
parallel_tool_calls, openai-codex-responses, WS warmup default, provider prep composition
Why This Is Needed
The safer discipline is contract-first: prove the intended behavior before moving ownership around.
Architecture Framing
Desired ownership boundary:
What Phase 1 Gives Maintainers
Phase 1 gives maintainers a low-risk review baseline before any semantic migration:
todo rows instead of hidden assumptions.
Known Rows To Carry Forward
These rows should not block Phase 1. They are reminders for whichever follow-up path maintainers choose.
codex/* harness startup preserving openai-codex:* auth profiles
openai/* forced through Codex harness using OpenAI-Codex OAuth
NO_REPLY suppression
Acceptance Criteria For This RFC
This RFC succeeds if: