
agents: add strict-agentic execution contract and revise update_plan semantics #64241

Merged
steipete merged 2 commits into openclaw:main from electricsheephq:fix/gpt54-strict-agentic-contract
Apr 10, 2026

Conversation

Contributor

@100yenadmin 100yenadmin commented Apr 10, 2026

Summary

This is PR 1 of the GPT-5.4 / Codex agentic runtime parity program tracked in #64227 and scoped by #64228.

It adds an opt-in strict-agentic execution contract for embedded Pi agents and revises update_plan so it behaves like structured progress state instead of user-visible filler. The runtime now stops treating plan-only turns as acceptable completion in strict-agentic mode and fails closed with an explicit blocked response after the retry cap.

PR 1 is intentionally GPT-5-first: in this slice, strict-agentic only activates for embedded Pi openai and openai-codex GPT-5-family runs. Unsupported providers/models keep default behavior unless tools.experimental.planTool is explicitly enabled.

What changed

  • add agents.defaults.embeddedPi.executionContract plus per-agent override support
  • resolve strict-agentic behavior through explicit-agent-aware lookup so no-session-key / hook / cron-style flows use the right agent config
  • remove default OpenAI/Codex auto-enable for update_plan
  • auto-enable update_plan only for explicit tools.experimental.planTool or supported strict-agentic GPT-5 runs
  • make update_plan non-chatty and tolerant of extra step fields
  • honor planTool: false even when strict-agentic is configured
  • treat update_plan as non-progress in plan-only retry logic
  • detect both prose plans and structured bullet plans before retrying or failing closed
  • give supported strict-agentic runs two plan-only retries before returning an explicit blocked state
  • keep the slice embedded-Pi-only rather than broadening into a provider-agnostic execution-contract framework
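
The gating rules in the list above can be sketched roughly as follows. This is an illustration, not the code from this PR: `GateInput`, `isStrictAgenticActive`, and `shouldEnableUpdatePlan` are hypothetical names, and the `"default"` contract value is an assumption. The provider list and the GPT-5 regex follow what the review thread quotes.

```typescript
// Assumed literal values for the contract; only "strict-agentic" is named in the PR text.
type ExecutionContract = "default" | "strict-agentic";

// Hypothetical input shape gathering everything the gate needs.
interface GateInput {
  planTool?: boolean; // tools.experimental.planTool, when explicitly set
  executionContract?: ExecutionContract; // resolved per-agent or default contract
  provider?: string;
  modelId?: string;
}

// Strict-agentic only activates for openai / openai-codex GPT-5-family runs.
function isStrictAgenticActive(input: GateInput): boolean {
  if (input.executionContract !== "strict-agentic") {
    return false;
  }
  const provider = (input.provider ?? "").trim().toLowerCase();
  if (provider !== "openai" && provider !== "openai-codex") {
    return false;
  }
  return /^gpt-5(?:[.-]|$)/i.test((input.modelId ?? "").trim());
}

// An explicit planTool value always wins; otherwise follow strict-agentic activation.
function shouldEnableUpdatePlan(input: GateInput): boolean {
  if (input.planTool !== undefined) {
    return input.planTool;
  }
  return isStrictAgenticActive(input);
}
```

Under these assumptions, an OpenAI GPT-5 run with no contract configured no longer gets update_plan, while `planTool: false` keeps it off even when strict-agentic is active.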

Why

GPT-5.4 / Codex currently stalls too easily after planning or recap-style turns. This slice makes that behavior fixable on an opt-in basis at the runtime-contract level, without changing the default execution mode for every agent.

Non-goals

Builds on prior groundwork

Validation

Focused checks run:

  • CI=1 pnpm vitest run src/agents/openclaw-tools.update-plan.test.ts
  • CI=1 pnpm vitest run src/agents/tools/update-plan-tool.test.ts
  • CI=1 pnpm vitest run src/agents/pi-embedded-runner/run.incomplete-turn.test.ts
  • CI=1 pnpm vitest run src/config/zod-schema.agent-defaults.test.ts
  • CI=1 pnpm vitest run src/agents/system-prompt.test.ts
  • CI=1 pnpm vitest run src/agents/tool-catalog.test.ts
  • CI=1 pnpm vitest run src/agents/openclaw-tools.sessions.test.ts
  • CI=1 pnpm vitest run src/agents/openclaw-tools.nodes-workspace-guard.test.ts
  • CI=1 pnpm vitest run src/agents/pi-embedded-runner/system-prompt.test.ts
  • CI=1 pnpm vitest run extensions/openai/transport-policy.test.ts
  • CI=1 pnpm vitest run src/plugin-sdk/provider-tools.test.ts

Linked issues

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: M labels Apr 10, 2026
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation gateway Gateway runtime size: L and removed size: M labels Apr 10, 2026
@100yenadmin 100yenadmin marked this pull request as ready for review April 10, 2026 11:37
Copilot AI review requested due to automatic review settings April 10, 2026 11:37
Contributor Author

PR1 is now ready for review and the non-draft CI path has started running on 9539b56c.

I wasn’t able to use GitHub’s reviewer-request mutation from this fork context, so I’m tagging likely runtime maintainers here instead: @steipete @vincentkoc.

This slice is intentionally narrow and GPT-5-first:

  • embedded-Pi only
  • no auth classification
  • no permission truthfulness changes
  • no continuation/liveness scope

Tracked by #64227 and scoped by #64228.

Contributor

greptile-apps Bot commented Apr 10, 2026

Greptile Summary

This PR adds an opt-in strict-agentic execution contract for embedded Pi agents running OpenAI/Codex GPT-5-family models. When enabled, the runtime detects plan-only turns (text-only responses that promise future action without actually calling tools), retries up to twice with an "act now" steering instruction, and fails closed with an explicit blocked-state payload if retries are exhausted. The update_plan tool auto-enable is also tightened so it only activates for strict-agentic GPT-5 runs or explicit tools.experimental.planTool: true, removing the previous OpenAI/Codex blanket auto-enable.
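
The retry-then-block cycle described in this summary can be sketched as a small synchronous loop. Everything here is illustrative: `AttemptResult`, `runWithStrictAgentic`, `isPlanOnlyTurn`, the steering wording, and the blocked-text wording are assumptions, and the one-line plan heuristic is a stand-in for the runner's real detection. Only the two-retry cap, the "act now" steering idea, the `isError: true` blocked payload, and the constant name `STRICT_AGENTIC_BLOCKED_TEXT` come from the thread.

```typescript
interface AttemptResult {
  text: string;
  toolCalls: string[]; // names of tools called during the attempt
}

interface RunResult {
  text: string;
  isError: boolean;
  attempts: number;
}

// Placeholder wording; the real strings are not quoted in this thread.
const STRICT_AGENTIC_BLOCKED_TEXT = "Blocked: the agent kept planning without acting.";
const ACT_NOW_STEERING = "Act now: call a tool instead of describing the plan.";
const STRICT_PLAN_ONLY_RETRIES = 2;

// update_plan alone does not count as progress; the text check is a toy heuristic.
function isPlanOnlyTurn(result: AttemptResult): boolean {
  const didRealWork = result.toolCalls.some((name) => name !== "update_plan");
  return !didRealWork && /\b(?:i(?:'ll| will)|let me)\b/i.test(result.text);
}

function runWithStrictAgentic(
  attempt: (steering?: string) => AttemptResult,
): RunResult {
  let steering: string | undefined;
  // One initial attempt plus up to STRICT_PLAN_ONLY_RETRIES steered retries.
  for (let tries = 0; tries <= STRICT_PLAN_ONLY_RETRIES; tries++) {
    const result = attempt(steering);
    if (!isPlanOnlyTurn(result)) {
      return { text: result.text, isError: false, attempts: tries + 1 };
    }
    steering = ACT_NOW_STEERING;
  }
  // Fail closed: surface an explicit blocked state instead of the plan text.
  return {
    text: STRICT_AGENTIC_BLOCKED_TEXT,
    isError: true,
    attempts: STRICT_PLAN_ONLY_RETRIES + 1,
  };
}
```

A model that keeps answering with a plan is attempted three times in this sketch and then receives the blocked payload; a model that calls a real tool on any attempt returns normally.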

Confidence Score: 5/5

Safe to merge; the only finding is a minor P2 inconsistency in the blocked-state return.

All logic paths for the strict-agentic retry/block cycle are correct and well-tested. The GPT-5 guard, per-agent contract resolution, and update_plan auto-enable gating are consistent across all call sites. The single P2 comment (missing finalAssistantVisibleText in the blocked-state meta) is a minor cosmetic inconsistency that does not affect runtime correctness or user-facing behavior.

src/agents/pi-embedded-runner/run.ts around the blocked-state early return (line ~1562) — finalAssistantVisibleText is omitted from meta unlike every other exit path.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/run.ts
Line: 1562-1581

Comment:
**Missing `finalAssistantVisibleText` in blocked-state meta**

Every other early-exit return in this function (the retry-limit path at ~L1488, the incomplete-turn path at ~L1611, and the success path at L1645) includes `finalAssistantVisibleText` in `meta`. The blocked-state return omits it, so the planning text that caused the agent to be blocked is not preserved for callers that use this field for logging or display. The variable is already in scope (computed at L1450), so adding it here is a one-liner.

```suggestion
            return {
              payloads: [
                {
                  text: STRICT_AGENTIC_BLOCKED_TEXT,
                  isError: true,
                },
              ],
              meta: {
                durationMs: Date.now() - started,
                agentMeta,
                aborted,
                systemPromptReport: attempt.systemPromptReport,
                finalAssistantVisibleText,
              },
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "agents: catch structured plan-only turns" | Re-trigger Greptile

Comment thread src/agents/pi-embedded-runner/run.ts
@100yenadmin 100yenadmin force-pushed the fix/gpt54-strict-agentic-contract branch from 9539b56 to ee1c4b2 Compare April 10, 2026 11:43
Contributor

Copilot AI left a comment

Pull request overview

Adds an opt-in strict-agentic embedded-Pi execution contract (scoped to OpenAI/OpenAI-Codex GPT‑5 family) and reworks update_plan to behave as a structured progress signal rather than user-visible filler, including stricter handling of plan-only turns.

Changes:

  • Introduces agents.defaults.embeddedPi.executionContract plus per-agent overrides and plumbing to resolve the correct agent in no-session-key flows.
  • Revises update_plan tool output to be non-chatty (no text content) and tolerant of extra per-step fields; updates gating so it’s not auto-enabled outside strict-agentic unless explicitly configured.
  • Extends embedded-Pi runner “planning-only” detection (including structured bullet plans), treats update_plan as non-progress for retry logic, and fail-closes with an explicit blocked state after the retry cap in strict-agentic mode.
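
As a rough sketch of the "non-chatty, tolerant of extra step fields" behavior, a tool handler could return empty content plus structured details and silently drop unknown per-step fields. The field names `step` and `status`, the three status values, and the function name `runUpdatePlan` are assumptions for illustration; the thread only confirms an empty `content` array and a structured `details.plan`.

```typescript
type StepStatus = "pending" | "in_progress" | "completed"; // assumed status values

interface PlanStep {
  step: string;
  status: StepStatus;
}

interface UpdatePlanResult {
  content: Array<{ type: "text"; text: string }>; // stays empty: nothing chatty for the user
  details: { plan: PlanStep[] };
}

function isStepStatus(value: unknown): value is StepStatus {
  return value === "pending" || value === "in_progress" || value === "completed";
}

// Hypothetical handler: validates the known fields and ignores any extras.
function runUpdatePlan(input: { plan: Array<Record<string, unknown>> }): UpdatePlanResult {
  const plan = input.plan.map((raw, index): PlanStep => {
    const rawStep = raw["step"];
    const step = typeof rawStep === "string" ? rawStep.trim() : "";
    const status = raw["status"];
    if (step === "" || !isStepStatus(status)) {
      throw new Error(`plan[${index}] needs a non-empty step and a known status`);
    }
    // Only the known fields are copied, so extra per-step fields are tolerated.
    return { step, status };
  });
  return { content: [], details: { plan } };
}
```

Returning the plan only under `details` keeps it available as progress state without adding user-visible text to the turn.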

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/config/zod-schema.agent-runtime.ts | Adds per-agent embeddedPi.executionContract config schema. |
| src/config/zod-schema.agent-defaults.ts | Adds default embeddedPi.executionContract config schema. |
| src/config/zod-schema.agent-defaults.test.ts | Tests parsing of embeddedPi.executionContract in defaults. |
| src/config/types.tools.ts | Updates experimental planTool docs/comments to reflect new gating intent. |
| src/config/types.agents.ts | Adds per-agent embeddedPi.executionContract override type surface. |
| src/config/types.agent-defaults.ts | Introduces EmbeddedPiExecutionContract type and documents semantics. |
| src/config/schema.labels.ts | Adds labels for new embeddedPi execution contract fields. |
| src/config/schema.help.ts | Adds help text for new execution contract fields and revised planTool help. |
| src/config/schema.base.generated.ts | Regenerates base schema to include new fields/help. |
| src/agents/tools/update-plan-tool.ts | Makes update_plan non-chatty and tolerant of extra step fields. |
| src/agents/tools/update-plan-tool.test.ts | Updates expectations and adds coverage for extra step fields. |
| src/agents/pi-tools.ts | Threads modelId into tool construction for model-aware gating. |
| src/agents/pi-embedded-runner/run/incomplete-turn.ts | Enhances planning-only detection and adds strict-agentic retry limits/blocked text. |
| src/agents/pi-embedded-runner/run.ts | Resolves execution contract per agent, applies retry cap, surfaces blocked state on exhaustion. |
| src/agents/pi-embedded-runner/run.incomplete-turn.test.ts | Adds coverage for strict-agentic retries/blocked state and bullet-plan detection. |
| src/agents/openclaw-tools.update-plan.test.ts | Updates/extends tests for new update_plan gating rules and agent override resolution. |
| src/agents/openclaw-tools.ts | Passes modelId and uses session-agent resolution that supports explicit agentId override. |
| src/agents/openclaw-tools.registration.ts | Changes update_plan gating to depend on strict-agentic activation (or explicit flag). |
| src/agents/agent-scope.ts | Adds execution contract resolution and strict-agentic activation helper. |
| docs/gateway/configuration-reference.md | Updates docs for new planTool default behavior under strict-agentic. |

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts Outdated
Comment thread src/config/types.tools.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9539b56ce4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts Outdated
Contributor

Copilot AI left a comment

Pull request overview

Adds an opt-in embedded-Pi execution contract (strict-agentic) to prevent GPT‑5-family OpenAI/Codex runs from “finishing” on plan-only turns, and revises update_plan to behave like structured progress state rather than user-visible filler.

Changes:

  • Introduces agents.defaults.embeddedPi.executionContract and per-agent overrides, with schema/labels/help/docs updates.
  • Revises update_plan to return structured details with no chatty content, and tolerates extra per-step fields.
  • Updates embedded Pi planning-only retry logic to treat update_plan as non-progress, detect structured/bulleted plans, retry (2x in strict-agentic), and fail closed with an explicit blocked response.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/config/zod-schema.agent-runtime.ts | Adds per-agent embeddedPi.executionContract to runtime agent entry schema. |
| src/config/zod-schema.agent-defaults.ts | Adds defaults-level embeddedPi.executionContract schema. |
| src/config/zod-schema.agent-defaults.test.ts | Validates defaults schema accepts embeddedPi.executionContract. |
| src/config/types.tools.ts | Updates planTool docs/comments to remove OpenAI auto-enable implication. |
| src/config/types.agents.ts | Adds per-agent embeddedPi.executionContract override typing. |
| src/config/types.agent-defaults.ts | Defines EmbeddedPiExecutionContract type and documents semantics. |
| src/config/schema.labels.ts | Adds labels for new embeddedPi contract fields (defaults + per-agent). |
| src/config/schema.help.ts | Adds help text for embeddedPi contract fields and revises planTool help. |
| src/config/schema.base.generated.ts | Updates generated base schema output for new fields/help text. |
| src/agents/tools/update-plan-tool.ts | Makes update_plan non-chatty (empty content) and tolerant of extra step fields. |
| src/agents/tools/update-plan-tool.test.ts | Updates expectations and adds coverage for ignoring extra per-step fields. |
| src/agents/pi-tools.ts | Threads modelId into tool creation for model/provider-specific gating. |
| src/agents/pi-embedded-runner/run/incomplete-turn.ts | Adds structured plan detection, update_plan non-progress handling, retry-limit helper, blocked text. |
| src/agents/pi-embedded-runner/run.ts | Resolves execution contract per session/agentId, applies strict-agentic retry cap + blocked fail-closed response. |
| src/agents/pi-embedded-runner/run.incomplete-turn.test.ts | Adds strict-agentic blocked-state test (explicit agentId/no sessionKey) + structured plan detection tests. |
| src/agents/openclaw-tools.update-plan.test.ts | Updates gating tests: no default auto-enable; strict-agentic GPT‑5 auto-enable; planTool=false override; explicit agentId resolution. |
| src/agents/openclaw-tools.ts | Passes modelId and agentId/session context into update_plan gating; uses resolveSessionAgentIds. |
| src/agents/openclaw-tools.registration.ts | Reworks update_plan gating to depend on strict-agentic activation logic. |
| src/agents/agent-scope.ts | Adds execution-contract resolution + strict-agentic activation predicate (provider/model scoped). |
| docs/gateway/configuration-reference.md | Updates documentation for new planTool defaults/auto-enable behavior under strict-agentic GPT‑5. |

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts
Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23d0a57e9a

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 310c27ab64

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts Outdated
Contributor Author

CI is green on 310c27ab, and the follow-up fixes from the initial review pass are already folded in.

Scope is still intentionally narrow to #64228 only:

  • embedded-Pi strict-agentic same-turn behavior for openai / openai-codex GPT-5-family runs
  • revised update_plan semantics and gating
  • no permission, replay, or generic provider-contract expansion in this slice

This PR is ready for maintainer review.

@steipete steipete requested a review from a team as a code owner April 10, 2026 17:22
@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord channel: imessage Channel integration: imessage channel: matrix Channel integration: matrix channel: telegram Channel integration: telegram channel: whatsapp-web Channel integration: whatsapp-web app: web-ui App: web-ui extensions: memory-core Extension: memory-core cli CLI command changes labels Apr 10, 2026
Contributor Author

Rebuilt this branch from a clean upstream/main history and force-pushed it back over the polluted branch state. The PR is now back to the intended strict-agentic slice only, with the earlier review fixes preserved.

Current intended scope remains:

  • embedded-Pi strict-agentic same-turn behavior for openai / openai-codex GPT-5-family runs
  • revised update_plan gating and non-chatty semantics
  • no auth, permission, replay, or harness scope in this PR

This should address the dirty-branch bot warning and make the diff reviewable again.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an opt-in embedded Pi “strict-agentic” execution contract (configurable globally and per-agent) and updates the update_plan tool semantics + gating so GPT‑5 OpenAI/Codex runs can be kept moving past plan-only turns, while update_plan becomes structured progress state rather than user-visible filler.

Changes:

  • Add agents.defaults.embeddedPi.executionContract and per-agent agents.list[].embeddedPi.executionContract plumbing across config types, Zod schema, labels/help, and generated schema.
  • Revise update_plan tool behavior (no chatty text output; tolerate extra per-step fields) and adjust gating so it’s only auto-enabled for supported strict-agentic GPT‑5 OpenAI/Codex runs (unless explicitly enabled/disabled).
  • Harden embedded runner planning-only retry logic: treat update_plan as non-progress, detect structured “Plan:”/bullet plans, allow more retries under strict-agentic, then fail closed with an explicit blocked state.
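
To make "structured plan detection" concrete, here is a deliberately simple heuristic sketch. The actual rules in incomplete-turn.ts are not shown in this thread, so the patterns and the name `looksLikePlanOnlyText` are assumptions that only illustrate the two cases mentioned: a "Plan:" heading followed by bullets, and prose that promises future action.

```typescript
// Hypothetical patterns; the runner's real heuristics may differ substantially.
const PLAN_HEADING = /^\s*plan\s*:/im;
const BULLET_LINE = /^\s*(?:[-*•]|\d+[.)])\s+\S/;
const PROSE_PROMISE = /\b(?:i(?:'ll| will)|let me|i(?:'m| am) going to)\b/i;

function looksLikePlanOnlyText(text: string): boolean {
  const lines = text.split(/\r?\n/);
  const bulletCount = lines.filter((line) => BULLET_LINE.test(line)).length;
  // Structured case: a "Plan:" heading followed by at least two bullet or numbered items.
  const structuredPlan = PLAN_HEADING.test(text) && bulletCount >= 2;
  // Prose case: the text promises future action instead of reporting results.
  const prosePlan = PROSE_PROMISE.test(text);
  return structuredPlan || prosePlan;
}
```

For example, `"Plan:\n- read the config\n- patch the runner"` and `"I'll inspect the file next."` are both flagged by this sketch, while `"Done. The tests pass."` is not.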

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/config/zod-schema.agent-runtime.ts | Adds per-agent embeddedPi.executionContract validation. |
| src/config/zod-schema.agent-defaults.ts | Adds default embeddedPi.executionContract validation. |
| src/config/zod-schema.agent-defaults.test.ts | Tests acceptance of embeddedPi.executionContract in defaults schema. |
| src/config/types.tools.ts | Updates experimental tool JSDoc to reflect strict-agentic auto-enable semantics. |
| src/config/types.agents.ts | Adds per-agent embedded Pi override type surface. |
| src/config/types.agent-defaults.ts | Introduces EmbeddedPiExecutionContract and documents contract semantics. |
| src/config/schema.labels.ts | Adds UI/schema labels for new embedded Pi execution contract fields. |
| src/config/schema.help.ts | Adds help text for new execution contract fields; updates planTool help text. |
| src/config/schema.base.generated.ts | Regenerates base schema to include executionContract + updated planTool help. |
| src/agents/tools/update-plan-tool.ts | Makes update_plan non-chatty (content: []) and tolerant of extra step fields. |
| src/agents/tools/update-plan-tool.test.ts | Updates expectations for new tool result shape + adds extra-field tolerance test. |
| src/agents/pi-tools.ts | Threads modelId into OpenClaw tool creation for provider/model-specific gating. |
| src/agents/pi-embedded-runner/run/incomplete-turn.ts | Adds structured plan detection, treats update_plan as non-progress, introduces strict-agentic retry limit + blocked text. |
| src/agents/pi-embedded-runner/run.ts | Resolves execution contract for session/explicit agentId; applies strict-agentic retry cap and blocked-state surfacing. |
| src/agents/pi-embedded-runner/run.incomplete-turn.test.ts | Adds coverage for bullet-plan detection, update_plan non-progress handling, strict-agentic blocked state behavior. |
| src/agents/openclaw-tools.update-plan.test.ts | Reworks gating tests: default-off, strict-agentic GPT‑5 auto-enable, explicit overrides, explicit agentId resolution. |
| src/agents/openclaw-tools.ts | Adds modelId to tool gating inputs; resolves agent ids via resolveSessionAgentIds. |
| src/agents/openclaw-tools.registration.ts | Replaces OpenAI-provider auto-enable with strict-agentic execution-contract-aware gating. |
| src/agents/agent-scope.ts | Adds execution contract resolution helpers + strict-agentic activation predicate (provider/model gated). |
| docs/gateway/configuration-reference.md | Updates docs to match new planTool auto-enable rules under strict-agentic. |
Comments suppressed due to low confidence (1)

src/agents/pi-embedded-runner/run/incomplete-turn.ts:144

  • shouldApplyPlanningOnlyRetryGuard() normalizes provider casing but does not trim modelId. Because resolveHookModelSelection can override modelId without trimming, it’s possible for strict-agentic to become active (isStrictAgenticExecutionContractActive trims modelId) while planning-only retry/blocked-state logic never triggers (regex fails on leading/trailing whitespace), allowing plan-only completion in strict-agentic runs. Trim modelId (and ideally normalize similarly to isStrictAgenticExecutionContractActive) before applying the /^gpt-5/ gate so strict-agentic behavior can’t be bypassed by whitespace in hook overrides.
function shouldApplyPlanningOnlyRetryGuard(params: {
  provider?: string;
  modelId?: string;
}): boolean {
  const provider = normalizeLowercaseStringOrEmpty(params.provider);
  if (provider !== "openai" && provider !== "openai-codex") {
    return false;
  }
  return /^gpt-5(?:[.-]|$)/i.test(params.modelId ?? "");
}
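
A minimal sketch of the trimmed variant suggested in this comment is shown below. The local `normalizeLowercaseStringOrEmpty` is a stand-in written for this example; the repository's real helper is not shown in the thread, and whether it trims is an assumption.

```typescript
// Stand-in for the repo helper of the same name (assumed behavior: trim + lowercase).
function normalizeLowercaseStringOrEmpty(value?: string): string {
  return (value ?? "").trim().toLowerCase();
}

function shouldApplyPlanningOnlyRetryGuard(params: {
  provider?: string;
  modelId?: string;
}): boolean {
  const provider = normalizeLowercaseStringOrEmpty(params.provider);
  if (provider !== "openai" && provider !== "openai-codex") {
    return false;
  }
  // Trim first so whitespace from hook overrides cannot bypass the GPT-5 gate.
  const modelId = (params.modelId ?? "").trim();
  return /^gpt-5(?:[.-]|$)/i.test(modelId);
}
```

With the trim in place, a model id such as `" gpt-5.4 "` passes the gate, while `"gpt-50"` still does not match because the regex requires `.`, `-`, or end of string after `gpt-5`.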

Contributor Author

Addressed the latest review pass on this branch.

Changes in this push:

  • use the resolved agentDir when marking auth profile usage inside the embedded runner
  • trim modelId before the GPT-5 strict-agentic guard regex so whitespace does not disable the retry/fail-closed path

Local validation:

  • CI=1 pnpm exec vitest run src/agents/openclaw-tools.update-plan.test.ts src/agents/tools/update-plan-tool.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/agents/system-prompt.test.ts src/agents/tool-catalog.test.ts

One note on lint:tmp:no-raw-channel-fetch: the current failure points at extensions/browser/src/browser/client-fetch.ts, which is outside this PR diff (git diff --name-only upstream/main...HEAD does not include that file). I’m treating that as shared/mainline debt unless CI shows a branch-specific regression here.

Contributor Author

Rebased this branch onto current upstream/main and pushed refreshed head cb1453de29.

Why this mattered:

  • the branch was 17 commits behind main
  • the current checks-node-test failure was coming from stale plugin-install test expectations on the older base, not from the strict-agentic runtime slice itself

Validation on the rebased head:

  • CI=1 pnpm exec vitest run src/agents/openclaw-tools.update-plan.test.ts src/agents/tools/update-plan-tool.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/agents/system-prompt.test.ts src/agents/tool-catalog.test.ts
  • CI=1 pnpm exec vitest run src/plugins/install.test.ts

Scope is still intentionally unchanged: this is the same strict-agentic GPT-5-first embedded-Pi slice, just refreshed onto current main so CI and maintainer review are looking at a clean base.

@100yenadmin 100yenadmin force-pushed the fix/gpt54-strict-agentic-contract branch from 3a7a574 to cb1453d Compare April 10, 2026 19:11
Contributor Author

Latest state on this slice:

  • Current head cb1453de29
  • Required checks are green, including checks-node-test
  • The stale bot findings on blocked-state metadata / resolved paths / GPT-5 model-id trimming were addressed on the rebased head

Scope stays intentionally narrow:

  • owns strict-agentic same-turn execution for embedded-Pi GPT-5-family openai / openai-codex
  • does not own auth/runtime classification, permission truthfulness, replay/liveness redesign, or parity benchmarking

At this point this should be the first merge candidate for the program.

Contributor Author

Current head cb1453de29 is green on the required matrix, including checks-node-test and check-additional.

The earlier bot findings around blocked-state meta, resolved paths, and trimmed model-id behavior are addressed on this rebased head. Scope remains intentionally narrow: strict-agentic execution only, GPT-5-first and embedded-Pi-scoped.

This should be the first merge candidate. If the current head looks good from a maintainer perspective, I’d love formal review on this slice.

@steipete steipete force-pushed the fix/gpt54-strict-agentic-contract branch from cb1453d to cd12729 Compare April 10, 2026 21:01
Contributor Author

100yenadmin commented Apr 10, 2026

@steipete I've finished these for your review: the ChatGPT major upgrade fixes (making GPT-5.4 as good as Opus 4.6 in OpenClaw). They're cut down to 4 PRs now.

Contributor

Copilot AI left a comment

Pull request overview

Adds an opt-in strict-agentic embedded Pi execution contract and redefines update_plan as a quiet, structured progress tool (vs. user-visible filler), with stricter handling of plan-only turns for supported GPT‑5 OpenAI/Codex runs.

Changes:

  • Introduces agents.defaults.embeddedPi.executionContract plus per-agent overrides, and resolves the active contract via explicit-agent-aware lookup (works for no-session-key / hook / cron flows).
  • Revises update_plan semantics: no chatty text output, tolerates extra per-step fields, and is no longer auto-enabled except for strict-agentic GPT‑5 OpenAI/Codex runs (or explicit tools.experimental.planTool).
  • Updates embedded Pi runner planning-only retry logic: treats update_plan as non-progress, retries more in strict-agentic mode, and fail-closes with an explicit blocked response when retries are exhausted.
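
The explicit-agent-aware lookup in the first bullet can be sketched as below. The config paths `agents.defaults.embeddedPi.executionContract` and `agents.list[].embeddedPi.executionContract` come from this PR; the `id` field, the `"default"` value, the `agent:<id>:...` session-key format, the `"main"` fallback, and the function names are assumptions made for illustration.

```typescript
type ExecutionContract = "default" | "strict-agentic"; // "default" is an assumed value

interface EmbeddedPiConfig {
  executionContract?: ExecutionContract;
}

interface SketchConfig {
  agents?: {
    defaults?: { embeddedPi?: EmbeddedPiConfig };
    list?: Array<{ id: string; embeddedPi?: EmbeddedPiConfig }>;
  };
}

interface LookupContext {
  agentId?: string; // explicit agent, e.g. from a hook or cron-style flow
  sessionKey?: string; // may be absent in no-session-key flows
}

// Explicit agentId wins; otherwise fall back to the session key, then an assumed default agent.
function resolveAgentId(ctx: LookupContext): string {
  const explicit = ctx.agentId?.trim();
  if (explicit) {
    return explicit;
  }
  const match = /^agent:([^:]+):/.exec(ctx.sessionKey ?? "");
  return match?.[1] ?? "main";
}

// Per-agent override first, then the defaults block, then the assumed "default" contract.
function resolveExecutionContract(cfg: SketchConfig, ctx: LookupContext): ExecutionContract {
  const agentId = resolveAgentId(ctx);
  const entry = cfg.agents?.list?.find((agent) => agent.id === agentId);
  return (
    entry?.embeddedPi?.executionContract ??
    cfg.agents?.defaults?.embeddedPi?.executionContract ??
    "default"
  );
}
```

The point of resolving through the explicit agentId is that a cron or hook run without a session key still picks up its own agent's override instead of silently falling back to the defaults.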

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/config/zod-schema.agent-runtime.ts | Adds per-agent embeddedPi.executionContract schema support. |
| src/config/zod-schema.agent-defaults.ts | Adds defaults schema for embeddedPi.executionContract. |
| src/config/zod-schema.agent-defaults.test.ts | Tests parsing of the new defaults execution contract. |
| src/config/types.tools.ts | Updates docs for experimental tool gating semantics. |
| src/config/types.agents.ts | Adds per-agent embedded Pi override typing. |
| src/config/types.agent-defaults.ts | Defines EmbeddedPiExecutionContract and documents semantics. |
| src/config/schema.labels.ts | Adds labels for new embedded Pi config fields. |
| src/config/schema.help.ts | Adds help text clarifying strict-agentic vs default behavior and planTool gating. |
| src/config/schema.base.generated.ts | Regenerates base config schema to include new fields/help/labels. |
| src/agents/tools/update-plan-tool.ts | Makes update_plan non-chatty; accepts extra per-step fields; returns structured details. |
| src/agents/tools/update-plan-tool.test.ts | Updates assertions for empty content + structured details.plan and extra-field tolerance. |
| src/agents/pi-tools.ts | Threads modelId into OpenClaw tool creation for provider/model-specific gating. |
| src/agents/pi-embedded-runner/run/incomplete-turn.ts | Enhances planning-only detection and retry semantics; adds strict-agentic retry limits and blocked text. |
| src/agents/pi-embedded-runner/run.ts | Resolves execution contract per session/agent, applies strict-agentic retry cap, and surfaces blocked state. |
| src/agents/pi-embedded-runner/run.incomplete-turn.test.ts | Adds coverage for strict-agentic blocked state, bullet-plan detection, and update_plan-as-non-progress behavior. |
| src/agents/openclaw-tools.update-plan.test.ts | Updates gating tests: default-off, strict-agentic GPT‑5 auto-enable, and planTool: false override. |
| src/agents/openclaw-tools.ts | Passes model provider/id and explicit agentId into update_plan gating and agent resolution. |
| src/agents/openclaw-tools.registration.ts | Changes update_plan gating API to depend on strict-agentic activation (and config override). |
| src/agents/agent-scope.ts | Adds execution contract resolution and strict-agentic activation predicate based on agent config + provider/model. |
| docs/gateway/configuration-reference.md | Documents updated tools.experimental.planTool defaults and precedence rules. |
| docs/.generated/plugin-sdk-api-baseline.sha256 | Updates generated baseline hashes. |
| docs/.generated/config-baseline.sha256 | Updates generated config baseline hashes. |

Comment thread src/agents/pi-embedded-runner/run/incomplete-turn.ts
@steipete steipete force-pushed the fix/gpt54-strict-agentic-contract branch from b2c0414 to 7c31b62 Compare April 10, 2026 21:55
@steipete steipete force-pushed the fix/gpt54-strict-agentic-contract branch from 7c31b62 to 03c38a0 Compare April 10, 2026 21:55
@steipete steipete merged commit e22f60f into openclaw:main Apr 10, 2026
8 checks passed
@steipete
Contributor

Landed via rebase merge. Thanks @100yenadmin.

Commits on main:

  • 09b1117271 agents: add strict-agentic execution contract
  • e22f60faea docs: note strict-agentic execution contract

Validation before merge:

  • pnpm check
  • pnpm config:schema:check
  • pnpm config:docs:check
  • pnpm plugin-sdk:api:check
  • pnpm build
  • OPENCLAW_LOCAL_CHECK=1 pnpm test src/agents/openclaw-tools.update-plan.test.ts src/agents/pi-embedded-runner/run.incomplete-turn.test.ts src/config/zod-schema.agent-defaults.test.ts
  • PR checks green: Labeler, Workflow Sanity, Install Smoke preflight

Contributor Author

100yenadmin commented Apr 10, 2026

Thanks @steipete @vincentkoc! 🖤 Let me know what you think of the rest of the GPT upgrades:

Merge order

  • PR A (#64241): merged
  • PR B (#64439): needs review
  • PR C (#64300): needs review
  • PR D (#64441): needs review

PR D is the proof layer. It should not block review of runtime-correctness slices.

Completion gate

The project is only complete when all of these are true 🚀

  • PR A, PR B, and PR C are merged and stable
  • PR D is merged and produces a readable parity report plus a machine-readable verdict
  • GPT-5.4 no longer stalls after planning
  • GPT-5.4 no longer fakes progress or tool completion
  • GPT-5.4 no longer gives false /elevated full guidance
  • replay/liveness failures are surfaced as explicit states, not silent disappearance
  • the parity gate shows GPT-5.4 matches or beats Opus 4.6 on the agreed metrics

@100yenadmin
Contributor Author

gotta crash out but gave you edit permissions on all these @steipete


Labels

agents Agent runtime and tooling docs Improvements or additions to documentation gateway Gateway runtime size: L

Development

Successfully merging this pull request may close these issues.

agents: add strict-agentic execution contract and revise update_plan semantics

3 participants