fix(agents): keep state.messages intact across z.ai-style provider turns in embedded runs #76056
Conversation
Codex review: needs changes before merge.

Summary
- Reproducibility: yes.
- Next step before merge: add the still-missing changelog entry.
- Security Review findings:

Review details
- Best possible solution: Land the z.ai/GLM auto-compaction guard and post-reload reapply pattern with the focused tests, after adding one active changelog Fixes bullet.
- Do we have a high-confidence way to reproduce the issue? Yes. The linked bug gives a concrete two-turn Telegram/OpenAI-compatible gateway reproduction, and Pi 0.71.1 source confirms the pre-prompt compaction path can replace `state.messages` before the provider call.
- Is this the best way to solve the issue? No in its current form. The runtime approach is the narrow, maintainable fix for the confirmed z.ai/GLM silent-overflow path, but the required changelog entry is still missing.

Full review comments:
- Overall correctness: patch is incorrect
- Acceptance criteria:
- What I checked:
- Likely related people:

Codex review notes: model gpt-5.5, reasoning high; reviewed against 1d5c77c4439d.
Force-pushed from 8ce3cad to 0a52294.
martingarramon left a comment:
Solid root-cause trace, and the `applyPiAutoCompactionGuard` extension reads cleanly. Tests at `pi-settings.test.ts` lock the deliberate broad-match on bare `glm-` (provider: "openai" / "openrouter" / "custom" all flag TRUE) so intent is unambiguous.
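For readers skimming the thread, the broad-match intent those tests lock in can be sketched in a few lines. All names below are illustrative stand-ins, not the real `pi-settings` exports:

```typescript
// Illustrative stand-in for the deliberate broad match: a bare "glm-" model-id
// prefix flags the model as silent-overflow-prone regardless of the provider
// label. Hypothetical names; the real helper lives in src/agents/pi-settings.ts.
interface ModelRef {
  provider: string;
  modelId: string;
}

function matchesGlmFamily(ref: ModelRef): boolean {
  // Strip an optional routing prefix such as "z-ai/" or "openrouter/z-ai/"
  // before checking the bare model family.
  const bare = ref.modelId.split("/").pop() ?? ref.modelId;
  return bare.startsWith("glm-");
}

// The provider label is deliberately ignored for this signal:
for (const provider of ["openai", "openrouter", "custom"]) {
  console.log(provider, matchesGlmFamily({ provider, modelId: "glm-4.5" }));
}
```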
Two non-blocking questions:

1. Asymmetric call count. `compact.ts:977` calls `applyPiAutoCompactionGuard` once (post-reload), but `attempt.ts` calls it twice (`:1473` pre-reload, `:1509` post-reload). Intentional because the compaction LLM session can't trigger Pi's `_checkCompaction` in the pre-reload window? If so, a one-line comment at `compact.ts:976` would lock the asymmetry.

2. Auth-profile drift on `compact.ts:984`. `baseUrl: effectiveModel.baseUrl` reflects auth-profile-injected URLs (per the doc comment), but `provider`/`modelId` are bare scope vars. If auth profiles can rewrite those labels too, the silent-overflow class slips past — sanity-check?

Both nits, no blockers.
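The reload question above comes down to a small invariant: any pre-reload disable is lost when settings rehydrate from disk. A toy model of that invariant, with hypothetical stand-ins for the real loader and guard:

```typescript
// Hypothetical stand-ins illustrating why the guard must run again after a
// settings reload: reload() rehydrates from disk and can silently re-enable
// auto-compaction, so a pre-reload disable alone is not enough.
class FakeSettings {
  compactionEnabled = true;
  setCompactionEnabled(v: boolean): void {
    this.compactionEnabled = v;
  }
  reload(): void {
    this.compactionEnabled = true; // rehydrate from on-disk defaults
  }
}

function applyGuard(s: FakeSettings, silentOverflowProne: boolean): void {
  if (silentOverflowProne) s.setCompactionEnabled(false);
}

const s = new FakeSettings();
applyGuard(s, true); // pre-reload: disabled
s.reload();          // rehydration re-enables compaction
applyGuard(s, true); // post-reload reapply restores the disable
```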
Thanks for the careful read — both worth being explicit about.

Asymmetry. In both files the Pi

Auth-profile drift. Fair concern — sanity-checked end-to-end, and the signals hold. The auth-profile path can inject So
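One way to picture the baseUrl signal that keeps auth-profile injection visible to the detector (a sketch under assumed names, not the real detector):

```typescript
// Hypothetical sketch: detection reads the auth-profile-adjusted baseUrl, so a
// z.ai endpoint injected by an auth profile is still caught even when the
// provider/model labels stay generic. Names here are illustrative only.
interface EffectiveModel {
  provider: string;
  modelId: string;
  baseUrl?: string;
}

function baseUrlLooksLikeZai(baseUrl?: string): boolean {
  if (!baseUrl) return false;
  try {
    return new URL(baseUrl).hostname === "api.z.ai";
  } catch {
    return false; // not a parseable URL
  }
}

// Auth profile rewrote only the URL, not the labels:
const effective: EffectiveModel = {
  provider: "custom",
  modelId: "in-house-gateway-model",
  baseUrl: "https://api.z.ai/api/coding/paas/v4",
};
```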
Force-pushed from b791baf to ef305bb.
Merged via squash.
Thanks @openperf!
Merged as cc8a8f1. Changelog entry added under
…rns in embedded runs (openclaw#76056)

Merged via squash. Prepared head SHA: ef305bb

Co-authored-by: openperf <80630709+openperf@users.noreply.github.com>
Reviewed-by: @openperf
Summary
With z.ai-style models (`z-ai/glm-*`), conversational context is lost on consecutive turns. The HTTP request body that reaches the provider contains only the system prompt and the just-submitted user message, even though the runner's `prompt.submitted` trajectory event captured the full prior transcript with `messagesLen` matching the number of stored messages. Pronouns lose their referent, and continuation requests can ship a bare tool result without its preceding assistant tool call.

Root cause: `Session.prompt()` runs `_checkCompaction(lastAssistant, false)` before it appends the new user message. For z.ai-style providers, pi-ai's `isContextOverflow` (Case 2 — "Silent overflow z.ai style") returns `true` whenever the last successful assistant turn has `usage.input + usage.cacheRead > contextWindow`, even with `stopReason: "stop"`. That triggers Pi's `_runAutoCompaction("overflow", true)`, which reassigns `agent.state.messages` to the post-compaction view — typically a single compaction summary, sometimes empty. When `agent.prompt(messages)` then runs the agent loop, the snapshot of `state.messages` is the now-tiny array, so `streamFn` ships a near-empty transcript to the provider. The runner's `prompt.submitted` event already deep-cloned `state.messages` via `safeJsonStringify` before `_checkCompaction` mutated it, so the trajectory still shows the pre-compaction count — making the trajectory look correct while the actual provider call is starved of context.

OpenClaw's runner already drives compaction itself (`shouldPreemptivelyCompactBeforePrompt` and the tool-result-context-guard mid-turn precheck), so Pi's parallel auto-compaction inside `Session.prompt()` is redundant for the failure cases this fix targets. The fix extends `applyPiAutoCompactionGuard` so `setCompactionEnabled(false)` also fires when the active model matches z.ai-style silent-overflow accounting.
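The accounting described above reduces to a small predicate. This is a sketch of the behavior as described in this PR, not pi-ai's actual source; field names are illustrative:

```typescript
// Sketch of "silent overflow, z.ai style": the check fires on token counts
// from the last successful assistant turn, even when that turn ended with a
// normal stopReason of "stop".
interface TurnUsage {
  input: number;
  cacheRead: number;
}

function isSilentOverflow(
  usage: TurnUsage,
  contextWindow: number,
  stopReason: string,
): boolean {
  // stopReason is deliberately not consulted: a clean "stop" still counts as
  // overflow under this accounting, which is what makes it "silent".
  void stopReason;
  return usage.input + usage.cacheRead > contextWindow;
}
```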
Detection routes through structured channels already present in the codebase — the `ProviderEndpointClass` typed enum (matching `endpointClass === "zai-native"` for direct `api.z.ai`), the existing `normalizeProviderId` normalizer (matching the `zai` family for a config-set z.ai provider), and a `z-ai/` or `openrouter/z-ai/` model-id prefix as a documented string fallback for openrouter routing — any one signal is sufficient. The compaction-LLM call site reads `effectiveModel.baseUrl` (the post-`applyAuthHeaderOverride` view that matches every other reference in that scope), so an `api.z.ai` baseUrl injected via auth profile is still visible to the detector. The existing `contextEngineInfo.ownsCompaction === true` short-circuit is preserved unchanged. Default-mode runs against non-z.ai providers are not touched — Pi's auto-compaction stays on for them, preserving the existing baseline. To survive `DefaultResourceLoader.reload()` rehydrating settings from disk and silently re-enabling auto-compaction (the same rehydration the existing `applyPiCompactionSettingsFromConfig` re-call sites already document), the guard is invoked a second time after the reload in both the embedded runner and the runner-driven compaction LLM session, mirroring the pattern already used for compaction config rehydration.

Changes:
- `src/agents/pi-settings.ts`: new exported `isSilentOverflowProneModel({ provider, modelId, baseUrl })` helper using `resolveProviderEndpoint`'s `endpointClass`, `normalizeProviderId`, and the `z-ai/`/`openrouter/z-ai/` model-id prefix. `shouldDisablePiAutoCompaction` and `applyPiAutoCompactionGuard` accept a new `silentOverflowProneProvider?: boolean` in addition to the existing `contextEngineInfo`.
- `src/agents/pi-embedded-runner/run/attempt.ts`: the existing `applyPiAutoCompactionGuard` call now computes `silentOverflowProneProvider` from `params.model`. A second `applyPiAutoCompactionGuard` call is added immediately after `resourceLoader.reload()`, alongside the existing post-reload `applyPiCompactionSettingsFromConfig` re-call.
- `src/agents/pi-embedded-runner/compact.ts`: imports `applyPiAutoCompactionGuard` and `isSilentOverflowProneModel`; adds a guard call after the local `resourceLoader.reload()`. This site previously had no guard call at all, leaving the nested compaction LLM session vulnerable to the same `_checkCompaction` re-entry on z.ai-style runs.
- `src/agents/pi-settings.test.ts`: nine new cases appended to the existing colocated test file. Five cases for `isSilentOverflowProneModel` (z-ai-prefixed model ids in both bare and qualified forms, config-set `z.ai`/`z-ai` provider, direct `api.z.ai` baseUrl, anthropic/openai/google routes negative, missing fields negative) and four cases for `applyPiAutoCompactionGuard` (silent-overflow-prone disables, `ownsCompaction === true` still disables, default-mode + non-z.ai is left enabled, missing `setCompactionEnabled` reports unsupported).

Not changed:
- `pi-coding-agent` and `pi-ai` are untouched. Their compaction and overflow-detection logic is correct in isolation; this fix only stops them from firing in parallel with OpenClaw's runner-owned compaction in the cases listed above. Non-z.ai runs keep Pi auto-compaction inside `Session.prompt()` as before.
- No changes to `shouldPreemptivelyCompactBeforePrompt`, the tool-result-context-guard mid-turn precheck, context-engine assemble, the trajectory event's snapshot timing, or any provider transport.
- No changes to the `/compact` entrypoint surface — its nested runner-driven compaction LLM session is protected through the `compact.ts` site this PR modifies.
- No new `any` types introduced.

Reproduction
Setup:
- `LegacyContextEngine` (default).
- `provider: openrouter` with `model: z-ai/glm-5.1` via a custom `baseUrl` for an in-house model gateway. Anything where the previous successful assistant turn has `usage.input + usage.cacheRead > model.contextWindow` works (z.ai-family accounting is documented as silent-overflow style in pi-ai's overflow.ts).

Steps:
1. Turn one: "first off - there's a folder full of my projects at /root/code - these all live in github. They're my main focus". The agent calls a tool, observes `/root/code`, and replies.
2. Turn two: "just note where they are - remember that. They'll be needed in future".

Before the fix: the `prompt.submitted` event records the full pre-existing transcript, yet the agent's reply ignores the earlier `/root/code` context and asks what "they" refers to.

After the fix: `applyPiAutoCompactionGuard` sets `setCompactionEnabled(false)` for z.ai-style runs both before and after `resourceLoader.reload()`, Pi's `_checkCompaction` short-circuits inside `Session.prompt()`, `state.messages` keeps its prior contents, and the model receives the full transcript. Coreference resolves correctly.

Targeted test: `pnpm test src/agents/pi-settings.test.ts` — 25/25 pass (16 existing + 9 new), including the explicit silent-overflow-prone disable regression case for #75799 and the explicit baseline-preservation case for default-mode + non-z.ai providers.

Risk / Mitigation
- Previously two compaction paths could fire for the same run (Pi's `_runAutoCompaction` and OpenClaw's `shouldPreemptivelyCompactBeforePrompt`). After this PR only OpenClaw's path runs for z.ai.
- Detection drift: `isSilentOverflowProneModel` uses three signals (typed `endpointClass` enum, normalized provider id, model-id prefix). A future provider rename or an unusual proxy that re-exposes z.ai under a different model-id namespace could miss detection. Mitigation: detection also reads `effectiveModel.baseUrl`.
- Double invocation: if `applyPiAutoCompactionGuard` ever gains expensive side effects, those sites will execute twice per run. Mitigation: `applyPiAutoCompactionGuard` is a small decision plus at most one `setCompactionEnabled(false)` call; that write is idempotent (sets the same field to the same value when already disabled). The two `applyPiCompactionSettingsFromConfig` calls already follow the same "before + after reload" pattern in the same files.

Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Fixes #75799