fix(agents): deduplicate user messages from model-fallback retries#52903

Closed
tyeth-ai-assisted wants to merge 3 commits into openclaw:main from tyeth-ai-assisted:fix/fallback-duplicate-user-messages

Conversation

@tyeth-ai-assisted

Summary

  • Model-fallback retries duplicate user messages in session JSONL (N providers = N copies per prompt)
  • Error assistant messages get stripped during API input construction, leaving consecutive duplicate user messages
  • Context grows at N× per heartbeat cycle, causing prompt processing timeouts and token waste

Two-layer fix

  1. session-manager-init.ts: Strip trailing orphaned user messages (after last assistant) before retry — prevents duplicates from being persisted
  2. openai-ws-stream.ts: Collapse consecutive user messages in convertMessagesToInputItems — safety net for historical duplicates already in JSONL
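The safety-net layer can be sketched as follows. This is a minimal illustration, not the actual `convertMessagesToInputItems` implementation: the `Msg` shape is a simplified stand-in for the real session message objects.

```typescript
// Hypothetical simplified message shape for illustration; the real code in
// openai-ws-stream.ts operates on richer session message objects.
type Msg = { role: "user" | "assistant"; content: string };

// Collapse each run of consecutive user messages down to its last entry,
// so N fallback-retry copies of the same prompt become a single message.
function collapseConsecutiveUserMessages(messages: Msg[]): Msg[] {
  const out: Msg[] = [];
  for (const msg of messages) {
    const prev = out[out.length - 1];
    if (msg.role === "user" && prev?.role === "user") {
      out[out.length - 1] = msg; // keep only the most recent copy in the run
    } else {
      out.push(msg);
    }
  }
  return out;
}
```

User messages separated by assistant responses are untouched, since only directly adjacent user messages form a run.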

Test plan

  • Added test: consecutive user messages from fallback retries are collapsed to one
  • Added test: user messages separated by assistant responses are preserved
  • Added test: consecutive but distinct user messages collapse to last (acceptable trade-off)
  • Manual: configure 3+ fallback providers where primary/secondary 429 — verify single user message in LLM context
  • Manual: verify heartbeat sessions don't accumulate duplicate messages over time

Root cause trace

```
runWithModelFallback (for each candidate)
→ runFallbackAttempt → runEmbeddedAttempt
→ activeSession.prompt(effectivePrompt) // writes user msg to session
→ model returns error → assistant(error) written
→ next candidate → same prompt() call → another user msg written
```

The existing orphan check at attempt.ts:2773 only catches consecutive user messages; the error assistant entries written between fallback attempts make the duplicates non-consecutive in the JSONL, so the check never fires.
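The session-manager-init layer can be illustrated like this. A minimal sketch under the same simplified message shape as above; names are hypothetical, not the actual session-manager-init.ts code.

```typescript
// Hypothetical simplified message shape for illustration.
type Msg = { role: "user" | "assistant"; content: string };

// Before retrying with the next fallback candidate, drop trailing user
// messages that follow the last assistant entry, so the next prompt() call
// does not persist a second copy of the same prompt.
function stripTrailingOrphanedUserMessages(messages: Msg[]): Msg[] {
  let end = messages.length;
  while (end > 0 && messages[end - 1].role === "user") end--;
  return messages.slice(0, end);
}
```

Running this before each retry prevents the duplicates from ever reaching the JSONL, which is why the openai-ws-stream collapse is only needed for historical sessions.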

Fixes #31101, #46005
Related: #39536

/cc @tyeth

🤖 Generated with Claude Code

When the primary model fails (429/rate limit) and OpenClaw falls back
through the candidate chain, each runEmbeddedAttempt call writes a new
user message to the session JSONL via activeSession.prompt(). The error
assistant messages between them get stripped during API input
construction (empty content), leaving N consecutive copies of the same
user message — one per provider in the fallback chain.

Two-layer fix:

1. session-manager-init: Strip trailing orphaned user messages before
   retry, preventing the duplicate from being persisted.

2. openai-ws-stream: Collapse consecutive user messages in
   convertMessagesToInputItems as a safety net for historical dupes.

Fixes openclaw#31101, openclaw#46005
Related: openclaw#39536
@openclaw-barnacle added the `agents` (Agent runtime and tooling) and `size: S` labels on Mar 23, 2026
Previous version collapsed ALL consecutive user messages. Now only
deduplicates when content is identical (fingerprint match), preserving
distinct multi-message sends.
@tyeth-ai-assisted
Author

Update: The HTTP path (openai-responses API used by LM Studio and other local providers) goes through `@mariozechner/pi-ai`'s `convertResponsesMessages`, not OpenClaw's `convertMessagesToInputItems`.

Companion PR for the HTTP path fix: earendil-works/pi#2547

Also updated the dedup logic to only collapse consecutive user messages with identical text content (fingerprint match), preserving distinct multi-message sends.
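The updated fingerprint-based dedup can be sketched as follows. This is an illustration of the described behavior, not the shipped code; the `fingerprint` helper is hypothetical (the real implementation may hash content rather than compare trimmed text).

```typescript
// Hypothetical simplified message shape for illustration.
type Msg = { role: "user" | "assistant"; content: string };

// Hypothetical fingerprint: trimmed text; a real version might hash instead.
const fingerprint = (m: Msg): string => m.content.trim();

// Drop a user message only when it directly follows another user message
// with an identical fingerprint, preserving distinct multi-message sends.
function dedupeIdenticalConsecutiveUserMessages(messages: Msg[]): Msg[] {
  const out: Msg[] = [];
  for (const msg of messages) {
    const prev = out[out.length - 1];
    if (
      msg.role === "user" &&
      prev?.role === "user" &&
      fingerprint(prev) === fingerprint(msg)
    ) {
      continue; // identical retry copy — skip it
    }
    out.push(msg);
  }
  return out;
}
```

Compared with collapsing all consecutive user messages, this only removes exact retry duplicates, so two different user messages sent back-to-back both survive.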

@clawsweeper
Contributor

clawsweeper Bot commented Apr 26, 2026

Closing this as duplicate or superseded after Codex automated review.

PR #52903 should close as superseded by #63696. The newer PR is the focused, non-draft vehicle for the same fallback-retry duplicate-user-message persistence bug, while #52903 is an older draft that also carries an unrelated fork release workflow. Current main still lacks the run-scoped suppression, so this is not an implemented-on-main close.

Best possible solution:

Close #52903 and keep the remaining implementation/review on #63696. If maintainers want the historical duplicate-input safety net from #52903, fold it into the current src/agents/openai-ws-message-conversion.ts path during #63696 review instead of reviving this older draft and its unrelated fork release workflow.

What I checked:

So I’m closing this here and keeping the remaining discussion on the canonical linked item.

Codex Review notes: model gpt-5.5, reasoning high; reviewed against 6cd047e7c270.

Labels

agents (Agent runtime and tooling), size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Telegram partial message spam loop when API rate limit / model fallback

1 participant