[Bug]: History compaction strips all user/assistant turns on Qwen, leaving messages[] with only system → upstream 400 #86145

@dcstrange

Summary

When OpenClaw's history compaction runs on long conversations with Qwen models (Qwen-Long, Qwen-3-Coder-Plus), the outbound messages[] sometimes ends up with no role: user or role: assistant entries — only role: system (or only role: tool) survives.

Qwen upstream rejects with HTTP 400:

  • Role must be user or assistant and Content length must be greater than 0
  • Role specification invalid

In deployments that route through a proxy that wraps 4xx responses, the user-facing symptom is a generic stream-failed message that hides the real error. Without proxy wrapping, the operator sees the 400 directly. Either way, the outbound request itself is invalid.

Environment

  • OpenClaw: 2026.4.2 (planning to upgrade to 2026.5.22)
  • Models affected: Qwen-Long (10M context), Qwen-3-Coder-Plus (1M context)
  • Provider type: openai-completions compatible
  • Trigger threshold: ~20K tokens of history before LCM compaction fires

Repro pattern

  1. Start a session with Qwen-Long or Qwen-3-Coder-Plus
  2. Run 15-20+ turns with mixed tool use + direct chat to push past the softThreshold
  3. Wait for LCM compaction
  4. The next user turn → outbound request has messages: [ { role: 'system', ... } ] (and occasionally a trailing role: tool block), with every prior user/assistant turn gone
  5. Qwen 400 → request fails

Why this looks distinct from already-tracked compaction issues

| Existing issue | What it covers | Why this is different |
| --- | --- | --- |
| #32759 (closed) | "no conversation messages at all" guard, returns 400 vs graceful skip | This produces a message array that has some entries, but none of them user/assistant. The guard at #32759 doesn't fire. |
| #69756 (closed) | "chat.history omitted: message too large" replacement | That replacement preserves placeholder text; our case strips entries entirely. |
| #74073 (open) | auto-compaction race for readSubagentOutput | Race-condition focus, single-call; our case is the steady-state result of compaction summarization. |
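
To illustrate the gap relative to #32759, here is a minimal TypeScript sketch. It assumes the existing guard amounts to an "is the array empty?" check, which is a simplification on our part; we have not read its implementation. The names `ChatMessage`, `hasAnyMessages`, and `hasConversationalTurn` are hypothetical and not taken from the OpenClaw codebase.

```typescript
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Hypothetical stand-in for an emptiness-style guard (assumption, not OpenClaw code).
function hasAnyMessages(messages: ChatMessage[]): boolean {
  return messages.length > 0;
}

// Stricter check: at least one user or assistant entry with non-empty content,
// mirroring the wording of the upstream 400.
function hasConversationalTurn(messages: ChatMessage[]): boolean {
  return messages.some(
    (m) =>
      (m.role === "user" || m.role === "assistant") &&
      m.content.trim().length > 0,
  );
}

// Shape described in the repro: a system entry plus an occasional trailing tool block.
const compacted: ChatMessage[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "tool", content: '{"ok":true}' },
];

console.log(hasAnyMessages(compacted)); // true: an emptiness guard lets this through
console.log(hasConversationalTurn(compacted)); // false: the payload upstream rejects
```

An array like `compacted` passes the first check and fails the second, which matches what we see: the request is sent and Qwen returns the 400.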

Suggested fix direction

After compaction transforms messages[], the serializer should validate that at least one entry has role: 'user' or role: 'assistant' before transmitting. If not, either:

  • (a) Re-run compaction with a less-aggressive ratio, preserving the most recent user/assistant pair verbatim, or
  • (b) Fall back to a user-visible "history exhausted, please start a new session" error rather than letting upstream return an opaque 400

Option (a) is closer to graceful UX, option (b) is a clearer signal to the operator.
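
A rough TypeScript sketch of how the two options could combine into one post-compaction check. Everything here is illustrative: `ensureConversationalTurn`, `HistoryExhaustedError`, and the `ChatMessage` shape are hypothetical names, and "most recent pair" is approximated as the last two non-empty user/assistant entries from the pre-compaction history. It skips re-running compaction and only restores the recent turns, and it leaves open how trailing tool entries should be ordered.

```typescript
type Role = "system" | "user" | "assistant" | "tool";

interface ChatMessage {
  role: Role;
  content: string;
}

// Option (b): explicit, user-visible failure instead of an opaque upstream 400.
class HistoryExhaustedError extends Error {
  constructor() {
    super("History exhausted, please start a new session.");
    this.name = "HistoryExhaustedError";
  }
}

function isConversational(m: ChatMessage): boolean {
  return (
    (m.role === "user" || m.role === "assistant") &&
    m.content.trim().length > 0
  );
}

// Validate the compacted array before it is serialized and sent.
function ensureConversationalTurn(
  original: ChatMessage[],
  compacted: ChatMessage[],
): ChatMessage[] {
  if (compacted.some(isConversational)) {
    return compacted; // already valid, nothing to do
  }

  // Option (a), simplified: recover the last two user/assistant entries verbatim.
  const recovered = original.filter(isConversational).slice(-2);
  if (recovered.length === 0) {
    throw new HistoryExhaustedError(); // option (b)
  }

  // Insert the recovered turns right after any leading system entries.
  let head = 0;
  while (head < compacted.length && compacted[head]?.role === "system") {
    head++;
  }
  return [
    ...compacted.slice(0, head),
    ...recovered,
    ...compacted.slice(head),
  ];
}
```

Called with the pre-compaction history and the compacted array, this returns the compacted array unchanged when it is already valid, and otherwise either restores the most recent turns or throws so the caller can surface a clear message.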

Offered

Happy to provide sanitized session jsonls + outbound request bodies if useful for a regression test. We hit this enough times in production that we can produce captures on demand.

Labels

  • P1 - High-priority user-facing bug, regression, or broken workflow.
  • clawsweeper:needs-live-repro - ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🐚 platinum hermit - Good issue quality with a plausible reproduction path needing some confirmation.
