Skip to content

fix(compaction): guard malformed token estimation#2207

Open
BingqingLyu wants to merge 4 commits into
mainfrom
fork-pr-63636-fix-compaction-token-guard
Open

fix(compaction): guard malformed token estimation#2207
BingqingLyu wants to merge 4 commits into
mainfrom
fork-pr-63636-fix-compaction-token-guard

Conversation

@BingqingLyu

@BingqingLyu BingqingLyu commented Apr 28, 2026

Copy link
Copy Markdown
Owner

Summary

  • Problem: long-lived main sessions could crash before provider dispatch when compaction token estimation hit malformed replay history and estimateTokens() read missing .length fields.
  • Why it matters: once a session contained one malformed history block, every later prompt attempt could fail in pre-prompt compaction, making the session effectively unrecoverable.
  • What changed: added a guarded estimateMessageTokens() path in src/agents/compaction.ts and switched preemptive compaction plus embedded compaction metrics/sanity checks to reuse it.
  • What did NOT change (scope boundary): this PR does not redesign replay-history normalization or patch @mariozechner/pi-coding-agent; it only hardens OpenClaw’s local compaction estimation path.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: compaction-side token estimation assumed replayed message blocks always had fully normalized shapes, but malformed assistant/toolResult blocks could still reach estimation and trigger unchecked .length reads.
  • Missing detection / guardrail: OpenClaw had replay sanitization and some downstream try/catch sites, but no shared safe estimator for all compaction-related estimateTokens() call sites.
  • Contributing context (if known): long-lived main sessions exercise pre-prompt compaction on every turn, so one malformed history block could repeatedly fail the recovery path itself.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/compaction.test.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
  • Scenario the test should lock in: malformed assistant/toolResult history blocks do not throw during token estimation or pre-prompt compaction checks.
  • Why this is the smallest reliable guardrail: the crash happens in pure estimation logic before provider dispatch, so unit coverage at the estimation and precheck seam is enough to lock in the failure mode.
  • Existing test that already covers this (if any): none
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Long-lived sessions with malformed replay history now fail soft in compaction token estimation instead of crashing before reply generation.

Diagram (if applicable)

Before:
[new user turn] -> [pre-prompt compaction estimation] -> [throws on malformed block] -> [session cannot reply]

After:
[new user turn] -> [guarded token estimation] -> [invalid block counted as 0/safe fallback] -> [reply flow continues]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node 22+/pnpm workspace
  • Model/provider: N/A for repro; crash occurs before provider dispatch
  • Integration/channel (if any): embedded main session
  • Relevant config (redacted): default compaction path with long-lived session history

Steps

  1. Build or replay a session history containing malformed assistant/toolResult blocks.
  2. Trigger a new turn that runs pre-prompt compaction estimation.
  3. Observe the behavior before and after the patch.

Expected

  • Token estimation tolerates malformed blocks and the session continues.

Actual

  • Before this fix, estimation could throw Cannot read properties of undefined (reading 'length') and abort the reply before provider dispatch.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran pnpm test src/agents/compaction.test.ts, pnpm test src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts, and pnpm check.
  • Edge cases checked: malformed assistant content entries, missing assistant content arrays, malformed toolResult content during pre-prompt estimation.
  • What you did not verify: no live reproduction against a real damaged long-lived session transcript.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: local fallback token estimation may slightly differ from upstream estimateTokens() for malformed messages.
    • Mitigation: fallback is only used on malformed inputs where the previous behavior was to throw; valid messages still use upstream estimation first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Main session prompt crash: Cannot read properties of undefined (reading 'length') in compaction token estimation

2 participants