Skip to content

fix(compaction): guard malformed token estimation#63636

Open
GaosCode wants to merge 4 commits into
openclaw:mainfrom
GaosCode:fix/compaction-token-guard
Open

fix(compaction): guard malformed token estimation#63636
GaosCode wants to merge 4 commits into
openclaw:mainfrom
GaosCode:fix/compaction-token-guard

Conversation

@GaosCode

@GaosCode GaosCode commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Problem: long-lived main sessions could crash before provider dispatch when compaction token estimation hit malformed replay history and estimateTokens() read missing .length fields.
  • Why it matters: once a session contained one malformed history block, every later prompt attempt could fail in pre-prompt compaction, making the session effectively unrecoverable.
  • What changed: added a guarded estimateMessageTokens() path in src/agents/compaction.ts and switched preemptive compaction plus embedded compaction metrics/sanity checks to reuse it.
  • What did NOT change (scope boundary): this PR does not redesign replay-history normalization or patch @mariozechner/pi-coding-agent; it only hardens OpenClaw’s local compaction estimation path.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause (if applicable)

  • Root cause: compaction-side token estimation assumed replayed message blocks always had fully normalized shapes, but malformed assistant/toolResult blocks could still reach estimation and trigger unchecked .length reads.
  • Missing detection / guardrail: OpenClaw had replay sanitization and some downstream try/catch sites, but no shared safe estimator for all compaction-related estimateTokens() call sites.
  • Contributing context (if known): long-lived main sessions exercise pre-prompt compaction on every turn, so one malformed history block could repeatedly fail the recovery path itself.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/compaction.test.ts, src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts
  • Scenario the test should lock in: malformed assistant/toolResult history blocks do not throw during token estimation or pre-prompt compaction checks.
  • Why this is the smallest reliable guardrail: the crash happens in pure estimation logic before provider dispatch, so unit coverage at the estimation and precheck seam is enough to lock in the failure mode.
  • Existing test that already covers this (if any): none
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Long-lived sessions with malformed replay history now fail soft in compaction token estimation instead of crashing before reply generation.

Diagram (if applicable)

Before:
[new user turn] -> [pre-prompt compaction estimation] -> [throws on malformed block] -> [session cannot reply]

After:
[new user turn] -> [guarded token estimation] -> [invalid block counted as 0/safe fallback] -> [reply flow continues]

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: local Node 22+/pnpm workspace
  • Model/provider: N/A for repro; crash occurs before provider dispatch
  • Integration/channel (if any): embedded main session
  • Relevant config (redacted): default compaction path with long-lived session history

Steps

  1. Build or replay a session history containing malformed assistant/toolResult blocks.
  2. Trigger a new turn that runs pre-prompt compaction estimation.
  3. Observe the behavior before and after the patch.

Expected

  • Token estimation tolerates malformed blocks and the session continues.

Actual

  • Before this fix, estimation could throw Cannot read properties of undefined (reading 'length') and abort the reply before provider dispatch.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: ran pnpm test src/agents/compaction.test.ts, pnpm test src/agents/pi-embedded-runner/run/preemptive-compaction.test.ts, and pnpm check.
  • Edge cases checked: malformed assistant content entries, missing assistant content arrays, malformed toolResult content during pre-prompt estimation.
  • What you did not verify: no live reproduction against a real damaged long-lived session transcript.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps:

Risks and Mitigations

  • Risk: local fallback token estimation may slightly differ from upstream estimateTokens() for malformed messages.
    • Mitigation: fallback is only used on malformed inputs where the previous behavior was to throw; valid messages still use upstream estimation first.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S labels Apr 9, 2026
@greptile-apps

greptile-apps Bot commented Apr 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces estimateMessageTokens() in src/agents/compaction.ts as a guarded wrapper around the upstream piEstimateTokens(): it tries the upstream estimator first and falls back to a null-safe character-counting path when the upstream throws on malformed assistant/toolResult blocks. All compaction call sites (compact.ts, preemptive-compaction.ts) are switched to the new wrapper so that malformed replay history no longer crashes pre-prompt estimation.

Confidence Score: 5/5

Safe to merge; the fix is well-scoped, the fallback path is correct, and tests cover the targeted failure modes.

All findings are P2. The only notable point is that the try/catch in summarizeCompactionMessages is now functionally dead after switching to the non-throwing estimateMessageTokens, leaving tokenEstimationFailed unreachable. This is a cleanup/clarity issue, not a bug.

src/agents/pi-embedded-runner/compact.ts — dead tokenEstimationFailed guard in summarizeCompactionMessages worth cleaning up.

Vulnerabilities

No security concerns identified. The toolResult.details stripping on the estimation path was already present and is preserved correctly; the new guarded path does not introduce new data exposure or execution surfaces.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/agents/pi-embedded-runner/compact.ts
Line: 329-334

Comment:
**Dead catch branch after switch to `estimateMessageTokens`**

`estimateMessageTokens` now absorbs all errors internally and returns a numeric fallback, so the `catch` block here can no longer be reached. `tokenEstimationFailed` will always stay `false`, meaning `estTokens` in the returned object will always be a number (possibly `0`-padded for malformed messages) rather than `undefined`. The existing behavior of surfacing `undefined` on estimation failure is silently gone.

If the intent is to preserve the "return `undefined` when estimation is unreliable" contract for callers of `summarizeCompactionMessages`, consider either keeping one external try/catch around the guarded call or changing the return type to always be `number`. If the new always-a-number behavior is intentional, the `tokenEstimationFailed` variable and the ternary in the return can be removed.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(compaction): guard malformed token e..." | Re-trigger Greptile

Comment thread src/agents/pi-embedded-runner/compact.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c5a7008237

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated
Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6b9362efba

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b1565bafbd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated
@GaosCode GaosCode force-pushed the fix/compaction-token-guard branch from c463a8f to 83f468b Compare April 10, 2026 05:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 83f468bc88

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a8de51ae7d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated
@GaosCode GaosCode force-pushed the fix/compaction-token-guard branch from 9c2e902 to aa179b2 Compare April 11, 2026 06:44

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa179b2410

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts
Comment thread src/agents/compaction.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a89bd1e0c3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5902dfce4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts
@GaosCode GaosCode force-pushed the fix/compaction-token-guard branch from d5902df to 050272b Compare April 11, 2026 07:38

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 050272b15d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 674baac000

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 63005a39d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 107c939667

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/compaction.ts
@clawsweeper

clawsweeper Bot commented Apr 29, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed June 5, 2026, 1:02 AM ET / 05:02 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +225, Tests +259. Total +484 across 7 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.
Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e0018382eb00.

Label changes

Label changes:

  • remove P1: Current review triage priority is none.
  • remove merge-risk: 🚨 security-boundary: Current PR review selected no merge-risk labels.
  • remove merge-risk: 🚨 availability: Current PR review selected no merge-risk labels.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

PR surface:

Source +225, Tests +259. Total +484 across 7 files.

View PR surface stats
Area Files Added Removed Net
Source 4 244 19 +225
Tests 3 260 1 +259
Docs 0 0 0 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 7 504 20 +484

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. labels May 19, 2026
@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 19, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@openclaw-barnacle

Copy link
Copy Markdown

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Jun 4, 2026
@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels Jun 4, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. size: L triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Main session prompt crash: Cannot read properties of undefined (reading 'length') in compaction token estimation

1 participant