fix: prevent Bedrock empty-content rejection from dedup gaps and thinking-only messages#506

Merged
jalehman merged 2 commits into
Martian-Engineering:mainfrom
wujiaming88:fix/bedrock-empty-content-dedup-and-thinking-filter
Apr 28, 2026

@wujiaming88

Contributor

Problem

With Bedrock models (e.g. Claude Opus 4), lossless-claw triggers 'content field is empty' API errors. There are two independent root causes:

1. deduplicateAfterTurnBatch returns full batch when storedCount > batchLen

After compaction shrinks the session transcript, the stored message count can exceed the incoming batch length. The current code handles that case in this branch:

if (storedMessageCount === 0 || storedMessageCount > batch.length) {
  return batch;  // ← returns full batch, causing duplicate ingestion
}

This causes previously-stored messages to be re-ingested, leading to duplicate entries that can produce empty content arrays after dedup reconciliation downstream.

2. Thinking-only assistant messages pass the empty-content filter

Assistant messages containing only thinking/redacted_thinking/reasoning blocks have content.length > 0, so they pass the existing empty-content filter in the assembler. The provider layer then strips all thinking blocks before sending to the API, which leaves an empty content array that Bedrock rejects.

Fix

deduplicateAfterTurnBatch (engine.ts)

  • When storedCount > batchLen: attempt tail-matching (entire batch already stored → return []) and suffix-matching (partial overlap → return only new messages) before falling back to full ingestion
  • When prefix alignment fails: attempt suffix fallback instead of immediately returning the full batch
  • Added structured log messages for each dedup path ([lcm] dedup: ...)
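
A minimal sketch of the tail/suffix matching described above, not the code from engine.ts. The `Msg` shape, the `dedupeBatch` and `overlapLength` names, and the JSON-based equality check are all assumptions made for illustration; the real implementation may identify messages differently and keeps a separate prefix-alignment path.

```typescript
// Hypothetical message shape; the real engine.ts types are not shown in this PR.
type Msg = { role: string; content: string };

// Length of the longest suffix of `stored` that equals a prefix of `batch`.
// Equality via JSON.stringify is an assumption that works for this simple shape.
function overlapLength(stored: Msg[], batch: Msg[]): number {
  const max = Math.min(stored.length, batch.length);
  for (let len = max; len > 0; len--) {
    const tail = JSON.stringify(stored.slice(-len));
    const head = JSON.stringify(batch.slice(0, len));
    if (tail === head) return len;
  }
  return 0;
}

// Returns only the messages in `batch` that are not already stored.
function dedupeBatch(stored: Msg[], batch: Msg[]): Msg[] {
  if (stored.length === 0) return batch;
  const overlap = overlapLength(stored, batch);
  if (overlap === batch.length) return []; // tail match: entire batch already stored
  if (overlap > 0) return batch.slice(overlap); // suffix match: keep only new messages
  return batch; // no overlap found: fall back to full ingestion
}
```

With four stored messages a, b, c, d, a batch of c, d yields nothing to ingest, a batch of d, e yields only e, and an unrelated batch is returned whole.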

Thinking-only filter (assembler.ts)

  • Added isThinkingOnlyContent() helper that detects content arrays where every block is a thinking-like type (thinking, redacted_thinking, reasoning)
  • Extended the cleanedEntries filter to strip these messages alongside empty-content messages
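
For reviewers, a rough sketch of what such a helper could look like. Only the name `isThinkingOnlyContent` and the three block types come from this PR; the `Entry` shape and the `stripUnsendable` wrapper are hypothetical, and the actual code in assembler.ts may differ.

```typescript
// Block types the provider layer is described as stripping before the API call.
const THINKING_TYPES = new Set(["thinking", "redacted_thinking", "reasoning"]);

// True when content is a non-empty array whose blocks are all thinking-like.
function isThinkingOnlyContent(content: unknown): boolean {
  if (!Array.isArray(content) || content.length === 0) return false;
  return content.every((b) => {
    if (typeof b !== "object" || b === null) return false;
    const t = (b as { type?: unknown }).type;
    return typeof t === "string" && THINKING_TYPES.has(t);
  });
}

// Hypothetical entry shape and filter, standing in for the cleanedEntries step.
type Entry = { role: string; content: unknown };

function stripUnsendable(entries: Entry[]): Entry[] {
  return entries.filter((e) => {
    if (e.role !== "assistant") return true;
    const empty = Array.isArray(e.content) && e.content.length === 0;
    return !empty && !isThinkingOnlyContent(e.content);
  });
}
```

A message that mixes a thinking block with a text block is kept, since it still has content after thinking blocks are stripped.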

Testing

  • Production hotpatch running since 2026-04-26 on openclaw@2026.4.24 with Bedrock Claude Opus 4: no recurrence of the 'content field is empty' error since patching
  • Build passes (npm run build)
  • No new TypeScript errors introduced (same error count as main)

Related

…king-only messages

Two fixes for the Bedrock 'content field is empty' error:

1. **deduplicateAfterTurnBatch tail/suffix matching** (engine.ts)
   When storedCount > batchLen (e.g. after compaction), the old code
   returned the full batch unchanged, causing duplicate ingestion. The
   new code attempts tail-matching (entire batch already stored) and
   suffix-matching (partial overlap) before falling back to full
   ingestion. Also adds suffix fallback for the existing prefix-mismatch
   path.

2. **Strip thinking-only assistant messages** (assembler.ts)
   Assistant messages containing only thinking/redacted_thinking/reasoning
   blocks pass the existing empty-content filter (content.length > 0) but
   become empty after the provider layer strips thinking blocks. Add
   isThinkingOnlyContent() to detect and filter these before they reach
   Bedrock.

Relates to: PR Martian-Engineering#498 (previously closed)
Tested: production hotpatch running since 2026-04-26 on openclaw@2026.4.24
with Bedrock Claude Opus 4 — zero recurrence of the error.
@jalehman jalehman self-assigned this Apr 28, 2026
@jalehman jalehman merged commit 2f7b917 into Martian-Engineering:main Apr 28, 2026
1 check passed
@github-actions github-actions Bot mentioned this pull request Apr 28, 2026
jalehman added a commit that referenced this pull request May 3, 2026
)

* fix: hotfix v0.9.3 prefill safety + provider redirects

Three regressions introduced inside v0.9.3 itself, each a self-contained
hunk:

* assemble: restore PR #504 reference-inequality contract.  The
  no-user-turn fallback added by PR #502 returned `params.messages` by
  reference, defeating the `installContextEngineLoopHook` identity check
  installed by PR #504 (`assembled.messages !== sourceMessages`).  The
  hook then fell through to raw sourceMessages (still ending with
  assistant), re-introducing the prefill-rejection bug fixed by
  safeFallback in the other early-return paths.  Use safeFallback() here
  too.  Closes #559 (sub-fix A1).

* assembler: strip assistant messages whose only blocks are blank text
  (`[{type:"text", text:""}]`).  PR #506 added isThinkingOnlyContent for
  the Bedrock empty-content case, but blank-text blocks pass that filter
  and Bedrock still rejects with `The text field in the ContentBlock
  object at messages.N.content.0 is blank`.  New isBlankContent helper
  mirrors the thinking-only shape and is added to the cleanedEntries
  filter.  Closes #559 (sub-fix B5).

* plugin: respect explicit `https://api.openai.com/v1` for openai-codex.
  PR #549's OPENAI_CODEX_NATIVE_BASE_URLS would rewrite an
  explicitly-configured api.openai.com/v1 to chatgpt.com/backend-api/codex
  whenever api was openai-codex-responses.  That broke paid OpenAI
  API-key users who chose the endpoint deliberately — their key cannot
  authenticate against the ChatGPT backend.
  shouldUseNativeCodexBaseUrl now accepts an isExplicitlyConfigured
  flag (set when configuredBaseUrl is non-undefined) and skips the
  rewrite for that case; the native rewrite still applies when baseUrl
  is empty or already a ChatGPT Codex variant.  Closes #560 (sub-fix
  A2).

* plugin: remove silent ollama localhost fallback.  PR #546's
  `inferBaseUrlFromProvider['ollama']` mapping silently routed cloud
  ollama configs (`https://ollama.com`, per #480) to localhost.  Drop
  the entry so any ollama user must configure an explicit baseUrl;
  failing with an empty baseUrl is loudly diagnosable, while a wrong
  localhost connect is not.  Closes #560 (sub-fix A3).
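
The blank-text check in the assembler item above might look roughly like the following. This is a sketch under assumptions, not the committed code: only the `isBlankContent` name and the `[{type:"text", text:""}]` shape come from the commit message, and treating whitespace-only text as blank is a guess.

```typescript
// True when content is a non-empty array in which every block is a text block
// with no visible text. Whitespace-only text counts as blank here (assumption).
function isBlankContent(content: unknown): boolean {
  if (!Array.isArray(content) || content.length === 0) return false;
  return content.every((b) => {
    if (typeof b !== "object" || b === null) return false;
    const block = b as { type?: unknown; text?: unknown };
    return (
      block.type === "text" &&
      typeof block.text === "string" &&
      block.text.trim() === ""
    );
  });
}
```

A single non-blank block is enough for the message to be kept.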

Sub-fix A4 (#561's bootstrap raw-hash anchor) was deferred from this
PR after a deeper audit: the original recommendation conflicts with
the heartbeat-prune append-only fast path because the same
`lastProcessedEntryHash` field serves both DB-tail integrity (post-prune,
post-externalization) and raw-transcript checkpoint roles.  Will
be addressed alongside the bootstrap dedup migration in a follow-up.

Tests updated for each sub-fix:
- engine.test.ts: assembler regression test for blank-text blocks;
  existing no-user-turn fallback test rewritten to assert
  reference-inequality.
- index-complete-provider-config.test.ts: new test for explicit
  api.openai.com/v1 preservation; existing ollama test updated and
  paired with a new cloud-ollama explicit-baseUrl test.

Refs #554, #496, #499, #505, #509, #519, #480, #436, #508, #541, #251.

* fix: address pr 572 review findings

* docs(plugin): correct ollama baseUrl-fallback comment

The previous comment claimed "forces an explicit baseUrl" but
`resolveProviderModelBaseUrl` still passes `""` through to the
dispatcher when no other source yields a value.  Reword to reflect
actual behavior: returning undefined drops the inferred default and
the empty-baseUrl error is the deliberate clearer-failure path,
not a configuration enforcement.

Addresses #572 (review).

---------

Co-authored-by: Eva <eva@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>