Skip to content

fix: externalize native user image blocks#521

Merged
jalehman merged 1 commit into
mainfrom
openclaw-a56/native-image-block-externalization-current
Apr 28, 2026
Merged

fix: externalize native user image blocks#521
jalehman merged 1 commit into
mainfrom
openclaw-a56/native-image-block-externalization-current

Conversation

@jalehman

Copy link
Copy Markdown
Contributor

What

Externalizes top-level native user image blocks before the generic raw-payload fallback, preserving adjacent text as normal conversation content.

Why

Vision turns from OpenClaw can arrive as text plus { type: "image", data, mimeType }. Lossless was storing that whole content array as raw-user-payload.json, which avoided base64 leakage but hid the caption/text and image behind a generic raw payload.

Changes

  • Detect native user image blocks
  • Store images as LCM files
  • Preserve text blocks inline
  • Keep raw-payload fallback
  • Add regression coverage
  • Add patch changeset

Testing

  • npx vitest run test/engine.test.ts -t "native user image"
  • npx vitest run test/engine.test.ts -t "LcmContextEngine.ingest content extraction"
  • npm run build

@jalehman jalehman merged commit 791c591 into main Apr 28, 2026
2 checks passed
jalehman pushed a commit that referenced this pull request May 3, 2026
…gelog backfill (B/8) (#574)

* chore(packaging): manifest+peerdep drift guards, Windows install, #510 changelog backfill

Five separate sub-fixes that all touch packaging/process surface and
review together cleanly:

* test/manifest.test.ts: guard against contracts.tools / registerTool
  name drift.  PR #555 hand-listed 4 tool names in
  openclaw.plugin.json; nothing prevented the next added/renamed tool
  from re-introducing the silent-compaction-disabled failure mode.

* package.json peerDependencies: tighten @mariozechner/pi-* from "*"
  to ">=0.66 <1" so the next major silently mismatches at install
  time rather than runtime.

* package.json openclaw.compat.pluginApi: add upper bound
  (>=2026.2.17 <2026.6.0) and a `tested:` array so `openclaw plugins
  doctor` can flag known-incompatible host versions instead of leaving
  users with the silent-load behavior #555 fixed.

* .github/workflows/ci.yml: add a smoke-install job against the latest
  published openclaw, asserting plugin registers and lcm_* tools wire
  up.  Catches host-side breaks like #555 before users hit them.

* openclaw.plugin.json: declare kind: "context-engine" to
  disambiguate the Windows installer's hook-pack detector (#451).

* .changeset: retroactive entry crediting jalehman for #510, which
  was in the v0.9.2…v0.9.3 commit delta but absent from the release
  notes.

Closes #570, closes #571.  Refs #497, #501, #527, #555, #483, #451,
#384, #447, #492, #510, #511, #521.

* ci: scope smoke-latest-openclaw to bundle loadability only

The initial smoke job called plugin.register() against a stub api that
omitted the modelAuth surface (and others) the real register path
reaches into.  The throw was caught, but registeredTools stayed empty
and the assertion `registeredTools.length >= 4` then exit 1'd — a
self-inflicted failure that doesn't map to any real host-side regression.

Pivot the smoke to what we can verify without a richer fixture:

- bundle imports cleanly (catches esbuild externals / ESM resolution
  changes)
- `mod.default.register` is a callable function (catches missing
  default export + register-shape changes)

A "register against the real openclaw plugin loader" check would be
the right way to catch #555-class regressions on host-version bumps,
but that needs a richer fixture than CI can stub up cheaply.  Note in
the comment block.

Refs: #570, #555.

* fix(packaging): manifest test resilience + smoke prune + changeset wording

Address review on #574:

* test/manifest.test.ts: discover `src/tools/lcm-*-tool.ts` files
  dynamically via directory scan (was a hard-coded TOOL_FACTORY_FILES
  list).  Adding a new tool no longer requires updating this test.
  Loosen the registerTool factory matcher to accept arrow-block bodies
  (`(ctx) => { return createXTool(...) }`), arrow-expression bodies,
  and async arrows — so a refactor to a different arrow shape doesn't
  fail the drift guard spuriously.

* .github/workflows/ci.yml: add `npm prune --omit=dev` between build
  and openclaw install, so the smoke import resolves
  `@mariozechner/pi-*` from `openclaw@latest`'s transitive deps
  rather than from the pinned 0.66.1 devDependency we ship for local
  builds.  Catches install-time mismatches that would otherwise hide
  behind the pinned versions.

* .changeset/manifest-peerdep-windows-guards.md: the workflow's smoke
  job no longer claims to call `plugin.register(...)` and assert
  registerTool wiring (that path requires more host-runtime fixture
  than is reasonable to maintain — see #555 for the regression class).
  The changeset now describes the actual scope: bundle import, latest
  `openclaw` co-installed, callable `register` export.  Manifest
  drift is covered by the dedicated unit test.

* .changeset/retroactive-510-bootstrap-ingest-protections.md: explicit
  "@jalehman" credit in the body so the next changesets-rendered
  release note attributes #510 to its real author rather than to the
  PR introducing the retroactive entry.

Test count unchanged at 836.

* test(manifest): drop duplicate kind=context-engine assertion

`test/config.test.ts > "declares context-engine kind so OpenClaw core
binds the contextEngine slot on install"` already covers the
`manifest.kind === "context-engine"` invariant — and that single
assertion also satisfies the Windows-installer hook-pack-detector
concern from #451 since the `kind` field is the same discriminator
in both cases.

Replaced the duplicate manifest.test.ts assertion with a comment
pointing readers to the canonical location to avoid the maintenance
burden of updating two tests if the kind ever changes again.

Addresses review on #574.

---------

Co-authored-by: Eva <eva@100yen.org>
jalehman added a commit that referenced this pull request May 18, 2026
…oles (F/8) (#573)

* fix: generalize native-image-block externalizer to all roles

PR #521 added interceptNativeUserImageBlocks gated on role === "user".
Assistant and tool messages carrying native {type:"image", data:...}
blocks (some MCP tools return these) fell through to the generic
raw-payload externalizer and got stored as raw-{role}-payload.json
blobs with embedded base64.  Generalize the interceptor to all roles
and extend the MIME map with heic/avif/bmp so detection-miss formats
get sensible extensions.

Closes #565.  Refs #492, #521, #511.

* fix(externalization): system-role coverage + tighter raw-payload skip

Address review on #573:

* `interceptNativeImageBlocks`: also accept `role === "system"` so
  bootstrap or tool-injected system messages with native
  `{type:"image"}` blocks externalize through the image-files path
  instead of falling through to a raw-system-payload.json blob.  Add
  matching "System image" label and `system-image.<ext>` filename
  prefix.
* `interceptLargeRawPayload`: replace the loose
  `isExternalizedReferenceContent(stored.content)` skip (which
  matched any content containing the substring `LCM file: file_` —
  blocking raw-payload externalization for mixed messages with text
  + image reference + more text) with two narrower checks: an
  explicit `rawPayloadExternalized: true` flag, plus a stricter
  `isWhollyExternalizedReferenceContent` predicate that only matches
  content produced by a wholesale-replacement externalizer (large
  file / tool output / raw payload, each starting with the canonical
  reference header) or a single image-only reference.  Mixed content
  with substantial body now correctly externalizes.
* Changeset: rewrote to describe the actual pre/post behavior for
  every role, and the raw-payload-skip narrowing.
* Tests: renamed the tool-result test (the "before raw payload
  fallback" wording was misleading — tool/toolResult skips raw-payload
  externalization at the `stored.role === "tool"` early return).
  Added two regression tests: native-system-image externalization
  parity, and mixed-content (text + image-ref + long body) raw-payload
  externalization that the substring-skip used to suppress.  Bumped
  `largeFileTokenThreshold` from the artificially-low 20 to a more
  realistic 1000 in the existing image-only externalization tests so
  they exercise only the image path, not the interaction with raw-
  payload externalization (which the new mixed-content test covers
  directly).

838 tests pass (was 836 on the prior PR-5 tip, +2 new regressions).

* fix(externalization): match [Image: ...] references emitted by interceptPureBase64Image

`interceptPureBase64Image` emits `[Image: <fileName> | LCM file: ...]`
for user/system pure-base64 fast paths (label is just "Image", no
second "image" word).  The regexes in `isExternalizedImageReference`
and `isExternalizedReferenceContent`/`isWhollyExternalizedReferenceContent`
required the literal word `image:` after the label, so they only
matched `[<Role> image: ...]` shapes from `interceptNativeImageBlocks`
and silently dropped the `[Image: ...]` case.

Effect of the gap: a user/system message whose content was rewritten
to a pure-base64 `[Image: ...]` reference would no longer be
recognized as wholly externalized, so the next ingest pass would
re-externalize the same reference as raw payload — turning a clean
`[Image: file_…]` into `[LCM Raw Payload: file_…]`.

Lifted the regex to two static class fields (`IMAGE_REFERENCE_REGEX`
anchored, and `IMAGE_REFERENCE_REGEX_GLOBAL` for the substring
helper) and updated all three call sites to use them.  The new
pattern matches both shapes:
  /^\[(?:(?:User|System|Tool|Assistant) image|Image): ... LCM file: file_<id>\]$/

838 tests pass.  Addresses review on #573.

* docs: rephrase externalization comments

---------

Co-authored-by: Eva <eva@100yen.org>
Co-authored-by: Josh Lehman <josh@martian.engineering>
@github-actions github-actions Bot mentioned this pull request May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant