fix(honcho): flatten multimodal list content in sync_turn (#30252)#30294

Open
Bartok9 wants to merge 1 commit into
NousResearch:mainfrom
Bartok9:fix/30252-honcho-sync-turn-multimodal-content

Conversation

@Bartok9 Bartok9 commented May 22, 2026

Contributor

Problem

HonchoMemoryProvider.sync_turn errors when called with OpenAI-style multimodal list content (the shape used for vision turns):

user_content = [
    {"type": "text", "text": "what colour is this?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
]

The call sanitize_context(user_content or "") receives a list, raising:

TypeError: expected string or bytes-like object, got 'list'
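
The failure can be reproduced outside Honcho. The sketch below assumes sanitize_context applies a regular-expression substitution to its argument; the PR does not show its body, so `sanitize_context_stand_in` and its pattern are hypothetical.

```python
import re


def sanitize_context_stand_in(text):
    # Hypothetical stand-in: the real sanitize_context is not shown in this PR.
    # Any re-based call rejects a list argument in the same way.
    return re.sub(r"\s+", " ", text).strip()


user_content = [
    {"type": "text", "text": "what colour is this?"},
    {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
]

raised = False
error_message = ""
try:
    # A non-empty list is truthy, so `or ""` does not replace it with a string.
    sanitize_context_stand_in(user_content or "")
except TypeError as exc:
    raised = True
    error_message = str(exc)
```

The `or ""` guard only covers None and empty values, which is why list content reaches the regex call unchanged.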

The _sync() wrapper swallows the exception, so the turn is dropped from Honcho's memory without any visible error, and the user representation never learns about anything visual the assistant was shown.
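
To illustrate why the dropped turn goes unnoticed, here is a minimal sketch of a catch-all wrapper. It is not the plugin's _sync() code: `record_turn`, `sync_silently`, and `recorded_turns` are hypothetical names, and the debug-level logging is an assumption.

```python
import logging

logger = logging.getLogger("honcho_sketch")
recorded_turns = []  # hypothetical stand-in for messages stored in Honcho


def record_turn(user_content):
    # Hypothetical: rejects non-string input, as the sanitiser does.
    if not isinstance(user_content, str):
        raise TypeError("expected string or bytes-like object")
    recorded_turns.append(user_content)


def sync_silently(user_content):
    # Hypothetical wrapper: every exception is logged at debug level and discarded,
    # so the caller cannot tell that the turn was never stored.
    try:
        record_turn(user_content)
    except Exception as exc:
        logger.debug("sync failed: %s", exc)


sync_silently("plain text turn")
sync_silently([{"type": "text", "text": "what colour is this?"}])
```

After both calls, only the plain-string turn is stored; the list-shaped turn is lost and nothing is raised to the caller.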

Closes #30252.

Fix

Add a _flatten_content() static method that normalises list-shaped content to a plain string before sanitisation:

  • Plain strings pass through unchanged (zero regression risk)
  • text parts are joined with a space
  • image_url / image parts become the literal placeholder [image]
  • Unknown block types fall back to [<type>]

sync_turn now accepts str | list | None for both arguments.
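
The rules above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code in this PR: the name `flatten_content`, returning an empty string for None, passing bare strings inside a list through, and the `unknown` fallback for blocks without a type are all assumptions.

```python
from typing import Union


def flatten_content(content: Union[str, list, None]) -> str:
    """Normalise str | list | None message content to a plain string."""
    if content is None:
        return ""  # assumption: None becomes an empty string
    if isinstance(content, str):
        return content  # plain strings pass through unchanged

    parts = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)  # assumption: bare strings in a list are kept
            continue
        if not isinstance(block, dict):
            continue  # assumption: other shapes are skipped
        block_type = block.get("type", "unknown")
        if block_type == "text":
            parts.append(str(block.get("text", "")))
        elif block_type in ("image_url", "image"):
            parts.append("[image]")
        else:
            parts.append(f"[{block_type}]")  # generic placeholder
    # Join non-empty parts with a single space.
    return " ".join(part for part in parts if part)
```

Replacing images with a short placeholder keeps the record that an image was shown without storing base64 payloads in memory.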

Tests

5 new tests in TestSyncTurnMultimodalContent:

  • multimodal user content is recorded (text + image placeholder)
  • image-only list uses [image] placeholder
  • plain string passes through unchanged
  • None content does not raise
  • _flatten_content static method handles all known block types

All 120 existing honcho plugin tests pass.

@alt-glitch added labels May 22, 2026: type/bug (Something isn't working), comp/plugins (Plugin system and bundled plugins), tool/memory (Memory tool and memory providers), P3 (Low: cosmetic, nice to have)
@alt-glitch

Collaborator

This competes with #30264, #22768, and #26484, all of which add multimodal list-content flattening to Honcho sync_turn. This PR adds a _flatten_content() static method; #30264 and #22768 add equivalent _content_to_text() normalizers, and #26484 also includes a conclude fallback. Recommend consolidating into a single PR.

Bartok9 commented May 27, 2026

Contributor Author

Closing to stay within contributor PR limit. Will resubmit with fresh rebase if the issue remains open in main.

@Bartok9 Bartok9 closed this May 27, 2026
@Bartok9 Bartok9 reopened this May 27, 2026
…ch#30252)

OpenAI-style multimodal user messages carry content as a list:
  [{"type": "text", "text": "..."}, {"type": "image_url", ...}]

Previously, sync_turn called sanitize_context(user_content or "")
directly on this list value, causing:
  TypeError: expected string or bytes-like object, got 'list'

The exception was silently swallowed by the _sync() wrapper and the
vision turn was dropped from Honcho's memory entirely.

Fix: add a _flatten_content() static method that:
  - passes plain strings through unchanged
  - joins text parts with a space
  - replaces image_url/image blocks with the literal placeholder [image]
  - uses a generic [<type>] placeholder for any unknown block types

sync_turn now accepts Union[str, list, None] for both arguments and
routes them through _flatten_content() before sanitize_context().

Fixes NousResearch#30252
@Bartok9 Bartok9 force-pushed the fix/30252-honcho-sync-turn-multimodal-content branch from d34f7d3 to b281340 Compare May 27, 2026 20:40

Development

Successfully merging this pull request may close these issues.

honcho memory: sync_turn fails on multimodal (list) content — vision turns not recorded

2 participants