feat(runtime/claude): multimodal I/O support — accept inbound images (#396) #522

Merged
alexey-pelykh merged 1 commit into main from feat/396-claude-multimodal
Mar 9, 2026

@alexey-pelykh

Summary

  • Add inbound image support to Claude CLI runtime via --input-format stream-json stdin mode
  • Add buildStdinPayload() hook to CLIRuntimeBase for custom stdin delivery
  • prepareMedia() reads image files from disk and populates base64 fields
  • Non-image media (audio/video) is filtered out; only image/* is accepted
  • mediaCapabilities declares acceptsInbound: ["image/"], emitsOutbound: false
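
For readers without the diff at hand, the envelope construction and image-only filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `MediaItem` shape, the `isInlineImage` helper, the standalone function signature, and the ordering of text before images are all assumptions, and the inner `message` layout is assumed to follow the Anthropic Messages format for base64 image blocks.

```typescript
// Hypothetical media shape; the real type in the codebase may differ.
interface MediaItem {
  mimeType: string; // e.g. "image/png"
  base64?: string; // assumed to be filled in by a prepareMedia()-style step
}

// Content blocks, assumed to mirror the Anthropic Messages format.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; source: { type: "base64"; media_type: string; data: string } };

// Hypothetical helper: keep only image/* items that already carry base64 data.
function isInlineImage(m: MediaItem): m is MediaItem & { base64: string } {
  return (
    m.mimeType.startsWith("image/") &&
    typeof m.base64 === "string" &&
    m.base64.length > 0
  );
}

// Returns one newline-terminated JSON line for stdin, or undefined when
// there is no usable image (audio/video items are dropped by the filter).
function buildStdinPayload(prompt: string, media: MediaItem[] = []): string | undefined {
  const images = media.filter(isInlineImage);
  if (images.length === 0) return undefined;

  const content: ContentBlock[] = [
    { type: "text", text: prompt },
    ...images.map((m): ContentBlock => ({
      type: "image",
      source: { type: "base64", media_type: m.mimeType, data: m.base64 },
    })),
  ];

  return JSON.stringify({ type: "user", message: { role: "user", content } }) + "\n";
}
```

Returning `undefined` rather than an empty payload lets a caller fall back to the existing text-only path without a separate flag.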

Test plan

  • Unit tests: buildArgs with image media → -p --input-format stream-json (no --print)
  • Unit tests: buildArgs without media → --print (backwards compat)
  • Unit tests: buildArgs with non-image media → --print (ignores audio/video)
  • Unit tests: buildStdinPayload constructs correct { type: "user", message: {...} } envelope
  • Unit tests: buildStdinPayload returns undefined for non-image/missing media
  • Unit tests: mediaCapabilities reports correct values
  • Live smoke test: send 100x100 red PNG → Claude responds "Red"
  • Live smoke test: text-only prompt still works (existing test)
  • Live smoke test: session resumption still works (existing test)
  • All 47 unit tests pass, all 3 live smoke tests pass
  • Typecheck, lint, format all clean
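
The three buildArgs cases in the test plan amount to a single branch on whether any image media is present. A rough sketch of that decision, under assumptions: the `MediaRef` type and the free-function signature are hypothetical, passing the prompt as a trailing argument in the text-only path is a guess, and the real argument list likely carries additional flags not shown here.

```typescript
// Hypothetical minimal media reference; only the MIME type matters here.
interface MediaRef {
  mimeType: string;
}

// Sketch of the mode switch: stream-json stdin mode when at least one
// image/* item is present, plain --print otherwise.
function buildArgs(prompt: string, media: MediaRef[] = []): string[] {
  const hasImage = media.some((m) => m.mimeType.startsWith("image/"));
  if (hasImage) {
    // The prompt travels inside the stdin envelope, so it is not an argument.
    return ["-p", "--input-format", "stream-json"];
  }
  // Assumption: text-only prompts are passed on the command line.
  return ["--print", prompt];
}
```

With this shape, audio-only or video-only media takes the same path as no media at all, which matches the backwards-compatibility cases listed above.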

Closes #396

🤖 Generated with Claude Code

…396)

Switch from --print to --input-format stream-json stdin mode when image
media is present, constructing user message envelopes with base64 image
content blocks. Non-image media (audio/video) is silently filtered out
as Claude CLI only supports images natively.

Changes:
- Add buildStdinPayload() hook to CLIRuntimeBase for custom stdin delivery
- Override buildStdinPayload() in ClaudeCliRuntime for stream-json envelopes
- Add prepareMedia() to read image files and populate base64 fields
- Switch buildArgs() to use -p --input-format stream-json for image media
- Add unit tests for buildArgs, buildStdinPayload, mediaCapabilities
- Add live smoke test with 100x100 red PNG verifying Claude describes it

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

alexey-pelykh enabled auto-merge (squash) March 9, 2026 09:34
alexey-pelykh merged commit 8eea450 into main Mar 9, 2026
7 checks passed
alexey-pelykh deleted the feat/396-claude-multimodal branch March 9, 2026 09:42