Problem
The Gemini CLI runtime (src/middleware/runtimes/gemini.ts) currently handles only text prompts. Gemini has the best multimodal support of all runtimes: it can natively handle images, audio, video, and PDF via @path inline syntax.
Current state
- Uses --output-format stream-json --prompt for invocation
- buildEnv() returns {}
- No media handling
Implementation
Inbound (images, audio, video, PDF)
- Save MediaAttachment content to temp files when params.media is present
- Inject @/path/to/media.jpg references inline in the prompt text
- Gemini CLI resolves @path references and includes the media natively
- Declare capability: acceptsInbound: ["image/", "audio/", "video/", "application/pdf"]
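The prompt-injection step could look roughly like the sketch below. It assumes the media has already been written to disk and that the generated paths contain no whitespace. The helper name injectMediaRefs and the choice to append the references after the prompt text are hypothetical, not part of the existing runtime.

```typescript
// Hypothetical helper: appends one @path reference per saved media file.
// Assumes each path is absolute and free of whitespace (true for temp files
// whose names we generate ourselves).
export function injectMediaRefs(
  prompt: string,
  mediaPaths: readonly string[],
): string {
  // Text-only requests keep the existing prompt untouched (backwards compat).
  if (mediaPaths.length === 0) return prompt;

  const refs = mediaPaths.map((p) => `@${p}`).join(" ");
  return `${prompt}\n\n${refs}`;
}
```

Keeping this as a pure string function means the unit tests can assert on the final prompt without touching the filesystem.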
Outbound
- No native media emission capability
- emitsOutbound: false
Capability declaration
readonly mediaCapabilities = {
acceptsInbound: ["image/", "audio/", "video/", "application/pdf"],
emitsOutbound: false,
};
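One way a caller might consume this declaration is a prefix match against an attachment's MIME type. The sketch below assumes that each acceptsInbound entry is treated as a prefix (so "image/" covers "image/png"). The MediaCapabilities interface and the acceptsMime helper are hypothetical names used for illustration only.

```typescript
// Hypothetical shape mirroring the declaration above.
interface MediaCapabilities {
  readonly acceptsInbound: readonly string[];
  readonly emitsOutbound: boolean;
}

const geminiMediaCapabilities: MediaCapabilities = {
  acceptsInbound: ["image/", "audio/", "video/", "application/pdf"],
  emitsOutbound: false,
};

// Assumption: entries are MIME prefixes, so "image/" matches "image/png".
export function acceptsMime(
  caps: MediaCapabilities,
  mimeType: string,
): boolean {
  const normalized = mimeType.trim().toLowerCase();
  return caps.acceptsInbound.some((prefix) => normalized.startsWith(prefix));
}
```

For example, acceptsMime(geminiMediaCapabilities, "image/png") is true, while "text/plain" is rejected.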
Temp file management
- Create temp directory before execution
- Write media attachments as temp files with appropriate extensions
- Clean up temp directory after done event (or on error)
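The lifecycle above can be sketched with a try/finally wrapper so that cleanup runs on both success and error. This is a minimal sketch under assumptions: the MediaAttachment shape (mimeType plus raw bytes), the extension table, and the withTempMedia name are hypothetical, and the real type in the codebase may differ.

```typescript
import { mkdtemp, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical attachment shape; the real MediaAttachment type may differ.
interface MediaAttachment {
  mimeType: string;
  data: Uint8Array;
}

// Illustrative subset of MIME type to extension mappings.
const EXTENSIONS: Record<string, string> = {
  "image/jpeg": ".jpg",
  "image/png": ".png",
  "audio/mpeg": ".mp3",
  "audio/wav": ".wav",
  "video/mp4": ".mp4",
  "application/pdf": ".pdf",
};

// Writes attachments into a fresh temp directory, runs the callback with the
// resulting file paths, and removes the directory afterwards in all cases.
export async function withTempMedia<T>(
  media: readonly MediaAttachment[],
  run: (paths: string[]) => Promise<T>,
): Promise<T> {
  const dir = await mkdtemp(join(tmpdir(), "gemini-media-"));
  try {
    const paths: string[] = [];
    for (let i = 0; i < media.length; i++) {
      const ext = EXTENSIONS[media[i].mimeType] ?? "";
      const file = join(dir, `media-${i}${ext}`);
      await writeFile(file, media[i].data);
      paths.push(file);
    }
    return await run(paths);
  } finally {
    // Runs after normal completion and after a thrown error alike.
    await rm(dir, { recursive: true, force: true });
  }
}
```

In the runtime, the callback would wrap the CLI invocation and resolve once the done event arrives, so the directory outlives the process that reads it.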
Tests
Unit tests (src/middleware/runtimes/gemini.test.ts)
- execute() with image attachment creates temp file and injects @path in prompt
- execute() with audio attachment creates temp file and injects @path
- execute() with multiple attachments creates multiple temp files
- execute() without media uses existing prompt path (backwards compat)
- mediaCapabilities reports correct values
Live smoke tests (src/middleware/__smoke__/gemini.live.test.ts)
Depends on
Related