Feat/web chat multimodal by primoco · Pull Request #177 · eullm/eullm

primoco · 2026-06-06T20:27:01Z

No description provided.

End of "multimodal CLI-only": the HTTP /api/chat endpoint now accepts the Ollama convention `{"role":"user", "content":"...", "images":[<base64>...]}` and routes the request through `engine.generate_multimodal()`, the same mtmd path that --image already used on the CLI.

Load side (api/mod.rs swap_model + main.rs cmd_run):

* resolve `store.mmproj_path(model)` and pass it to InferenceConfig instead of the hard-coded `None`,
* when an mmproj is present, force batch_size=0 (sequential engine). The continuous-batching scheduler is text-only (it does not route mtmd chunks), so multimodal models MUST load through InferenceEngine. Vision is interactive single-user anyway, so losing batching here is not a practical regression. Documented at both sites.

Dispatch side (api/routes.rs `chat` handler, feature-gated on `multimodal`):

* `extract_multimodal_payload` pulls images from the last user message, accepting both raw base64 and `data:...;base64,...` prefixes,
* `gemma_multimodal_prompt` wraps the user text in the Gemma chat template with `mtmd_default_marker()` placed inside the user turn (matches `run_multimodal_oneshot`),
* `multimodal_to_channel` mirrors `sequential_to_channel` but calls `engine.generate_multimodal()` instead of `generate_streaming`,
* the new branch runs BEFORE the text-path prompt builder; on a text-only engine it returns a 503 with an explicit message instead of silently dropping the images.

Scope (deliberate, MVP):

* `/api/chat` only: `/api/generate` and `/v1/chat/completions` (OpenAI `image_url`) stay text-only for now,
* Gemma chat template only: switch on `template.family()` when more vision-capable families land,
* the multimodal turn ignores prior chat history (one-shot probe).
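The load-side and dispatch-side rules above can be sketched as follows. This is an illustrative TypeScript mirror of the behaviour the commit message describes, not the Rust code in `api/routes.rs`: every type and function name below is hypothetical, the Gemma turn delimiters and the `<__media__>` marker string are assumptions (the real handler asks `mtmd_default_marker()` for the marker), and emitting one marker per image is also an assumption.

```typescript
// Hypothetical mirror of the described behaviour; not the project's actual code.
interface ChatMessage {
  role: string;
  content: string;
  images?: string[]; // Ollama convention: base64 strings
}

interface MultimodalPayload {
  text: string;
  images: string[]; // raw base64, any data-URL prefix removed
}

// Load side: a model with an mmproj sibling is forced onto the sequential engine.
function effectiveBatchSize(requested: number, mmprojPath: string | null): number {
  return mmprojPath !== null ? 0 : requested;
}

// Accept both raw base64 and `data:...;base64,...`; return only the base64 part.
function stripDataUrlPrefix(image: string): string {
  const marker = ";base64,";
  if (image.startsWith("data:")) {
    const idx = image.indexOf(marker);
    if (idx !== -1) {
      return image.slice(idx + marker.length);
    }
  }
  return image;
}

// Look only at the last user message; null means "no images, take the text path".
function extractMultimodalPayload(messages: ChatMessage[]): MultimodalPayload | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg !== undefined && msg.role === "user") {
      const images = (msg.images ?? []).map(stripDataUrlPrefix);
      return images.length > 0 ? { text: msg.content, images } : null;
    }
  }
  return null;
}

// Assumed marker string; the real code obtains it from mtmd.
const ASSUMED_MEDIA_MARKER = "<__media__>";

// Wrap the user text in a Gemma-style user turn with the marker(s) inside it.
function gemmaMultimodalPrompt(text: string, imageCount: number): string {
  const markers = new Array<string>(imageCount).fill(ASSUMED_MEDIA_MARKER).join("\n");
  return `<start_of_turn>user\n${markers}\n${text}<end_of_turn>\n<start_of_turn>model\n`;
}

type Dispatch =
  | { kind: "text" }
  | { kind: "multimodal"; prompt: string; images: string[] }
  | { kind: "error"; status: number; message: string };

// The multimodal branch is checked before the text path; a text-only engine gets a 503.
function dispatchChat(messages: ChatMessage[], engineIsMultimodal: boolean): Dispatch {
  const payload = extractMultimodalPayload(messages);
  if (payload === null) {
    return { kind: "text" };
  }
  if (!engineIsMultimodal) {
    return { kind: "error", status: 503, message: "images sent but no multimodal model is loaded" };
  }
  return {
    kind: "multimodal",
    prompt: gemmaMultimodalPrompt(payload.text, payload.images.length),
    images: payload.images,
  };
}
```

Note that `dispatchChat` never consults earlier messages once a payload is found, which matches the one-shot scope stated above.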

Round-trips the multimodal pipeline from the embedded chat:

* 📎 button beside the textarea opens a file picker (`accept="image/*"`),
* the selected image is read as a data URL, the base64 payload is stripped off the `data:image/...;base64,` prefix for the wire format and a thumbnail preview bar appears above the textarea until the user sends or removes it with ×,
* image-only turns are allowed (the model falls back to its default "describe this image" behaviour); fully empty submits are not,
* the user bubble renders the same thumbnail above the prompt so the conversation transcript stays self-contained.

Dispatch logic: `send()` picks the endpoint based on `pendingImage`:

* with image → POST /api/chat (Ollama NDJSON), body shape `{ messages: [{ role:"user", content, images:[<base64>] }], stream:true }`. History is intentionally NOT replayed: the multimodal MVP is a one-shot probe (matching the backend), and re-sending old base64 bytes every turn would balloon the prompt context for nothing.
* without image → POST /v1/chat/completions (SSE), unchanged.
* the streaming loop handles both formats from the same `while/read`: NDJSON lines parse straight as JSON with `message.content`; SSE lines keep their `data:` prefix detection and `choices[0].delta.content`.

Attachment is cleared on send and on "clear conversation" so a stale image can never leak into a later turn.
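A minimal sketch of the dispatch and stream-parsing logic described above, assuming the body shapes quoted in the commit message. The names `buildRequest`, `parseStreamLine`, `ChatTurn` and `OutgoingRequest` are hypothetical and do not come from the embedded chat source; the `[DONE]` sentinel check is an assumption based on the usual OpenAI-style SSE stream, and any extra request fields the real client sends (for example a model name) are left out.

```typescript
// Hypothetical names throughout; only the endpoints and body shapes come from the commit message.
interface ChatTurn {
  role: string;
  content: string;
  images?: string[];
}

interface OutgoingRequest {
  url: string;
  body: { messages: ChatTurn[]; stream: boolean };
}

// Pick the endpoint from the pending attachment (a data URL, or null when nothing is attached).
function buildRequest(text: string, pendingImage: string | null, history: ChatTurn[]): OutgoingRequest {
  if (pendingImage !== null) {
    // Drop everything up to and including the first comma: `data:image/...;base64,`.
    const base64 = pendingImage.slice(pendingImage.indexOf(",") + 1);
    return {
      url: "/api/chat",
      // One-shot probe: history is deliberately not replayed on the multimodal path.
      body: { messages: [{ role: "user", content: text, images: [base64] }], stream: true },
    };
  }
  return {
    url: "/v1/chat/completions",
    body: { messages: [...history, { role: "user", content: text }], stream: true },
  };
}

// Return the text delta carried by one streamed line, or null if the line carries none.
function parseStreamLine(line: string): string | null {
  const trimmed = line.trim();
  if (trimmed === "") {
    return null;
  }
  try {
    if (trimmed.startsWith("data:")) {
      // SSE framing from /v1/chat/completions.
      const payload = trimmed.slice("data:".length).trim();
      if (payload === "[DONE]") {
        return null; // assumed end-of-stream sentinel
      }
      const chunk = JSON.parse(payload);
      return chunk?.choices?.[0]?.delta?.content ?? null;
    }
    // NDJSON framing from /api/chat: each line is a complete JSON object.
    const chunk = JSON.parse(trimmed);
    return chunk?.message?.content ?? null;
  } catch {
    return null; // malformed or partial line: ignore it
  }
}
```

Because the two framings are distinguishable per line, one read loop can feed every decoded line to `parseStreamLine` and append any non-null result to the assistant bubble, regardless of which endpoint was called.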

Releasing the web-chat multimodal MVP:

* /api/chat routes the Ollama `images:[<base64>]` field through engine.generate_multimodal() when an mmproj is loaded,
* loading a multimodal model now resolves the mmproj sibling from the store and forces sequential mode (the batching scheduler is text-only),
* the embedded chat ships an attach-image button + thumbnail preview and dispatches multimodal turns over /api/chat (NDJSON) while text-only turns keep using /v1/chat/completions (SSE).

primoco added 3 commits June 6, 2026 20:13

primoco merged commit ba62695 into main Jun 6, 2026

primoco merged 3 commits into main from feat/web-chat-multimodal
