Feat/web chat multimodal by primoco · Pull Request #182 · eullm/eullm

primoco · 2026-06-09T06:38:28Z

No description provided.

…ention) The 1024-token crash exposed the real cap. Gemma's vision encoder uses non-causal attention, which requires the whole image to fit in one micro-batch: GGML_ASSERT(causal_attn || n_ubatch >= n_tokens). Our n_ubatch defaulted to 512, so:

* images larger than 512 tokens hard-abort (the SIGABRT seen with EULLM_IMAGE_MAX_TOKENS=1024),
* effective image resolution was silently capped at 512 tokens regardless of image_max_tokens. That is why 280 vs 512 gave identical output: the model never actually received more than ~512 image tokens.

Fix: in the multimodal context only, set both n_batch and n_ubatch to max(config.n_batch, EULLM_IMAGE_MAX_TOKENS, 512). Raising the image budget now genuinely increases resolution instead of crashing, and we can finally test whether more resolution fixes the hard (dark / portrait) images. Text and scheduler contexts are untouched (no VRAM regression there).
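The sizing rule in the fix can be sketched in a few lines. This is only an illustration of the arithmetic described in the commit message, not the code from this PR: the function names (multimodal_batch_size, fits_in_one_ubatch) are hypothetical, and the image budget is passed in as a plain integer rather than read from EULLM_IMAGE_MAX_TOKENS.

```python
# Illustrative sketch only; helper names are hypothetical, not taken from the PR.

DEFAULT_UBATCH = 512  # the n_ubatch default mentioned in the commit message


def multimodal_batch_size(config_n_batch: int, image_max_tokens: int) -> int:
    """Size used for both n_batch and n_ubatch in the multimodal context.

    Mirrors max(config.n_batch, EULLM_IMAGE_MAX_TOKENS, 512) from the fix.
    """
    return max(config_n_batch, image_max_tokens, DEFAULT_UBATCH)


def fits_in_one_ubatch(n_ubatch: int, n_tokens: int, causal_attn: bool = False) -> bool:
    """Same condition as GGML_ASSERT(causal_attn || n_ubatch >= n_tokens)."""
    return causal_attn or n_ubatch >= n_tokens


if __name__ == "__main__":
    # Before the fix: n_ubatch stays at 512, so a 1024-token image fails the check.
    print(fits_in_one_ubatch(DEFAULT_UBATCH, 1024))

    # After the fix: n_ubatch follows the image budget, so the check passes.
    n_ubatch = multimodal_batch_size(config_n_batch=512, image_max_tokens=1024)
    print(n_ubatch, fits_in_one_ubatch(n_ubatch, 1024))
```

Keeping 512 as a floor means a small image budget (for example 280) never shrinks the batch below the previous default, while a larger configured n_batch is still respected.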

Changes over beta.2: fix the non-causal vision attention cap. n_ubatch is now sized to the image token budget, so EULLM_IMAGE_MAX_TOKENS=1024/2048 actually raises resolution instead of crashing (SIGABRT) or being silently capped at 512 tokens. This is the build to test whether more resolution fixes the hard (dark / portrait) images.

primoco added 2 commits June 9, 2026 06:24

primoco merged commit 1c15568 into main Jun 9, 2026

primoco merged 2 commits into main from feat/web-chat-multimodal
