fix: force gemma4_unified detection as VLM by FaisalFehad · Pull Request #1744 · jundot/omlx

FaisalFehad · 2026-06-09T00:54:45Z

Summary

Forces gemma4_unified detection as VLM in model discovery.

Problem

gemma4_unified models were detected as LLM when their config.json lacked vision_config (e.g., some quantized variants). Since unified models always include vision+audio capabilities by definition, they should never fall back to LLM.

Changes

omlx/model_discovery.py — Force gemma4_unified → VLM detection regardless of vision_config presence.

tests/test_model_discovery.py — Updated test_detect_text_only_gemma4_unified_as_llm → test_detect_gemma4_unified_without_vision_config_as_vlm since gemma4_unified is always VLM.

Notes

The feature extractor mapping in vlm.py is not changed — Gemma4UnifiedAudioFeatureExtractor correctly resolves to mlx_vlm.models.gemma4_unified.processing_gemma4_unified.Gemma4UnifiedAudioFeatureExtractor in the pinned mlx-vlm (commit 54c9a11). That class differs from Gemma4AudioFeatureExtractor (which produces mel spectrograms) — the unified extractor outputs raw waveform chunks with audio_samples_per_token=640.
gemma4_unified is an architecture type (not a size label). Currently only the 12B uses it; future models with the same unified audio+vision architecture would also use this type.
All 125 model discovery tests pass.

jundot · 2026-06-09T05:04:26Z

Thanks for the PR. The model discovery change makes sense to me, but I do not think the feature extractor mapping should change.

Gemma4UnifiedAudioFeatureExtractor does exist in the Gemma 4 unified processor path, and current mlx-vlm has it here:
https://github.com/Blaizzy/mlx-vlm/blob/c9a6743e5f059be36362d8e4a78171bbf91f18bc/mlx_vlm/models/gemma4_unified/processing_gemma4_unified.py#L251

It is not equivalent to Gemma4AudioFeatureExtractor; the unified extractor uses different defaults and output shape for unified audio inputs. Remapping the unified type to the Gemma4 extractor can break gemma4_unified audio requests.

I can merge the narrower discovery fix once that mapping change is removed or covered by a runtime-compatible test.

FaisalFehad · 2026-06-09T06:44:29Z

Thank you @jundot I have updated the PR.

Gemma4 unified models always include vision+audio capabilities, so they should never fall back to LLM detection even when vision_config is absent from config.json (e.g. some quantized variants). - Add special case in detect_model_type for gemma4_unified → VLM - Update test to reflect that gemma4_unified is always VLM

jundot · 2026-06-10T02:17:13Z

Thanks for updating this. The narrowed discovery-only change looks right: gemma4_unified should stay on the mlx-vlm path even when quantized configs omit vision_config, and the unified audio feature extractor mapping is preserved.

CI is green, so this looks good to me, and I am going to merge it.

FaisalFehad changed the title ~~fix: gemma4_unified VLM detection and feature extractor mapping~~ fix: add support for Gemma 4 12B IT (gemma4_unified) Jun 9, 2026

FaisalFehad force-pushed the fix/gemma4-unified-vlm-support branch from 001769d to 5c9d897 Compare June 9, 2026 06:24

FaisalFehad changed the title ~~fix: add support for Gemma 4 12B IT (gemma4_unified)~~ fix: force gemma4_unified detection as VLM Jun 9, 2026

FaisalFehad force-pushed the fix/gemma4-unified-vlm-support branch from 5c9d897 to 389ffff Compare June 9, 2026 06:44

FaisalFehad force-pushed the fix/gemma4-unified-vlm-support branch from 389ffff to 396c752 Compare June 9, 2026 07:00

jundot merged commit 4b71249 into jundot:main Jun 10, 2026
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: force gemma4_unified detection as VLM#1744

fix: force gemma4_unified detection as VLM#1744
jundot merged 1 commit into
jundot:mainfrom
FaisalFehad:fix/gemma4-unified-vlm-support

FaisalFehad commented Jun 9, 2026 •

edited

Loading

Uh oh!

jundot commented Jun 9, 2026

Uh oh!

FaisalFehad commented Jun 9, 2026 •

edited

Loading

Uh oh!

jundot commented Jun 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

FaisalFehad commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Changes

Notes

Uh oh!

jundot commented Jun 9, 2026

Uh oh!

FaisalFehad commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jundot commented Jun 10, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FaisalFehad commented Jun 9, 2026 •

edited

Loading

FaisalFehad commented Jun 9, 2026 •

edited

Loading