Fix FP16 overflow for vision tensors (Fixes #1678) #1682

Merged
jundot merged 1 commit into jundot:main from dodams258:feature/fix-oq-vision-fp16-overflow on Jun 5, 2026

Conversation

dodams258 commented Jun 5, 2026

Contributor

Fixes #1678.

Summary
oQ: keep vision and audio tensors in float32 instead of casting them to float16.

Test
pytest tests/test_oq.py -- 206 passed
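
For readers unfamiliar with the overflow named in the linked issue, here is a minimal sketch of the float16 range limit. It is not taken from the oQ code base: `to_fp16` is a hypothetical helper that uses only the standard library, and mapping out-of-range values to infinity is an assumption chosen to mirror the usual IEEE 754 round-to-nearest cast rather than the behavior of any particular tensor library.

```python
import math
import struct

# Largest finite value representable in IEEE 754 half precision (binary16).
FP16_MAX = 65504.0


def to_fp16(value: float) -> float:
    """Round-trip a Python float through half precision (hypothetical helper)."""
    try:
        # The "e" format code packs a float as IEEE 754 binary16.
        return struct.unpack("<e", struct.pack("<e", value))[0]
    except OverflowError:
        # struct refuses finite values that do not fit in float16.
        # Assumption: model the cast as saturating to +/-inf instead.
        return math.copysign(math.inf, value)


if __name__ == "__main__":
    for sample in (1.0, FP16_MAX, 70000.0, -70000.0):
        print(sample, "->", to_fp16(sample))
```

Any finite value well beyond 65504 has no finite float16 representation, which is why keeping sensitive tensors in float32 sidesteps the problem.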

jundot commented Jun 5, 2026

Owner

Thanks for the patch. The root cause looks right: protected vision/audio tensors should not be downcast to FP16 in the float16 oQ path.

One gap: this patch only changes the fallback taken when _should_quantize_tensor() returns False. The main 2D vision/audio weight tensors still take the path where _should_quantize_tensor() returns True; _get_predicate_bits() then returns None, so they hit the existing "bits is None" fallback and are still cast to target_dtype.

I will handle the remaining part in a follow-up commit: apply the same protected pass-through dtype policy to both fallback paths and add a regression test for a 2D vision/audio tensor with dtype="float16".
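
The two fallback paths described above can be sketched roughly as follows. This is an illustrative stand-in, not the actual oQ implementation: `is_protected`, `should_quantize`, `predicate_bits`, `passthrough_dtype`, and `plan_tensor` are hypothetical names, the name markers and the 4-bit default are assumptions, and dtypes are plain strings so the example runs without any tensor library.

```python
# Hypothetical sketch; these names are stand-ins, not oQ's real functions.
PROTECTED_MARKERS = ("vision", "audio")  # assumed substrings marking protected tensors


def is_protected(name: str) -> bool:
    return any(marker in name for marker in PROTECTED_MARKERS)


def should_quantize(name: str, ndim: int) -> bool:
    # Assumed stand-in for _should_quantize_tensor(): only 2D weights qualify.
    return ndim == 2 and name.endswith(".weight")


def predicate_bits(name: str):
    # Assumed stand-in for _get_predicate_bits(): None means "do not quantize".
    return None if is_protected(name) else 4


def passthrough_dtype(name: str, target_dtype: str) -> str:
    # One policy shared by both fallbacks: protected tensors stay in float32.
    return "float32" if is_protected(name) else target_dtype


def plan_tensor(name: str, ndim: int, target_dtype: str = "float16"):
    if not should_quantize(name, ndim):
        # Fallback 1: tensor is never a quantization candidate.
        return ("cast", passthrough_dtype(name, target_dtype))
    bits = predicate_bits(name)
    if bits is None:
        # Fallback 2: candidate tensor that the predicate leaves unquantized.
        return ("cast", passthrough_dtype(name, target_dtype))
    return ("quantize", bits)


if __name__ == "__main__":
    # Regression-style check for a 2D vision weight with a float16 target.
    print(plan_tensor("vision_tower.blocks.0.attn.qkv.weight", 2))
```

Routing both fallbacks through one helper keeps the dtype decision in a single place, so a 2D protected weight and a 1D protected bias cannot drift apart in behavior.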

jundot merged commit 1ae8919 into jundot:main on Jun 5, 2026
dodams258 deleted the feature/fix-oq-vision-fp16-overflow branch on June 5, 2026 at 16:58

Development

Successfully merging this pull request may close these issues.

[bug] Infinite <pad> generation with image inputs (FP16 Overflow in Vision Tensors when using float16 via oQ)

2 participants