Skip to content

[model-gateway] add phi3 vision image processor#14381

Merged
slin1237 merged 7 commits intomainfrom
smg-mm-7/n
Dec 4, 2025
Merged

[model-gateway] add phi3 vision image processor#14381
slin1237 merged 7 commits intomainfrom
smg-mm-7/n

Conversation

@slin1237
Copy link
Copy Markdown
Collaborator

@slin1237 slin1237 commented Dec 4, 2025

Checklist

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@github-actions github-actions Bot added Multi-modal multi-modal language model model-gateway labels Dec 4, 2025
@slin1237 slin1237 added the run-ci label Dec 4, 2025
@slin1237
Copy link
Copy Markdown
Collaborator Author

slin1237 commented Dec 4, 2025

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for the phi3-vision model, a valuable addition. The changes are well-structured, including a new Rust image processor, updates to core data structures to handle 5D tensors, and a Python script for generating test data. The implementation is comprehensive, with good documentation and tests.

My review focuses on improving performance and robustness. I've identified several areas where ndarray operations can be made more efficient by avoiding unnecessary data copies and allocations. Additionally, I've suggested a change to make accessor methods safer by handling unexpected tensor dimensions more gracefully. These refinements will enhance the code's performance and reliability.

Comment thread sgl-router/src/multimodal/vision/image_processor.rs
Comment thread sgl-router/src/multimodal/vision/image_processor.rs Outdated
Comment thread sgl-router/src/multimodal/vision/processors/phi3_vision.rs Outdated
Comment thread sgl-router/src/multimodal/vision/processors/phi3_vision.rs Outdated
Comment on lines +403 to +406
.map_err(|e| TransformError::InvalidShape {
expected: "valid 5D shape".to_string(),
actual: vec![e.to_string().len()],
})?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The error message here is not very helpful for debugging. e.to_string().len() is used as the actual shape, which is incorrect. It would be more informative to include the shape that caused the failure and the underlying error message from ndarray.

            .map_err(|e| TransformError::InvalidShape {
                expected: format!("valid 5D shape, but failed with error: {}", e),
                actual: shape,
            })?

@slin1237 slin1237 merged commit 654a78f into main Dec 4, 2025
56 checks passed
@slin1237 slin1237 deleted the smg-mm-7/n branch December 4, 2025 06:32
tom-jerr pushed a commit to tom-jerr/sglang that referenced this pull request Dec 4, 2025
yingluosanqian pushed a commit to yingluosanqian/sglang that referenced this pull request Dec 4, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 5, 2025
yuchengz816-bot pushed a commit to yuchengz816-bot/sglang that referenced this pull request Dec 8, 2025
Kevin-XiongC pushed a commit to novitalabs/sglang that referenced this pull request Dec 9, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 12, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 12, 2025
tonyluj pushed a commit to openanolis/sglang that referenced this pull request Dec 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

model-gateway Multi-modal multi-modal language model run-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant