feat: support qwen3(-VL) rerank scoring & chat template #16403
Fridge003 merged 16 commits into sgl-project:main from
Summary of Changes
Hello @alphabetc1, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the reranking capabilities by integrating support for the Qwen3-Reranker model. It introduces a new mechanism for handling decoder-only rerankers, allowing them to leverage logprob scoring. The changes include updates to the API protocol to accept an optional instruction field, a new chat template for Qwen3, and core logic modifications to process these new reranker types, ensuring compatibility and proper scoring.
Code Review
This pull request adds support for the Qwen3-Reranker model, which uses decoder-only log-probability scoring. This involves a new chat template, a new score_prompts helper method, and significant logic changes in the /v1/rerank endpoint handler to support this new scoring mechanism alongside existing cross-encoder models. The changes also include adding an optional instruct field to the rerank API and improving how embedding scores are handled.
My review focuses on the correctness of the new logic path for the Qwen3 reranker and the test coverage for the new functionality. I've identified a potential issue in the model detection logic that could lead to incorrect behavior for other generation models. I've also suggested expanding the test suite to cover the new Qwen3 reranker functionality, which is currently untested.
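The decoder-only log-probability scoring described in this review can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: it assumes the score is the probability of the "yes" token renormalized over the pair {"yes", "no"}, and the function name yes_no_score is hypothetical.

```python
import math


def yes_no_score(logprob_yes: float, logprob_no: float) -> float:
    """Return P(yes) renormalized over the two candidate tokens.

    Assumption: the reranker is scored by comparing the next-token
    log-probabilities of "yes" and "no"; the helper name is hypothetical.
    """
    # A softmax over two log-probabilities equals a sigmoid of their difference.
    diff = logprob_yes - logprob_no
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)
```

Because only the difference of the two log-probabilities matters, the score does not depend on how much probability mass the model assigns to other tokens.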
|
/tag-and-rerun-ci |
force-pushed from 3be6eb6 to 969ce39
|
/rerun-failed-ci |
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
Working on this PR to also support qwen3-vl-reranker |
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
|
Hi @alphabetc1, I have updated this PR to incorporate support for the newly released qwen3-vl-reranker. Could you help to review it once more and also ensure that it does not disrupt your usage? Thanks~ |
- Updated documentation to include detailed descriptions of supported rerank models, specifically highlighting the Qwen3-VL-Reranker and its multimodal capabilities.
- Improved the Jinja template rendering logic for handling multimodal content.
- Refactored token ID retrieval for 'yes' and 'no' responses to be dynamic based on the tokenizer, enhancing compatibility across different model sizes.
- Added unit tests for the Qwen3-VL reranker to ensure correct handling of logprobs and scoring.

Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
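The dynamic 'yes'/'no' token ID lookup mentioned in this commit message might look something like the sketch below. It assumes a Hugging Face style tokenizer exposing convert_tokens_to_ids; the helper name resolve_yes_no_ids and the TokenizerLike protocol are hypothetical and not taken from this PR.

```python
from typing import Optional, Protocol, Tuple


class TokenizerLike(Protocol):
    """Hypothetical minimal interface; Hugging Face tokenizers provide this method."""

    def convert_tokens_to_ids(self, token: str) -> Optional[int]: ...


def resolve_yes_no_ids(tokenizer: TokenizerLike) -> Tuple[int, int]:
    """Look up the 'yes' and 'no' token IDs from the tokenizer at runtime.

    Resolving the IDs per tokenizer avoids hard-coding values that may
    differ between model variants.
    """
    yes_id = tokenizer.convert_tokens_to_ids("yes")
    no_id = tokenizer.convert_tokens_to_ids("no")
    # Missing or identical IDs would make the two answers indistinguishable.
    if yes_id is None or no_id is None or yes_id == no_id:
        raise ValueError("tokenizer does not provide distinct 'yes'/'no' tokens")
    return yes_id, no_id
```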
Signed-off-by: Xinyuan Tong <xinyuantong.cs@gmail.com>
Nice work, thanks for updating! now: |
Ok, this is caused by the qwen3_reranker.jinja update. The current behavior is more reasonable — thanks for the fix! |
|
@JustinTong0323 Just did some refactoring, PTAL |
|
/rerun-failed-ci |
|
/rerun-failed-ci |
|
@mingxu please help check xpu ci failure. |
|
/rerun-failed-ci 1 |
|
/rerun-failed-ci |
Motivation
This patch:
- instruct, top_n, and return_documents fields for the /v1/rerank request

Usage (text rerank):
Request (instruct, top_n, and return_documents are optional):
Response:
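As an illustration of the optional fields, a request body for /v1/rerank could be assembled as in the sketch below. The names instruct, top_n, and return_documents come from this description; the query, documents, and model field names and the helper build_rerank_payload are assumptions made for the example, and the sample values are invented.

```python
from typing import Any, Dict, List, Optional


def build_rerank_payload(
    query: str,
    documents: List[str],
    instruct: Optional[str] = None,
    top_n: Optional[int] = None,
    return_documents: Optional[bool] = None,
    model: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a JSON-serializable body for a /v1/rerank request.

    Assumption: "query", "documents" and "model" are the names of the
    remaining request fields. Optional fields are omitted when unset so
    the server can apply its own defaults.
    """
    payload: Dict[str, Any] = {"query": query, "documents": documents}
    optional = {
        "model": model,
        "instruct": instruct,
        "top_n": top_n,
        "return_documents": return_documents,
    }
    # Keep only the optional fields the caller actually provided.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    body = build_rerank_payload(
        query="What is the capital of France?",
        documents=["Paris is the capital of France.", "Berlin is in Germany."],
        instruct="Retrieve passages that answer the question.",
        top_n=1,
        return_documents=False,
    )
    print(body)
```

The resulting dictionary can be sent as the JSON body of a POST request to the server's /v1/rerank endpoint with any HTTP client.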
Modifications
Accuracy Tests
Benchmarking and Profiling
Checklist
Review Process
/tag-run-ci-label,/rerun-failed-ci,/tag-and-rerun-ci) or contact authorized users to do so.