model: Add VIRTUE multimodal embedding models (Sony VIRTUE-2B/7B-SCaR) by fzowl · Pull Request #4822 · embeddings-benchmark/mteb

fzowl · 2026-06-17T16:06:02Z

Adds the Sony VIRTUE universal text-image embedders (VIRTUE-2B-SCaR and VIRTUE-7B-SCaR), built on Qwen2-VL. The wrapper applies last-token pooling over left-padded inputs, followed by L2 normalization, and supports text-only, image-only, and fused image+text inputs, matching the no-visual-prompt path of the reference implementation. A smoke evaluation on AILAStatutes ran successfully and produced finite scores. Fixes #4517.
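For readers unfamiliar with the pooling scheme named in the description, here is a minimal sketch of last-token pooling over left-padded inputs followed by L2 normalization. It is not the wrapper's code: it uses plain Python lists instead of tensors, and the function names `last_token_pool` and `l2_normalize` are hypothetical, chosen only for illustration.

```python
import math


def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state at the final position of each sequence.

    hidden_states: [batch][seq_len][dim] nested lists of floats.
    attention_mask: [batch][seq_len], 1 for real tokens and 0 for padding.
    With left padding, the last real token always sits at index -1.
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        if mask[-1] != 1:
            raise ValueError("expected left-padded input")
        pooled.append(list(states[-1]))
    return pooled


def l2_normalize(vectors, eps=1e-12):
    """Scale each vector to unit Euclidean length."""
    normalized = []
    for vec in vectors:
        norm = math.sqrt(sum(x * x for x in vec))
        normalized.append([x / max(norm, eps) for x in vec])
    return normalized


# Toy batch (illustrative values only): two sequences, hidden size 2.
hidden_states = [
    [[0.0, 0.0], [1.0, 2.0], [3.0, 4.0]],  # one pad position, then two tokens
    [[0.5, 0.5], [6.0, 8.0], [0.0, 5.0]],  # no padding
]
attention_mask = [[0, 1, 1], [1, 1, 1]]

embeddings = l2_normalize(last_token_pool(hidden_states, attention_mask))
```

Left padding is what makes the pooling step a plain `[-1]` index: with right padding, the position of the last real token would differ per sequence and would have to be derived from the attention mask.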

Samoed · 2026-06-17T16:31:31Z

Can you try running the ViDoRe v1 & v2 tasks to reproduce the scores?

fzowl · 2026-06-17T17:32:42Z

Sure, I'll run ViDoRe v1 & v2 with both checkpoints and post the scores here.

fzoll · 2026-06-17T18:30:19Z

Re: ViDoRe — I'll need to queue this on GPU and will post scores in a follow-up.

Re: CI — the test and 3.13 failures don't reproduce locally (both latest and lowest deps pass). Could you share the failure logs or re-trigger the runs?

Samoed · 2026-06-17T18:36:48Z

Re: CI — the test and 3.13 failures don't reproduce locally (both latest and lowest deps pass). Could you share the failure logs or re-trigger the runs?

This is fine. It's just a flaky test. I think you can also check it yourself.

fzoll

Done in 83d8c76 — moved show_progress_bar to an explicit function arg.
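The diff for that commit is not shown in this thread, so the following is only a sketch of the general pattern, assuming the flag was previously read out of `**kwargs`. The function names `encode_before` and `encode_after` are hypothetical and are not taken from `virtue_models.py`.

```python
def encode_before(inputs, **kwargs):
    # Hypothetical "before": the flag is fished out of **kwargs, so a
    # misspelled keyword would be silently ignored.
    show_progress_bar = kwargs.get("show_progress_bar", True)
    return {"n_inputs": len(inputs), "show_progress_bar": show_progress_bar}


def encode_after(inputs, *, show_progress_bar=True, **kwargs):
    # Hypothetical "after": the flag is a named keyword-only parameter,
    # visible in the signature and to type checkers.
    return {"n_inputs": len(inputs), "show_progress_bar": show_progress_bar}


before = encode_before(["a", "b"], show_progress_bar=False)
after = encode_after(["a", "b"], show_progress_bar=False)
```

Both versions behave the same for callers; the explicit form mainly documents the parameter and its default in the signature.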

fzowl · 2026-06-17T20:15:28Z

ViDoRe(v1&v2) results for both VIRTUE checkpoints:

| Task | VIRTUE-2B-SCaR (nDCG@5) | VIRTUE-7B-SCaR (nDCG@5) |
| --- | --- | --- |
| VidoreArxivQARetrieval | 0.0114 | 0.1818 |
| VidoreDocVQARetrieval | 0.0052 | 0.1095 |
| VidoreInfoVQARetrieval | 0.0163 | 0.5171 |
| VidoreTabfquadRetrieval | 0.0423 | 0.2589 |
| VidoreTatdqaRetrieval | 0.0157 | 0.0418 |
| VidoreShiftProjectRetrieval | 0.0000 | 0.1361 |
| VidoreSyntheticDocQAAIRetrieval | 0.0000 | 0.1568 |
| VidoreSyntheticDocQAEnergyRetrieval | 0.0163 | 0.2577 |
| VidoreSyntheticDocQAGovernmentReportsRetrieval | 0.0356 | 0.1854 |
| VidoreSyntheticDocQAHealthcareIndustryRetrieval | 0.0113 | 0.2330 |
| Vidore2ESGReportsRetrieval | 0.0371 | 0.1762 |
| Vidore2ESGReportsHLRetrieval | 0.0090 | 0.1390 |
| Vidore2EconomicsReportsRetrieval | 0.0000 | 0.1641 |
| Vidore2BioMedicalLecturesRetrieval | 0.0035 | 0.0558 |
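As a reading aid for the scores above, here is a minimal sketch of nDCG@k using the common definition (gain divided by log2 of rank + 1, normalized by the ideal ordering). It is not the evaluator used by mteb, whose implementation may differ in details such as tie handling; the helper names `dcg_at_k` and `ndcg_at_k` are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k ranked results."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, all_relevances, k=5):
    """nDCG@k for one query.

    ranked_relevances: relevance of each retrieved document, in rank order.
    all_relevances: relevance labels of every relevant document for the query,
    used to build the ideal ranking.
    """
    ideal_dcg = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Toy query with a single relevant document (illustrative values only).
hit_at_rank_1 = ndcg_at_k([1, 0, 0, 0, 0], [1])
hit_at_rank_3 = ndcg_at_k([0, 0, 1, 0, 0], [1])
miss_in_top_5 = ndcg_at_k([0, 0, 0, 0, 0], [1])
```

With one relevant document per query, a score of 0.0000 on a task means that document never appeared in the top five for any query, and a hit at rank 3 contributes 0.5 for that query.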

Samoed · 2026-06-17T20:40:45Z

Hm, it seems they don't report results on ViDoRe, even though they evaluated on MMEB, which has ViDoRe v1 & v2 among its subtasks. I think we can merge, then.

Add VIRTUE multimodal embedding models (Sony VIRTUE-2B/7B-SCaR)

62ccb65

Samoed reviewed Jun 17, 2026

Comment thread mteb/models/model_implementations/virtue_models.py Outdated

address review feedback

83d8c76

fzoll reviewed Jun 17, 2026

Samoed added the "new model" label (Questions related to adding a new model to the benchmark) Jun 17, 2026

Samoed approved these changes Jun 17, 2026

Samoed merged commit 41f6c7e into embeddings-benchmark:main Jun 17, 2026
13 checks passed

Samoed merged 2 commits into embeddings-benchmark:main from fzowl:fix/issue-4517

3 participants
