Add ViDoRe V3 results: Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 by Verm1lion · Pull Request #565 · embeddings-benchmark/results

Verm1lion · 2026-06-11T13:58:32Z

Self-reported ViDoRe V3 results (8 public retrieval tasks) for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 — mean NDCG@10 = 0.5584 (full corpus, all queries, MaxSim). ModelMeta PR: embeddings-benchmark/mteb#4796.

Files: results/Verm1ion__ColTurk-VDR-Qwen3VL-4B-v1.0/d56c7bbc278ba2fe4ac1c255fb0e55dd46b40bad/ — 8 task JSONs (live dataset_revisions, mteb_version 2.15.3) + model_meta.json.
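For readers unfamiliar with the MaxSim scoring mentioned above, the following is a minimal pure-Python sketch of late-interaction scoring: each query token embedding is matched to its best-scoring document token embedding, and those maxima are summed. The function names (`maxsim_score`, `rank_documents`) and the toy 2-dimensional embeddings are illustrative assumptions, not code from the eval harness or from mteb.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """Sum, over query tokens, of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank_documents(query_tokens: Sequence[Vector],
                   corpus: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return document ids sorted by descending MaxSim score."""
    scores = {doc_id: maxsim_score(query_tokens, toks) for doc_id, toks in corpus.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (hypothetical): two query token vectors, two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
corpus = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],  # one matching token per query token
    "page_b": [[0.5, 0.5]],              # a single, less specific token
}

print(rank_documents(query, corpus))
```

A real multi-vector retriever would batch this on GPU tensors; the sketch only shows the scoring rule itself.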

Checklist

My model has a model sheet, report, or similar — https://huggingface.co/Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0
My model has a reference implementation in mteb/models/model_implementations/, this can be as an API. Instruction on how to add a model can be found here
- No, but there is an existing PR: "model: add Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0" (embeddings-benchmark/mteb#4796)
The results submitted are obtained using the reference implementation
- Note: results were produced with the published, self-contained eval harness (identical model-loading path + MaxSim scoring over the full corpora, all queries; seeded bootstrap 95% CIs): https://github.com/Verm1lion/ColTurk-VDR/blob/main/scripts/eval/eval_colturk_checkpoint.py — raw JSONs in that repo. Happy to re-run via mteb.evaluate and update if preferred.
My model is available, either as a publicly accessible API or publicly on e.g., Huggingface
I solemnly swear that for all results submitted I have not trained on the evaluation dataset including training splits. If I have, I have disclosed it clearly.
- Training data (ColPali train set) vs. ViDoRe V3 contamination was additionally checked empirically: perceptual-hash scan over training images x all 8 V3 corpora found 0 exact duplicates (report: https://github.com/Verm1lion/ColTurk-VDR/blob/main/STAGE1_VALIDITY_REPORT.md).
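As an illustration of the kind of perceptual-hash scan described in the last checklist item, here is a minimal sketch using an average hash (aHash) over grayscale pixel grids. The linked report does not state here which hash or threshold was used, so the choice of aHash, the 8x8 hash size, and the names `average_hash` and `find_exact_duplicates` are all assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]  # grayscale pixel grid, rows of intensities


def average_hash(pixels: Image, hash_size: int = 8) -> int:
    """Average hash: block-average to hash_size x hash_size, threshold at the mean.

    Assumes image height and width are multiples of hash_size.
    """
    block_h = len(pixels) // hash_size
    block_w = len(pixels[0]) // hash_size
    cells = []
    for r in range(hash_size):
        for c in range(hash_size):
            block = [pixels[y][x]
                     for y in range(r * block_h, (r + 1) * block_h)
                     for x in range(c * block_w, (c + 1) * block_w)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def find_exact_duplicates(train: Dict[str, Image],
                          evaluation: Dict[str, Image]) -> List[Tuple[str, str]]:
    """Return (train_id, eval_id) pairs whose hashes are identical."""
    index: Dict[int, List[str]] = {}
    for name, pixels in evaluation.items():
        index.setdefault(average_hash(pixels), []).append(name)
    return [(t, e) for t, pixels in train.items()
            for e in index.get(average_hash(pixels), [])]


# Toy 8x8 images (hypothetical): a gradient, a brightened copy, and its transpose.
gradient = [[x + 8 * y for x in range(8)] for y in range(8)]
brightened = [[v + 10 for v in row] for row in gradient]
transposed = [[y + 8 * x for x in range(8)] for y in range(8)]

train_images = {"train_page": gradient}
eval_images = {"eval_page_same": brightened, "eval_page_other": transposed}
print(find_exact_duplicates(train_images, eval_images))
```

Matching on identical hashes only catches exact and near-exact copies; a stricter scan would also flag pairs within a small Hamming distance.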

Samoed · 2026-06-12T10:55:56Z

The format of the results is not the mteb format. How did you run your model?

Verm1lion · 2026-06-12

These were reformatted from my own eval harness output; the mteb implementation (embeddings-benchmark/mteb#4796) wasn't merged yet when I opened this. It's merged now, so I'm re-running the tasks with mteb directly and will update the files here.

github-actions · 2026-06-13T17:58:12Z

Model Results Comparison

Reference models: intfloat/multilingual-e5-large, google/gemini-embedding-001
New models evaluated: Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0
Tasks: Vidore3ComputerScienceRetrieval, Vidore3EnergyRetrieval, Vidore3FinanceEnRetrieval, Vidore3FinanceFrRetrieval, Vidore3HrRetrieval, Vidore3IndustrialRetrieval, Vidore3PharmaceuticalsRetrieval, Vidore3PhysicsRetrieval

Results for `Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0`

| task_name | is_public | Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|
| Vidore3ComputerScienceRetrieval | 1.0000 | 0.7311 | 0.8092 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3EnergyRetrieval | 1.0000 | 0.6237 | 0.6982 | nvidia/nemotron-colembed-vl-8b-v2 | False |
| Vidore3FinanceEnRetrieval | 1.0000 | 0.5863 | 0.6849 | webAI-Official/webAI-ColVec1-4b | False |
| Vidore3FinanceFrRetrieval | 1.0000 | 0.4451 | 0.5372 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3HrRetrieval | 1.0000 | 0.5483 | 0.7004 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3IndustrialRetrieval | 1.0000 | 0.4605 | 0.5718 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3PharmaceuticalsRetrieval | 1.0000 | 0.6152 | 0.6732 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3PhysicsRetrieval | 1.0000 | 0.4552 | 0.5084 | nvidia/nemotron-colembed-vl-8b-v2 | False |
| Average | nan | 0.5582 | 0.6479 | nan | - |
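The Average row can be checked by hand from the per-task values in the table. The short sketch below recomputes both means as a plain unweighted average over the 8 tasks, which is an assumption about how the bot aggregates; the numbers themselves are copied from the table.

```python
# Per-task main scores copied from the comparison table above.
model_scores = {
    "Vidore3ComputerScienceRetrieval": 0.7311,
    "Vidore3EnergyRetrieval": 0.6237,
    "Vidore3FinanceEnRetrieval": 0.5863,
    "Vidore3FinanceFrRetrieval": 0.4451,
    "Vidore3HrRetrieval": 0.5483,
    "Vidore3IndustrialRetrieval": 0.4605,
    "Vidore3PharmaceuticalsRetrieval": 0.6152,
    "Vidore3PhysicsRetrieval": 0.4552,
}
max_results = {
    "Vidore3ComputerScienceRetrieval": 0.8092,
    "Vidore3EnergyRetrieval": 0.6982,
    "Vidore3FinanceEnRetrieval": 0.6849,
    "Vidore3FinanceFrRetrieval": 0.5372,
    "Vidore3HrRetrieval": 0.7004,
    "Vidore3IndustrialRetrieval": 0.5718,
    "Vidore3PharmaceuticalsRetrieval": 0.6732,
    "Vidore3PhysicsRetrieval": 0.5084,
}

# Unweighted mean across tasks (assumed aggregation).
mean_model = sum(model_scores.values()) / len(model_scores)
mean_max = sum(max_results.values()) / len(max_results)

print(f"model average:      {mean_model:.4f}")
print(f"max-result average: {mean_max:.4f}")
```

Rounded to four decimals, both means agree with the Average row (0.5582 and 0.6479).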

Training datasets: JinaVDRArxivQARetrieval, JinaVDRDocQAAI, JinaVDRDocQAEnergyRetrieval, JinaVDRDocQAGovReportRetrieval, JinaVDRDocQAHealthcareIndustryRetrieval, JinaVDRDocVQARetrieval, JinaVDRInfovqaRetrieval, JinaVDRTatQARetrieval, VidoreArxivQARetrieval, VidoreDocVQARetrieval, VidoreInfoVQARetrieval, VidoreSyntheticDocQAAIRetrieval, VidoreSyntheticDocQAEnergyRetrieval, VidoreSyntheticDocQAGovernmentReportsRetrieval, VidoreSyntheticDocQAHealthcareIndustryRetrieval, VidoreTatdqaRetrieval

Verm1lion · 2026-06-13T18:03:38Z

Updated to the native mteb output — the earlier files were reformatted from my own harness, which was the mismatch you spotted. The 8 ViDoRe V3 task JSONs are now generated by mteb.evaluate (mteb 2.15.4) with the registered ColQwen3EngineWrapper, one row per language subset. All checks are green — could you take another look? Thanks @Samoed!

Verm1lion added 9 commits June 11, 2026 16:58

- 1bf5cae: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3ComputerScienceRetrieval.json
- 5e87c36: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3EnergyRetrieval.json
- 2f15514: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3FinanceEnRetrieval.json
- dd5cbc5: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3FinanceFrRetrieval.json
- 3dd9e5a: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3HrRetrieval.json
- 50c6542: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3IndustrialRetrieval.json
- 45b1fb3: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3PharmaceuticalsRetrieval.json
- a19ed20: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: Vidore3PhysicsRetrieval.json
- b543f27: Add ViDoRe V3 results for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0: model_meta.json

This was referenced Jun 11, 2026

ViDoRe V3 private-split evaluation request: ColTurk-VDR-Qwen3VL-4B-v1.0 embeddings-benchmark/mteb#4797

Closed

model: add Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 embeddings-benchmark/mteb#4796

Merged

model_meta: set n_embedding_parameters (sync with mteb#4796 review)

1051218

KennethEnevoldsen added the "waiting for review of implementation" label (This PR is waiting for an implementation review before merging the results.) Jun 11, 2026

Samoed requested changes Jun 12, 2026

Replace self-reported JSONs with native mteb 2.15.4 output (8 ViDoRe V3 public tasks)

fb66f7b

Samoed approved these changes Jun 13, 2026

Samoed merged commit 6777baf into embeddings-benchmark:main Jun 13, 2026
3 checks passed

Samoed merged 11 commits into embeddings-benchmark:main from Verm1lion:add-colturk-vdr-results
