model: add Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 by Verm1lion · Pull Request #4796 · embeddings-benchmark/mteb

Verm1lion · 2026-06-11T13:57:33Z

Adds ColTurk-VDR-Qwen3VL-4B-v1.0 — a ColBERT-style late-interaction visual document retriever built on Qwen/Qwen3-VL-4B-Instruct (LoRA r=32, published as merged full weights; colpali-engine ColQwen3 architecture, Apache-2.0, open weights, public training code and data).
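For readers unfamiliar with ColBERT-style late interaction, the scoring rule can be sketched as follows. This is a minimal pure-Python illustration of MaxSim on toy 2-dimensional token embeddings, not the model's or colpali-engine's actual implementation; the function names `maxsim_score` and `rank_documents` and all vectors are made up for the example.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """For each query token take the best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rank_documents(query_tokens: Sequence[Vector], corpus: Sequence[Sequence[Vector]]) -> List[int]:
    """Return corpus indices sorted by descending MaxSim score."""
    scores = [maxsim_score(query_tokens, doc) for doc in corpus]
    return sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)


# Toy data (hypothetical): one query with two token vectors, two "pages".
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[0.5, 0.5], [0.6, 0.0]]

ranking = rank_documents(query, [page_a, page_b])
print(ranking)
```

Each query token is matched independently, so a page scores well only if it has some patch or token embedding close to every part of the query.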

Since the model is a colpali-engine-native checkpoint (not a trust_remote_code repo), this PR also adds a small ColQwen3EngineWrapper that mirrors the existing ColQwen2_5Wrapper pattern (delegates to colpali_engine.models.ColQwen3 / ColQwen3Processor via ColPaliEngineWrapper).

ViDoRe V3 results (8 public retrieval tasks, full corpus, all queries, MaxSim) — mean NDCG@10 = 0.5584:

Task	NDCG@10
Vidore3ComputerScienceRetrieval	0.7306
Vidore3EnergyRetrieval	0.6238
Vidore3PharmaceuticalsRetrieval	0.6156
Vidore3FinanceEnRetrieval	0.5851
Vidore3HrRetrieval	0.5463
Vidore3IndustrialRetrieval	0.4624
Vidore3PhysicsRetrieval	0.4564
Vidore3FinanceFrRetrieval	0.4467
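The reported mean can be checked directly from the table. The snippet below is only an arithmetic check on the eight values listed above and assumes the mean is an unweighted average over tasks.

```python
# NDCG@10 per task, copied from the table above.
ndcg_at_10 = {
    "Vidore3ComputerScienceRetrieval": 0.7306,
    "Vidore3EnergyRetrieval": 0.6238,
    "Vidore3PharmaceuticalsRetrieval": 0.6156,
    "Vidore3FinanceEnRetrieval": 0.5851,
    "Vidore3HrRetrieval": 0.5463,
    "Vidore3IndustrialRetrieval": 0.4624,
    "Vidore3PhysicsRetrieval": 0.4564,
    "Vidore3FinanceFrRetrieval": 0.4467,
}

# Unweighted mean across the eight tasks.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(f"{mean_ndcg:.4f}")  # 0.5584
```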

Eval code & raw result JSONs (with seeded bootstrap 95% CIs): https://github.com/Verm1lion/ColTurk-VDR — results PR: embeddings-benchmark/results#565 · private-split request: #4797.
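As a rough illustration of what a seeded bootstrap 95% CI over per-query scores could look like, here is a minimal percentile-bootstrap sketch. It is not taken from the linked harness: the function name `bootstrap_ci`, the resample count, the seed, and the per-query values are all assumptions made for the example.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_ci(
    values: Sequence[float],
    n_resamples: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap CI for the mean, reproducible via a fixed seed."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo_idx = int(n_resamples * alpha / 2)
    hi_idx = n_resamples - 1 - lo_idx
    return means[lo_idx], means[hi_idx]


# Toy per-query NDCG@10 values (hypothetical, not from the results JSONs).
per_query = [0.0, 0.31, 0.5, 0.63, 0.63, 0.75, 0.87, 1.0, 1.0, 0.43]

low, high = bootstrap_ci(per_query, seed=0)
print(f"mean={mean(per_query):.4f} CI=({low:.4f}, {high:.4f})")
```

Fixing the seed means the same result JSON yields the same interval on every run, which is what makes the reported CIs reproducible.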

I have filled out the ModelMeta object to the extent possible
I have ensured that my model can be loaded using
- mteb.get_model(model_name, revision) and
- mteb.get_model_meta(model_name, revision)
I have tested the implementation works on a representative set of tasks.
The model is public, i.e., is available either as an API or the weights are publicly available to download
I reproduced results from the original paper (if applicable) on at least one benchmark, and I am including the results in the PR description.
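The two loading calls named in the checklist can be exercised with a small helper like the one below. This is a sketch, not part of the PR: the helper name `load_for_check` is made up, and the revision is left as a required argument because the pinned revision is not shown in this thread. Calling it requires `mteb` and the model's dependencies to be installed, and it downloads the weights.

```python
MODEL_NAME = "Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0"


def load_for_check(revision: str):
    """Fetch the ModelMeta and the loaded model for a given pinned revision."""
    # Imported lazily so this module can be read without mteb installed.
    import mteb

    meta = mteb.get_model_meta(MODEL_NAME, revision=revision)
    model = mteb.get_model(MODEL_NAME, revision=revision)
    return meta, model


if __name__ == "__main__":
    import sys

    # Usage: python check_load.py <revision-hash>
    meta, _model = load_for_check(sys.argv[1])
    print(meta.name, meta.revision)
```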

Notes for reviewers:

The wrapper's load path (ColQwen3.from_pretrained + ColQwen3Processor.from_pretrained on the published repo) was smoke-verified end-to-end on the exact revision pinned in the ModelMeta.
The results above were produced with the published self-contained harness (identical loading + MaxSim over the full corpora, all queries). Happy to re-run anything via mteb.evaluate on request.
Training data (the ColPali train set) vs. ViDoRe V3 contamination was checked empirically (perceptual-hash scan over train images x V3 corpora: 0 exact duplicates; report in the linked repo).
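To make the contamination check concrete, the following sketch shows one simple form of perceptual-hash scan: an average hash per image and a Hamming-distance comparison across two sets. The thread does not say which hash the author used, so average hash is an assumption, as are the helper names and the tiny 2x2 "images" (real scans would first resize and grayscale each page).

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def average_hash(pixels: Sequence[Sequence[float]]) -> int:
    """Average hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [v for row in pixels for v in row]
    threshold = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > threshold else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_matches(
    train: Dict[str, int], corpus: Dict[str, int], max_distance: int = 0
) -> List[Tuple[str, str]]:
    """All (train, corpus) pairs whose hashes differ by at most max_distance bits."""
    return [
        (t, c)
        for t, c in product(train, corpus)
        if hamming(train[t], corpus[c]) <= max_distance
    ]


# Toy grayscale grids (hypothetical): page_0 is a near copy of train_0.
train_hashes = {"train_0": average_hash([[10, 200], [220, 30]])}
corpus_hashes = {
    "page_0": average_hash([[12, 198], [225, 28]]),
    "page_1": average_hash([[200, 10], [30, 220]]),
}

matches = find_matches(train_hashes, corpus_hashes, max_distance=0)
print(matches)
```

With `max_distance=0` only identical hashes are reported, which corresponds to the "exact duplicates" wording above; a larger threshold would also surface near duplicates.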

Samoed · 2026-06-11T17:24:37Z

+    release_date="2026-06-11",
+    modalities=["image", "text"],
+    n_parameters=4_505_515_136,
+    n_embedding_parameters=None,


Suggested change

-    n_embedding_parameters=None,
+    n_embedding_parameters=388_956_160,
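One way to see where the suggested value may come from: it equals vocabulary size times hidden size for a single token-embedding matrix. The two config values below are assumptions about the base model's text config (they are not stated in this thread), but their product matches the suggested number exactly.

```python
# Assumed text-config values for the 4B base model (not stated in the PR).
VOCAB_SIZE = 151_936
HIDDEN_SIZE = 2_560

# Size of one token-embedding matrix: one HIDDEN_SIZE vector per vocabulary entry.
n_embedding_parameters = VOCAB_SIZE * HIDDEN_SIZE

# n_parameters as given in the ModelMeta diff above.
n_parameters = 4_505_515_136
embedding_share = n_embedding_parameters / n_parameters

print(n_embedding_parameters)  # 388956160
print(f"{embedding_share:.1%} of all parameters")
```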

Verm1lion · 2026-06-11T18:39:17Z

Applied both suggestions, thanks for the quick review!

KennethEnevoldsen

I don't see any outstanding issues

Add ColTurk-VDR-Qwen3VL-4B-v1.0 (late-interaction visual document retriever)

7b73820

This was referenced Jun 11, 2026

Add ViDoRe V3 results: Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 embeddings-benchmark/results#565

Merged

ViDoRe V3 private-split evaluation request: ColTurk-VDR-Qwen3VL-4B-v1.0 #4797

Closed

Samoed reviewed Jun 11, 2026

Apply review suggestions: n_embedding_parameters + adapted_from

73d0dfe

KennethEnevoldsen approved these changes Jun 11, 2026

KennethEnevoldsen enabled auto-merge (squash) June 11, 2026 18:50

KennethEnevoldsen merged commit 7ea3685 into embeddings-benchmark:main Jun 11, 2026
13 checks passed

KennethEnevoldsen merged 2 commits into embeddings-benchmark:main from Verm1lion:Verm1lion-patch-1
