Add ViDoRe V3 results: Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0#565
…e3ComputerScienceRetrieval.json
…e3EnergyRetrieval.json
…e3FinanceEnRetrieval.json
…e3FinanceFrRetrieval.json
…e3HrRetrieval.json
…e3IndustrialRetrieval.json
…e3PharmaceuticalsRetrieval.json
…e3PhysicsRetrieval.json
The format of the results is not the mteb format. How did you run your model?
These were reformatted from my own eval harness output - the mteb implementation (embeddings-benchmark/mteb#4796) wasn't merged yet when I opened this. It's merged now, so I'm re-running the tasks with mteb directly and will update the files here.
Model Results Comparison
Reference models:
Results for
| task_name | is_public | Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 | Max result | Model with max result | In Training Data |
|---|---|---|---|---|---|
| Vidore3ComputerScienceRetrieval | 1.0000 | 0.7311 | 0.8092 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3EnergyRetrieval | 1.0000 | 0.6237 | 0.6982 | nvidia/nemotron-colembed-vl-8b-v2 | False |
| Vidore3FinanceEnRetrieval | 1.0000 | 0.5863 | 0.6849 | webAI-Official/webAI-ColVec1-4b | False |
| Vidore3FinanceFrRetrieval | 1.0000 | 0.4451 | 0.5372 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3HrRetrieval | 1.0000 | 0.5483 | 0.7004 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3IndustrialRetrieval | 1.0000 | 0.4605 | 0.5718 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3PharmaceuticalsRetrieval | 1.0000 | 0.6152 | 0.6732 | webAI-Official/webAI-ColVec1-9b | False |
| Vidore3PhysicsRetrieval | 1.0000 | 0.4552 | 0.5084 | nvidia/nemotron-colembed-vl-8b-v2 | False |
| Average | nan | 0.5582 | 0.6479 | nan | - |
Training datasets: JinaVDRArxivQARetrieval, JinaVDRDocQAAI, JinaVDRDocQAEnergyRetrieval, JinaVDRDocQAGovReportRetrieval, JinaVDRDocQAHealthcareIndustryRetrieval, JinaVDRDocVQARetrieval, JinaVDRInfovqaRetrieval, JinaVDRTatQARetrieval, VidoreArxivQARetrieval, VidoreDocVQARetrieval, VidoreInfoVQARetrieval, VidoreSyntheticDocQAAIRetrieval, VidoreSyntheticDocQAEnergyRetrieval, VidoreSyntheticDocQAGovernmentReportsRetrieval, VidoreSyntheticDocQAHealthcareIndustryRetrieval, VidoreTatdqaRetrieval
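The "Average" row can be checked directly from the per-task scores. The following is a minimal sketch, assuming the average is an unweighted mean over the eight tasks; the variable names are illustrative and not part of mteb.

```python
from statistics import mean

# Per-task scores for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0, copied from the table above.
scores = {
    "Vidore3ComputerScienceRetrieval": 0.7311,
    "Vidore3EnergyRetrieval": 0.6237,
    "Vidore3FinanceEnRetrieval": 0.5863,
    "Vidore3FinanceFrRetrieval": 0.4451,
    "Vidore3HrRetrieval": 0.5483,
    "Vidore3IndustrialRetrieval": 0.4605,
    "Vidore3PharmaceuticalsRetrieval": 0.6152,
    "Vidore3PhysicsRetrieval": 0.4552,
}

# Assumption: the bot reports a plain, unweighted mean across tasks.
average = mean(scores.values())
print(f"{average:.4f}")
```

Under that assumption the rounded mean of the listed scores matches the 0.5582 shown in the "Average" row.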
Updated to the native |
Self-reported ViDoRe V3 results (8 public retrieval tasks) for Verm1ion/ColTurk-VDR-Qwen3VL-4B-v1.0 — mean NDCG@10 = 0.5584 (full corpus, all queries, MaxSim). ModelMeta PR: embeddings-benchmark/mteb#4796.
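For readers unfamiliar with the MaxSim scoring mentioned above, here is a minimal sketch of the usual late-interaction definition: for each query token embedding, take the maximum dot product over all document token embeddings, then sum over query tokens. The function names and toy vectors are hypothetical and are not taken from the model's implementation or from mteb.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Sum over query tokens of the best-matching document token similarity."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy example with two query tokens and two document tokens (illustrative values only).
query = [[1.0, 0.0], [0.0, 1.0]]
document = [[1.0, 0.0], [0.5, 0.5]]
score = maxsim(query, document)
```

In a retrieval run, a score like this would be computed between each query and every page in the corpus, and pages ranked by it before NDCG@10 is calculated.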
Files:
results/Verm1ion__ColTurk-VDR-Qwen3VL-4B-v1.0/d56c7bbc278ba2fe4ac1c255fb0e55dd46b40bad/ - 8 task JSONs (live dataset_revisions, mteb_version 2.15.3) + model_meta.json.
Checklist
mteb/models/model_implementations/, this can be as an API. Instructions on how to add a model can be found here
mteb.evaluate and update if preferred.