mveb: fix and unify domain tags across all 50 source datasets by AdnanElAssadi56 · Pull Request #4738 · embeddings-benchmark/mteb

AdnanElAssadi56 · 2026-05-26T06:25:49Z


The MVEB+ video task set had inconsistent and partially wrong `domains` tags.

Issues fixed:

- MSR-VTT had no domain tags at all (empty list). Now tagged ["Web"].
- AVMeme-Exam was tagged with "Music" (it's internet memes, not music content). Now ["Entertainment", "Web"].
- AudioCaps_AV was tagged "Encyclopaedic" (it's audio captioning). Now ["AudioScene", "Web"].
- VGGSound was tagged just ["Web"] despite being audio-visual events. Now ["AudioScene", "Web"]. Same fix for VGGSound_AV_RETRIEVAL.
- AV-SpeakerBench was tagged ("Web") on the base task and ("Spoken") on the PC variant: same source data, inconsistent tags. Unified to ("Spoken").
- WorldSense_1min was over-tagged with Entertainment+Music in some files and just ["Web"] in others. Unified to ["AudioScene", "Scene", "Web"].
- Several datasets tagged "Spoken" without speech-driven content (DiDeMo, MSVD, ActivityNetCaptions, VATEX, panda-70m, TUNA-Bench). Removed the Spoken tag from those.
- AVE-Dataset clustering tasks tagged with ["Music", "Scene", "Spoken"] (clearly wrong). Now aligned with the rest of AVE-Dataset: ["AudioScene", "Web"].
- MELD was tagged just ["Entertainment"] across base and clustering variants; MELD is the Friends sitcom, so dialogue is central. Added "Spoken" -> ["Entertainment", "Spoken"].
- UCF101 missing "Sport" tag. UCF101 has substantial Sport content. Now ["Scene", "Sport", "Web"].
- Human-Animal-Cartoon missing "Entertainment" tag despite the cartoon domain. Now ["Entertainment", "Scene", "Web"].
- PerceptionTest missing "Scene" tag despite being a scene-perception benchmark. Now ["Scene", "Web"].
- Video-MME missing "Spoken" tag despite the narration-heavy content. Now ["Spoken", "Web"].
- HMDB51 missing "Web" tag (sourced largely from web video). Now ["Scene", "Web"].
- VideoCon, Vinoground (zachz/*) missing "Web" tag. Added.
- RAVDESS tag list kept at ["Spoken"] (speech-emotion primary).
- AVQA tag list extended with "AudioScene" (it's an audio-visual QA benchmark).
All 50 unique source datasets across 184 video tasks now have consistent, non-empty domain tags. Verified by re-importing every task: 184 tasks load cleanly. Tags use only the existing TaskDomain Literal vocabulary in task_metadata.py; no new domains added.
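The two properties claimed above (no empty tag lists, and identical tags for every task sharing a source dataset) could be checked with something like the sketch below. This is not the script used for the PR: the `(task name, source dataset, domains)` tuples and the task names are made up for illustration, and the real check would read these values from each task's metadata.

```python
from collections import defaultdict


def find_tag_problems(tasks):
    """Return (empty, inconsistent) for (task_name, source, domains) tuples.

    empty: names of tasks whose domain list is empty.
    inconsistent: source datasets whose tasks carry differing tag sets.
    """
    empty = []
    tag_sets = defaultdict(set)
    for name, source, domains in tasks:
        if not domains:
            empty.append(name)
        # Sort so that tag order does not count as a difference.
        tag_sets[source].add(tuple(sorted(domains)))
    inconsistent = {
        source: sorted(variants)
        for source, variants in tag_sets.items()
        if len(variants) > 1
    }
    return empty, inconsistent


# Pre-fix state as described in this PR; task names are hypothetical.
before = [
    ("MSRVTTRetrieval", "MSR-VTT", []),
    ("AVSpeakerBench", "AV-SpeakerBench", ["Web"]),
    ("AVSpeakerBenchPC", "AV-SpeakerBench", ["Spoken"]),
    ("MELDClassification", "MELD", ["Entertainment"]),
    ("MELDClustering", "MELD", ["Entertainment"]),
]
empty, inconsistent = find_tag_problems(before)
print(empty)         # tasks with no tags
print(inconsistent)  # sources whose tasks disagree
```

On this sample, MSR-VTT is reported as empty and AV-SpeakerBench as inconsistent, while MELD passes because both variants agree (even though its tags were later extended for content reasons, which a structural check like this cannot detect).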

KennethEnevoldsen

I didn't go through all of them, but I sampled a lot and they seem a lot better. I had a question about the difference between "AudioScene" and "Scene".

…datasets

Adds 5 video content domains to TaskDomain (Activity, Instructional, Egocentric, Nature, Animation) and re-tags datasets that were mislabeled or under-characterized, so the domain set actually reflects benchmark content:

- Action recognition (Kinetics-400/600/700, HMDB51, UCF101, SSv2, ActivityNet, VATEX, NExT-QA, Vinoground, VideoCon) -> Activity (was the catch-all "Scene", which means visual place/setting).
- Breakfast, YouCook2 -> Instructional (cooking / how-to).
- Diving48 -> Activity + Sport.
- EgoSchema -> Egocentric (was bare "Web").
- Human-Animal-Cartoon -> Activity + Animation + Nature.
- AVMeme-Exam -> + Social (internet memes).
- PerceptionTest -> drop misapplied "Scene".

Scene is now reserved for genuine visual-scene content (WorldSense). All 184 video tasks load; every domain validates against TaskDomain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
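As a rough illustration of what "every domain validates against TaskDomain" means, here is a minimal sketch of checking tags against a `Literal` vocabulary. The `TaskDomain` below is an illustrative subset assembled from the tags named in this PR, not the actual definition in task_metadata.py, and `unknown_domains` is a hypothetical helper.

```python
from typing import Literal, get_args

# Illustrative subset only (assumption): the real TaskDomain has more entries.
TaskDomain = Literal[
    "AudioScene", "Entertainment", "Scene", "Social", "Spoken", "Sport", "Web",
    # The five video content domains added in this commit:
    "Activity", "Instructional", "Egocentric", "Nature", "Animation",
]

# get_args extracts the allowed string values from the Literal type.
ALLOWED = frozenset(get_args(TaskDomain))


def unknown_domains(domains):
    """Return the tags that are not part of the TaskDomain vocabulary."""
    return sorted(set(domains) - ALLOWED)


print(unknown_domains(["Activity", "Sport"]))      # both are in the vocabulary
print(unknown_domains(["Egocentric", "Cooking"]))  # "Cooking" is not
```

Deriving the allowed set from the `Literal` itself keeps a runtime check of this kind in sync with the type annotation, so a newly added domain needs to be declared in only one place.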

KennethEnevoldsen · 2026-06-08T12:51:00Z

@AdnanElAssadi56 should we finalize this one?

KennethEnevoldsen approved these changes May 26, 2026


Comment thread mteb/tasks/classification/eng/ave_dataset_classification.py

Comment thread mteb/tasks/classification/eng/worldsense_classification.py

AdnanElAssadi56 requested a review from Samoed June 8, 2026 14:01

Samoed approved these changes Jun 8, 2026


Samoed merged commit 343df1a into main Jun 15, 2026
13 of 20 checks passed

Samoed deleted the mveb-fix-domain-tags branch June 15, 2026 07:10

Samoed merged 2 commits into main from mveb-fix-domain-tags
