
mveb: fix and unify domain tags across all 50 source datasets #4738

Merged
Samoed merged 2 commits into main from mveb-fix-domain-tags
Jun 15, 2026
Conversation

@AdnanElAssadi56

Contributor

The MVEB+ video task set had inconsistent and partially-wrong `domains`
tags. Issues fixed:

- MSR-VTT had no domain tags at all (empty list). Now tagged ["Web"].
- AVMeme-Exam was tagged with "Music" (it's internet memes, not music
  content). Now ["Entertainment", "Web"].
- AudioCaps_AV was tagged "Encyclopaedic" (it's audio captioning). Now
  ["AudioScene", "Web"].
- VGGSound was tagged just ["Web"] despite being audio-visual events.
  Now ["AudioScene", "Web"]. Same fix for VGGSound_AV_RETRIEVAL.
- AV-SpeakerBench was tagged ["Web"] on the base task and ["Spoken"]
  on the PC variant: same source data, inconsistent tags. Unified
  to ["Spoken"].
- WorldSense_1min was over-tagged with Entertainment+Music in some
  files and just ["Web"] in others. Unified to ["AudioScene", "Scene",
  "Web"].
- Several datasets tagged "Spoken" without speech-driven content
  (DiDeMo, MSVD, ActivityNetCaptions, VATEX, panda-70m, TUNA-Bench).
  Removed the Spoken tag from those.
- AVE-Dataset clustering tasks tagged with ["Music", "Scene", "Spoken"]
  (clearly wrong). Now aligned with the rest of AVE-Dataset:
  ["AudioScene", "Web"].
- MELD was tagged just ["Entertainment"] across base and clustering
  variants; MELD is the Friends sitcom, so dialogue is central.
  Added "Spoken" -> ["Entertainment", "Spoken"].
- UCF101 missing "Sport" tag. UCF101 has substantial Sport content.
  Now ["Scene", "Sport", "Web"].
- Human-Animal-Cartoon missing "Entertainment" tag despite the cartoon
  domain. Now ["Entertainment", "Scene", "Web"].
- PerceptionTest missing "Scene" tag despite being a scene-perception
  benchmark. Now ["Scene", "Web"].
- Video-MME missing "Spoken" tag despite the narration-heavy content.
  Now ["Spoken", "Web"].
- HMDB51 missing "Web" tag (sourced largely from web video). Now
  ["Scene", "Web"].
- VideoCon, Vinoground (zachz/*) missing "Web" tag. Added.
- RAVDESS tag list kept at ["Spoken"] (speech-emotion primary).
- AVQA tag list extended with "AudioScene" (it's an audio-visual QA
  benchmark).

All 50 unique source datasets across 184 video tasks now have
consistent, non-empty domain tags. Verified by re-importing every
task: 184 tasks load cleanly.
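The consistency check described above can be sketched roughly as follows. This is a minimal illustration, not the script used for this PR: the `TASKS` records and the `find_tag_problems` helper are hypothetical, and in the repository the tags would be read from each task's metadata instead.

```python
from collections import defaultdict

# Hypothetical (task name, source dataset, domains) records showing the
# pre-fix state of a few datasets mentioned above. Task names are invented.
TASKS = [
    ("AVSpeakerBench", "AV-SpeakerBench", ["Web"]),
    ("AVSpeakerBenchPC", "AV-SpeakerBench", ["Spoken"]),
    ("MSRVTT", "MSR-VTT", []),
    ("MELD", "MELD", ["Entertainment", "Spoken"]),
]


def find_tag_problems(tasks):
    """Return task names with no tags and datasets whose tasks disagree."""
    empty = [name for name, _, domains in tasks if not domains]

    # Collect every distinct tag set seen for each source dataset.
    by_dataset = defaultdict(set)
    for _, dataset, domains in tasks:
        by_dataset[dataset].add(tuple(sorted(domains)))

    inconsistent = sorted(
        dataset for dataset, variants in by_dataset.items() if len(variants) > 1
    )
    return empty, inconsistent


empty, inconsistent = find_tag_problems(TASKS)
print(empty)         # ['MSRVTT']
print(inconsistent)  # ['AV-SpeakerBench']
```

Comparing sorted tuples means two tasks with the same tags in a different order are not reported as inconsistent.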

Tags use only the existing TaskDomain Literal vocabulary in
task_metadata.py; no new domains added.
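One way to confirm that tags stay within a `Literal` vocabulary is to compare them against `typing.get_args`. The sketch below assumes a shortened, hypothetical stand-in for `TaskDomain` (the real definition in task_metadata.py has more entries), and `unknown_domains` is an illustrative helper, not part of the library.

```python
from typing import Literal, get_args

# Hypothetical, shortened stand-in for the TaskDomain Literal.
TaskDomain = Literal[
    "AudioScene", "Entertainment", "Music", "Scene", "Spoken", "Sport", "Web"
]
ALLOWED = set(get_args(TaskDomain))


def unknown_domains(domains):
    """Return the tags that are not part of the allowed vocabulary, sorted."""
    return sorted(set(domains) - ALLOWED)


print(unknown_domains(["AudioScene", "Web"]))  # []
print(unknown_domains(["Scene", "Memes"]))     # ['Memes']
```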

@KennethEnevoldsen KennethEnevoldsen left a comment

Contributor

Didn't go through all of them, but I sampled a lot and they seem a lot better. I had a question about the difference between audio scene and scene.

Comment thread mteb/tasks/classification/eng/ave_dataset_classification.py
Comment thread mteb/tasks/classification/eng/worldsense_classification.py
…datasets

Adds 5 video content domains to TaskDomain (Activity, Instructional,
Egocentric, Nature, Animation) and re-tags datasets that were mislabeled
or under-characterized, so the domain set actually reflects benchmark
content:

- Action recognition (Kinetics-400/600/700, HMDB51, UCF101, SSv2,
  ActivityNet, VATEX, NExT-QA, Vinoground, VideoCon) -> Activity
  (was the catch-all "Scene", which means visual place/setting).
- Breakfast, YouCook2 -> Instructional (cooking / how-to).
- Diving48 -> Activity + Sport.
- EgoSchema -> Egocentric (was bare "Web").
- Human-Animal-Cartoon -> Activity + Animation + Nature.
- AVMeme-Exam -> + Social (internet memes).
- PerceptionTest -> drop misapplied "Scene".

Scene is now reserved for genuine visual-scene content (WorldSense).
All 184 video tasks load; every domain validates against TaskDomain.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
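The three kinds of re-tagging in this commit (replace a tag, add one, drop one) can be illustrated with a small helper. This is only a sketch: `retag` is a hypothetical function, and the resulting lists are derived from the tags quoted earlier in this PR for illustration, so they may not match the merged files exactly.

```python
def retag(domains, replace=None, add=(), drop=()):
    """Return a new sorted, de-duplicated tag list after the requested changes."""
    replace = replace or {}
    # Drop unwanted tags first, then map the remaining ones through `replace`.
    updated = {replace.get(tag, tag) for tag in domains if tag not in drop}
    updated.update(add)
    return sorted(updated)


# Action recognition: the catch-all "Scene" becomes "Activity".
ucf101 = retag(["Scene", "Sport", "Web"], replace={"Scene": "Activity"})
# PerceptionTest: drop the misapplied "Scene".
perception_test = retag(["Scene", "Web"], drop={"Scene"})
# AVMeme-Exam: add "Social".
avmeme = retag(["Entertainment", "Web"], add={"Social"})

print(ucf101)           # ['Activity', 'Sport', 'Web']
print(perception_test)  # ['Web']
print(avmeme)           # ['Entertainment', 'Social', 'Web']
```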
@KennethEnevoldsen

Contributor

@AdnanElAssadi56 should we finalize this one?

@AdnanElAssadi56 AdnanElAssadi56 requested a review from Samoed June 8, 2026 14:01
@Samoed Samoed merged commit 343df1a into main Jun 15, 2026
13 of 20 checks passed
@Samoed Samoed deleted the mveb-fix-domain-tags branch June 15, 2026 07:10

3 participants