v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound

📗 Overview

To gauge and diagnose the capacity of multimodal large language models (MLLMs) for humor understanding, we introduce v-HUB, a novel video humor understanding benchmark. It comprises a curated collection of non-verbal short videos, reflecting real-world scenarios where humor can be appreciated purely through visual cues.

📐 Dataset Examples

🔮 Data Curation and Evaluation Pipeline

📍 Filtering

We transcribe the speech in each video with the Whisper model and retain only videos whose transcript is shorter than 10 characters.

python ./filter/extract_speech_text.py
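The length rule above can be sketched in Python as follows. This is a minimal illustration, not the code in extract_speech_text.py: the function names, the whitespace stripping, and the "base" model size are assumptions, and the transcription helper assumes the openai-whisper package is installed.

```python
def keep_video(transcript, max_chars=10):
    """Return True when the transcribed speech is shorter than max_chars.

    Stripping surrounding whitespace is an assumption of this sketch.
    """
    return len(transcript.strip()) < max_chars


def filter_videos(transcripts, max_chars=10):
    """Given {video_name: transcript}, return the names of videos to keep."""
    return [name for name, text in transcripts.items()
            if keep_video(text, max_chars)]


def transcribe(path, model_name="base"):
    """Transcribe one file with Whisper (requires the openai-whisper package).

    The model size "base" is a placeholder, not the setting used by v-HUB.
    """
    import whisper  # imported lazily so the filter logic runs without it
    model = whisper.load_model(model_name)
    return model.transcribe(path)["text"]


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    demo = {
        "a.mp4": "",
        "b.mp4": " Hi!",
        "c.mp4": "Hello everyone, welcome back",
    }
    print(filter_videos(demo))
```

Keeping the length check separate from the transcription call makes the threshold easy to adjust and to test without loading a model.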

📍 Annotation

Our annotation platform is Label Studio. Please refer to Annotation_Manual and Label Studio to set up the platform.

📍 Evaluation

Step 1: Get the Code and Data

git clone https://github.com/spatigen/vhub.git
cd vhub
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/datasets/Foreverskyou/v-HUB

Step 2: Configure and Run

Prepare Data: Unzip the all_data.zip file located in the dataset directory you just cloned. This will create an all_data folder.
Update Paths: Open the evaluation script you wish to use (e.g., ./scripts/Text_Only/example_QA.sh). Update the VIDEO_DIR, QUESTIONS_CSV and CAND_FILE variables to the absolute paths of your dataset files.
Run Evaluation: After updating the variables and installing the dependencies required by the model, run the script.

./scripts/Text_Only/example_QA.sh
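Before launching a script, it can help to confirm that the three variables point to absolute paths that exist. The check below is a hedged sketch and is not part of the repository. The helper name `missing_paths` is hypothetical, and the example values are placeholders to replace with your own paths.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the names of entries that are not absolute or do not exist."""
    problems = []
    for name, value in paths.items():
        p = Path(value)
        if not p.is_absolute() or not p.exists():
            problems.append(name)
    return problems


if __name__ == "__main__":
    # Placeholder values; substitute the paths you set in the script.
    config = {
        "VIDEO_DIR": "/path/to/all_data",
        "QUESTIONS_CSV": "/path/to/questions.csv",
        "CAND_FILE": "/path/to/candidates",
    }
    bad = missing_paths(config)
    if bad:
        print("Check these variables:", ", ".join(bad))
    else:
        print("All paths look valid.")
```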

Here we provide example scripts for the three tasks under the three settings: Text-Only, Video-Only, and Video+Audio.

You can specify different tasks, such as ['QA', 'explanation', 'matching'], and different models, for example ['Qwen2.5-Omni', 'Qwen2.5-VL', 'Gemini2.5-flash', 'GPT-4o', 'InterVL 3.5', 'Minicpm 2.6-o', 'video SALMONN 2'].
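To see the full set of runs implied by three tasks under three settings, the combinations can be enumerated as in the sketch below. It only lists (setting, task) pairs. It does not assume any script names beyond the example above, and `evaluation_grid` is a hypothetical helper rather than something provided by the repository.

```python
from itertools import product

# Setting and task names as listed in this README.
SETTINGS = ["Text-Only", "Video-Only", "Video+Audio"]
TASKS = ["QA", "explanation", "matching"]


def evaluation_grid(settings=SETTINGS, tasks=TASKS):
    """Return every (setting, task) pair, ordered by setting then task."""
    return list(product(settings, tasks))


if __name__ == "__main__":
    for setting, task in evaluation_grid():
        print(f"{setting}: {task}")
```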

📮 Contact

If you have any questions, please feel free to contact us:

shi_zpeng@sjtu.edu.cn

yannzhao.ed@gmail.com

📝 License

v-HUB is intended for academic research only. Commercial use in any form is prohibited.
It contains funny videos collected from two complementary domains, and the copyright of all videos belongs to their owners.
If any content in v-HUB infringes your rights, please email shi_zpeng@sjtu.edu.cn, and we will remove it immediately.
Without prior approval, you may not distribute, publish, copy, disseminate, or modify v-HUB in whole or in part.
You must strictly comply with the above restrictions.

Please send an email to shi_zpeng@sjtu.edu.cn.

✒️ Citation

If you find our work helpful for your research, please consider citing it.

@misc{shi2026vhubbenchmarkvideohumor,
      title={v-HUB: A Benchmark for Video Humor Understanding from Vision and Sound}, 
      author={Zhengpeng Shi and Yanpeng Zhao and Jianqun Zhou and Yuxuan Wang and Qinrong Cui and Wei Bi and Songchun Zhu and Bo Zhao and Zilong Zheng},
      year={2026},
      eprint={2509.25773},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.25773}, 
}
