Authors: Hyungjin Chung, Hyelin Nam, Jiyeon Kim, Hyojun Go, Byeongjun Park, Junho Kim, Joonseok Lee, Seongsu Ha, and Byung-Hoon Kim
This repo contains the official implementation of the paper "Video Parallel Scaling: Aggregating Diverse Frame Subsets for VideoLLMs".
You can set up the environment using
pip install -r requirements.txt
Note that we used CUDA 12.8 in all our experiments.
./examples contains a demo video, entire_003.mp4, from the EventHallusion benchmark. Our demo runs VPS on this example video. If you wish to try other videos from Video-MME or EventHallusion, prepare the data accordingly.
To run the demo, simply execute
./scripts/demo.sh
We offer three different model classes
model=
gemma-3-{}b-it
Qwen2.5-VL-{}B-Instruct
InternVL3-{}B
You may choose from the existing model sizes available on Hugging Face. You may also change the number of frames sampled from the raw video.
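Frame sampling for VideoLLMs is commonly done by picking evenly spaced indices across the video. A minimal sketch of such uniform sampling (the exact strategy used in demo.py may differ; the function name here is hypothetical):

```python
def uniform_frame_indices(total_frames: int, num_frames: int) -> list[int]:
    """Pick num_frames evenly spaced frame indices from a video.

    Uses the midpoint of each of the num_frames equal-length segments,
    a common default for VideoLLM frame sampling.
    """
    step = total_frames / num_frames
    return [int(step * i + step / 2) for i in range(num_frames)]


# Example: 4 frames from a 100-frame video -> [12, 37, 62, 87]
print(uniform_frame_indices(100, 4))
```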
The following command will run the baseline VideoLLM:
# Baseline
python demo.py \
--model_name ${model} \
--video_path ${video_path} \
--num_frames ${num_frames} \
--prompt "Summarize the video in one sentence."
whereas the following will run VPS with $J =$ num_parallel_inputs $- 1$:
# VPS
python demo.py \
--model_name ${model} \
--video_path ${video_path} \
--num_frames ${num_frames} \
--prompt "Summarize the video in one sentence." \
--use_vps \
--num_parallel_inputs 2
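The key idea of VPS is that each parallel input sees a different subset of frames, so the branches cover diverse parts of the video. A hypothetical sketch of how such interleaved, non-overlapping subsets could be constructed (not the repo's actual implementation; the function name and offset scheme are assumptions for illustration):

```python
def parallel_frame_subsets(
    total_frames: int, num_frames: int, num_parallel_inputs: int
) -> list[list[int]]:
    """Build one frame-index subset per parallel branch.

    The video is divided into num_frames * num_parallel_inputs slots;
    branch j takes every num_parallel_inputs-th slot starting at offset j,
    so the subsets are disjoint and jointly cover the whole video.
    """
    step = total_frames / (num_frames * num_parallel_inputs)
    return [
        [int(step * (i * num_parallel_inputs + j)) for i in range(num_frames)]
        for j in range(num_parallel_inputs)
    ]


# Example: 2 branches, 4 frames each, over a 120-frame video.
# Branch 0 sees [0, 30, 60, 90]; branch 1 sees [15, 45, 75, 105].
for subset in parallel_frame_subsets(120, 4, 2):
    print(subset)
```

With `--num_parallel_inputs 2`, the model would process two such subsets and aggregate their predictions into a single answer.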
If you find our work interesting, please consider citing:
@article{chung2025video,
title={Video Parallel Scaling: Aggregating Diverse Frame Subsets for VideoLLMs},
author={Chung, Hyungjin and Nam, Hyelin and Kim, Jiyeon and Go, Hyojun and Park, Byeongjun and Kim, Junho and Lee, Joonseok and Ha, Seongsu and Kim, Byung-Hoon},
year={2025},
journal={arXiv preprint arXiv:2509.08016},
}