OpenSubject is a large-scale, video-derived corpus for subject-driven image generation and manipulation.
- [2025-12] OpenSubject released with OSBench evaluation benchmark
- [2025-12] Dataset and model weights available on Hugging Face
Clone the repository and set up the environment:

```bash
git clone https://github.com/LAW1223/OpenSubject.git
cd OpenSubject
conda create -n opensubject python=3.11
conda activate opensubject
pip install -r requirements.txt
```

Optionally install flash-attn:

```bash
# Note: version 2.7.4.post1 is specified for compatibility with CUDA 12.4.
# Feel free to use a newer version if you are on CUDA 12.6 or the
# compatibility issue has been fixed upstream.
# OmniGen2 runs even without flash-attn, though we recommend installing it
# for best performance.
pip install flash-attn==2.7.4.post1 --no-build-isolation
```

Download the OpenSubject dataset from Hugging Face:
```bash
python scripts/hf_scripts/download_hf.py \
    --repo_id AIPeanutman/OpenSubject \
    --repo_type dataset \
    --local_dir /path/to/opensubject_dataset
```

After downloading, extract the image packages to restore the original directory structure:
```bash
python scripts/unzip_images/extract_images.py \
    --packages_dir /path/to/opensubject_dataset/Images_packages \
    --output_dir /path/to/opensubject_dataset/Images \
    --num_workers 32
```

This will extract all tar.gz files and restore the directory structure:
```
Images/
├── generation/
│   ├── input_images/
│   └── output_images/
└── manipulation/
    ├── input_images/
    └── output_images/
```
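After extraction, it can be worth sanity-checking that the layout above was restored before training or evaluation. A minimal sketch (the `verify_layout` helper and the hard-coded subdirectory list are our own illustration, not part of the repo's scripts):

```python
from pathlib import Path

# Expected subtrees under Images/, per the layout shown above.
EXPECTED = [
    "generation/input_images",
    "generation/output_images",
    "manipulation/input_images",
    "manipulation/output_images",
]

def verify_layout(images_root):
    """Return the list of expected subdirectories missing under images_root."""
    root = Path(images_root)
    return [sub for sub in EXPECTED if not (root / sub).is_dir()]

if __name__ == "__main__":
    missing = verify_layout("/path/to/opensubject_dataset/Images")
    if missing:
        print("Missing subdirectories:", missing)
    else:
        print("Layout OK")
```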
Download the OSBench evaluation benchmark:
```bash
python scripts/hf_scripts/download_hf.py \
    --repo_id AIPeanutman/OSBench \
    --repo_type dataset \
    --local_dir /path/to/osbench
```

Download the base OmniGen2 model weights:
```bash
python scripts/hf_scripts/download_hf.py \
    --repo_id OmniGen2/OmniGen2 \
    --repo_type model \
    --local_dir /path/to/omnigen2_model
```

Download the fine-tuned OpenSubject model weights:
```bash
python scripts/hf_scripts/download_hf.py \
    --repo_id AIPeanutman/OpenSubject \
    --repo_type model \
    --local_dir /path/to/opensubject_model
```

The CLI tool (scripts/inference_cli.py) allows you to generate images directly from the command line.
Generate an image from a text prompt:
```bash
python scripts/inference_cli.py \
    --model_path /path/to/omnigen2_model \
    --transformer_path /path/to/opensubject_model \
    --prompt "a beautiful landscape with mountains and lakes" \
    --output_path output.png \
    --num_inference_step 50 \
    --height 1024 \
    --width 1024
```

Generate an image with reference input images:
```bash
python scripts/inference_cli.py \
    --model_path /path/to/omnigen2_model \
    --transformer_path /path/to/opensubject_model \
    --prompt "transform the scene to sunset" \
    --input_images input1.jpg input2.jpg \
    --output_path result.png \
    --num_inference_step 50
```

Arguments:
- `--model_path`: Path to the model checkpoint (required)
- `--transformer_path`: Path to the transformer checkpoint (optional; uses `model_path/transformer` if not specified)
- `--prompt`: Text prompt for image generation (required)
- `--input_images`: Input image paths or a directory (optional, for image-to-image tasks)
- `--output_path`: Path to save the generated image(s) (default: `output.png`)
- `--num_inference_step`: Number of inference steps (default: 50)
- `--height` / `--width`: Output image dimensions (default: 1024x1024)
- `--text_guidance_scale`: Text guidance scale (default: 5.0)
- `--image_guidance_scale`: Image guidance scale (default: 2.0)
- `--num_images_per_prompt`: Number of images to generate (default: 1)
- `--seed`: Random seed for reproducibility (default: 0)
- `--scheduler`: Scheduler type, `euler` or `dpmsolver` (default: `euler`)
- `--dtype`: Data type, `fp32`, `fp16`, or `bf16` (default: `bf16`)
- `--disable_align_res`: Disable resolution alignment to the input images
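For batch jobs, the CLI can be driven from Python. Below is a small sketch that assembles one `inference_cli.py` invocation per prompt using only the flags listed above; the `build_cmd` helper and the in-script prompt list are our own illustration, not part of the repo:

```python
import shlex
import subprocess

def build_cmd(prompt, output_path,
              model_path="/path/to/omnigen2_model",
              transformer_path="/path/to/opensubject_model",
              seed=0):
    """Assemble the inference_cli.py command line for one prompt."""
    return [
        "python", "scripts/inference_cli.py",
        "--model_path", model_path,
        "--transformer_path", transformer_path,
        "--prompt", prompt,
        "--output_path", output_path,
        "--num_inference_step", "50",
        "--seed", str(seed),
    ]

if __name__ == "__main__":
    prompts = ["a beautiful landscape with mountains and lakes"]
    for i, p in enumerate(prompts):
        cmd = build_cmd(p, f"out_{i:03d}.png", seed=i)
        print(shlex.join(cmd))  # inspect the command before launching
        # subprocess.run(cmd, check=True)  # uncomment to actually generate
```

Varying `--seed` per prompt keeps runs reproducible while avoiding identical sampling noise across outputs.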
The evaluation pipeline consists of two steps:
- GPT-4.1-based scoring: uses GPT-4.1 to evaluate generated images
- Statistics Calculation: Computes final metrics
For convenience, we provide a complete inference and evaluation script at scripts/eval.sh. You can use it directly by modifying the model and data paths:
```bash
# Edit the following variables in scripts/eval.sh:
# - model_path: Path to base OmniGen2 model
# - transformer_path: Path to OpenSubject fine-tuned transformer
# - test_data: Path to OSBench dataset
# - output_dir: Directory to save results
# - openai_key: Your OpenAI API key for evaluation
bash scripts/eval.sh
```

This script will automatically run inference and evaluation. For more fine-grained control, follow the manual steps below.
Note: we fix the output resolution at 720 × 1280 to keep the settings consistent across different models.
You may generate results with OmniGen2 or other models; just make sure the output image directory structure and file format match the layout specified below.
```
results/
└── {method_name}/
    └── fullset/
        └── {task_type}/
            ├── key1.png
            ├── key2.png
            └── ...
```
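Before running the scorer, it can help to confirm that a method's outputs actually follow the layout above. A hedged sketch (the `missing_outputs` helper and the idea of comparing against a list of expected keys are our own, not part of osbench):

```python
from pathlib import Path

def missing_outputs(result_dir, method_name, task_type, expected_keys):
    """Return the expected keys that have no PNG under
    results/{method_name}/fullset/{task_type}/."""
    task_dir = Path(result_dir) / method_name / "fullset" / task_type
    return [k for k in expected_keys if not (task_dir / f"{k}.png").is_file()]

if __name__ == "__main__":
    # Hypothetical example: check two keys for a method named "OmniGen2".
    missing = missing_outputs("results", "OmniGen2", "generation", ["key1", "key2"])
    print("missing outputs:", missing)
```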
To generate results with OpenSubject, run the following script:
```bash
accelerate launch --num_processes=8 -m osbench.inference \
    --model_path /path/to/omnigen2_model \
    --transformer_path /path/to/opensubject_model \
    --test_data /path/to/osbench \
    --result_dir /path/to/results \
    --num_inference_step 50 \
    --height 720 \
    --width 1280 \
    --text_guidance_scale 5.0 \
    --image_guidance_scale 2.0 \
    --num_images_per_prompt 1 \
    --disable_align_res
```

- We use GPT-4.1 to evaluate the quality of the generated images. Please make sure to set up your API key before running the script:
```bash
openai_key="<Your-API-Key>"
python -m osbench.test_osbench_score \
    --test_data /path/to/osbench \
    --result_dir /path/to/results \
    --model_name "OmniGen2" \
    --openai_key ${openai_key} \
    --max_workers 16
```

- Next, calculate the final score:
```bash
python -m osbench.calculate_statistics \
    --save_path /path/to/results \
    --model_name "OmniGen2" \
    --backbone gpt4dot1
```

Part of the code is based on OmniGen2. Thanks for their great work!
If you find this work useful, please consider citing:
```bibtex
@article{liu2025opensubjectleveragingvideoderivedidentity,
  title={OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for Subject-driven Image Generation and Manipulation},
  author={Yexin Liu and Manyuan Zhang and Yueze Wang and Hongyu Li and Dian Zheng and Weiming Zhang and Changsheng Lu and Xunliang Cai and Yan Feng and Peng Pei and Harry Yang},
  year={2025},
  eprint={2512.08294},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2512.08294},
}
```