Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration

Calibri is a parameter-efficient approach that optimally calibrates Diffusion Transformer (DiT) components to elevate generative quality. By framing DiT calibration as a black-box reward optimization problem solved using the CMA-ES evolutionary algorithm, Calibri modifies just ~100 parameters. This lightweight calibration not only consistently improves generation quality across various models but also significantly reduces the required inference steps (NFE) while maintaining high-quality outputs.
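The black-box loop described above can be sketched in a few lines. The toy below substitutes the simpler cross-entropy method for CMA-ES and a synthetic quadratic reward for a real reward model (the function names, dimensions, and hyperparameters are illustrative, not Calibri's actual implementation): each iteration samples candidate scale vectors around the current mean, scores them, and refits the search distribution to the best candidates.

```python
import numpy as np

def toy_reward(scales):
    # Stand-in for a real reward model (e.g. HPSv3): highest when every
    # scale sits at 1.05, mimicking a small but useful calibration offset.
    return -float(np.sum((scales - 1.05) ** 2))

def calibrate(dim=100, iters=60, pop_size=32, n_elite=8, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.ones(dim)                # start from the identity calibration
    sigma = 0.1 * np.ones(dim)
    for _ in range(iters):
        pop = mean + sigma * rng.standard_normal((pop_size, dim))
        rewards = [toy_reward(p) for p in pop]
        elite = pop[np.argsort(rewards)[-n_elite:]]  # keep the best candidates
        mean = elite.mean(axis=0)                    # refit the distribution
        sigma = elite.std(axis=0) + 1e-6             # keep a sampling floor
    return mean

calibrated = calibrate()
print(toy_reward(np.ones(100)), "->", toy_reward(calibrated))
```

In the real system the reward evaluation is a full sampling run through the diffusion model plus a preference-model score, which is why a gradient-free optimizer over only ~100 parameters is attractive.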

📄 Changelog

2026-03-24
  • Official release of Calibri codebase! Code supports CMA-ES calibration for FLUX, Stable Diffusion 3.5, and Qwen-Image.

🤗 Supported Models & Rewards

Calibri optimizes text-to-image models by maximizing human-preference rewards. It currently supports the following DiT architectures and Reward Models:

| Task | Model | NFE with Calibri |
|------|-------|------------------|
| Text-to-Image | FLUX.1-dev | 15 |
| Text-to-Image | stable-diffusion-3.5-medium | 30 |
| Text-to-Image | stable-diffusion-3.5-large | 30 |
| Text-to-Image | Qwen-Image | 30 |

Supported Reward Models:

  • HPSv3: Human Preference Score v3.
  • Q-Align: MLLM-based visual quality scoring.
  • PickScore: CLIP-based aesthetic scoring model.
  • ImageReward: General human preference reward.

🚀 Quick start

Environment Set Up

The framework is built with uv, an extremely fast Python package and project manager; see the uv docs for installation instructions.

1. Clone the repository

git clone https://github.com/your-username/Calibri.git
cd Calibri

2. Setup environment and install dependencies

uv sync
source .venv/bin/activate

Reward Preparation

To train with the HPSv3 or Q-Align rewards, start the corresponding reward server before running the main training script.

HPSv3 server:

uv run src/metrics/hpsv3_server.py --device cuda:0

Q-Align server:

uv run src/metrics/qalign_server.py --device cuda:1

Start Training

You can start the calibration process with Accelerate. The algorithm uses the CMA-ES evolution strategy to find optimal scaling parameters.

accelerate launch --num_processes 2 scripts/train.py --config configs/calibri.py:cmaes_hpsv3_flux_layer

⚙️ Hyperparameters & Granularity

Calibri is designed to be highly flexible. You can easily customize the target DiT backbone, reward models, and optimization hyperparameters directly via configs/calibri.py.

A core feature of our framework is the ability to define the search space granularity. As described in our paper, Calibri supports three distinct levels of granularity for internal-layer calibration, allowing you to balance parameter efficiency and generation quality:

  • Block Scaling: Uniformly adjusts the outputs of Attention and MLP layers within the same block (~57 parameters).
  • Layer Scaling: Adjusts individual layers within a block using distinct coefficients (~76 parameters).
  • Gate Scaling: Specialized calibration for visual and textual tokens processed through distinct gates in MM-DiT architectures (~114 parameters).
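The three granularities differ only in how many distinct scalar coefficients multiply the sub-layer outputs before they rejoin the residual stream. The toy block below illustrates the idea for a two-stream (image/text) MM-DiT-style layer; the block structure, dictionary keys, and coefficient values are illustrative assumptions, not Calibri's actual code:

```python
import numpy as np

def calibrated_block(img, txt, attn, mlp, g):
    # One toy MM-DiT-style block. `g` holds the calibration scales; which
    # entries share a value determines the granularity (names illustrative).
    a_img, a_txt = attn(img, txt)           # placeholder joint attention
    img = img + g["attn_img"] * a_img       # scaled residual, image tokens
    txt = txt + g["attn_txt"] * a_txt       # scaled residual, text tokens
    img = img + g["mlp_img"] * mlp(img)
    txt = txt + g["mlp_txt"] * mlp(txt)
    return img, txt

attn = lambda i, t: (0.5 * i, 0.5 * t)      # stand-in sub-layers
mlp = lambda x: 0.25 * x

# Block scaling: one coefficient shared by the whole block.
block = dict.fromkeys(["attn_img", "attn_txt", "mlp_img", "mlp_txt"], 1.03)
# Layer scaling: distinct coefficients for attention vs. MLP.
layer = {"attn_img": 1.05, "attn_txt": 1.05, "mlp_img": 0.97, "mlp_txt": 0.97}
# Gate scaling: distinct coefficients for image vs. text tokens, per layer.
gate = {"attn_img": 1.05, "attn_txt": 1.01, "mlp_img": 0.97, "mlp_txt": 0.99}

img, txt = np.ones((4, 8)), np.ones((2, 8))
for g in (block, layer, gate):
    out_img, out_txt = calibrated_block(img, txt, attn, mlp, g)
    print(out_img.shape, out_txt.shape)
```

Coarser granularities share more coefficients and so search a smaller space; finer ones (gate scaling) can treat the visual and textual streams differently at the cost of roughly doubling the parameter count.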

📈 Monitoring

Track your calibration progress, reward metrics, and generated image samples in real-time with tensorboard:

tensorboard --logdir=<exp_logdir>

🤗 Acknowledgements

This repository is based on diffusers, accelerate, and flow_grpo. We thank them for their contributions to the community!

⭐Citation

If you find Calibri useful for your research or projects, please cite the following paper:

@article{tokhchukov2026calibri,
  title={Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration}, 
  author={Tokhchukov, Danil and Mirzoeva, Aysel and Kuznetsov, Andrey and Sobolev, Konstantin},
  journal={arXiv preprint arXiv:2603.24800},
  year={2026},
}

About

[CVPR 2026] An official implementation of Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration
