🌟 [CVPR 2025] Efficient Personalization of Quantized Diffusion Model without Backpropagation

📑 Introduction

Efficient Personalization of Quantized Diffusion Model without Backpropagation

Hoigi Seo*, Wongi Jeong*, Kyungryeol Lee, Se Young Chun (*co-first)

📚arXiv, Project Page

This paper presents a novel approach that enables personalization of a quantized text-to-image diffusion model under minimal memory constraints and without backpropagation. Leveraging zeroth-order (ZO) optimization, the proposed method performs personalization using only 2.37 GB of VRAM on Stable Diffusion v1.5.
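The zeroth-order estimator at the heart of this approach can be sketched as follows. This is a minimal toy illustration of ZO gradient estimation via random two-point finite differences, not the repository's implementation; the function and parameter names are placeholders:

```python
import numpy as np

def zo_gradient(loss_fn, x, n=2, eps=1e-3, rng=None):
    """Average n two-point random finite-difference probes to estimate
    the gradient of loss_fn at x -- no backpropagation involved."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x)
    for _ in range(n):
        u = rng.standard_normal(x.shape)           # random probe direction
        delta = loss_fn(x + eps * u) - loss_fn(x - eps * u)
        grad += (delta / (2.0 * eps)) * u          # directional slope times direction
    return grad / n

# Toy usage: minimize ||x||^2 with ZO gradient descent, no autograd needed
rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])
for _ in range(200):
    x -= 0.05 * zo_gradient(lambda v: float(v @ v), x, n=4, rng=rng)
```

Because only forward evaluations of the loss are needed, no activations or gradients have to be stored, which is what keeps the memory footprint small.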

🚀 Usage

  1. Environment Setup

    Create and activate the Conda virtual environment:

    conda env create -f environment.yaml
    conda activate zoodip

    Alternatively, install dependencies via pip:

    pip install -r requirements.txt

    Additionally, download the DreamBooth dataset from here and place it in ./dataset.

  2. Folder Tree

ZOODiP
  ├── dataset
  │     ├── dreambooth dataset
  │     └── or custom dataset
  ├── results
  │     └── learned_embeds.safetensors
  ├── requirements.txt
  ├── environment.yaml
  ├── cc.json
  ├── train_zoodip.sh
  ├── train_zoodip.py
  └── inference.ipynb
  3. Configure Parameters

    The implementation is primarily based on the textual inversion code from Diffusers, with the following additional parameters:

    • n: Number of gradient estimations per step for ZO optimization.
    • tau: Size of the buffer of past updates (see Algorithm 1).
    • nu: Threshold controlling the fraction of variance retained (see Algorithm 1).
    • use_cc: Whether to use comprehensive captioning.
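As a rough illustration of how tau and nu interact, the sketch below extracts a principal subspace from a buffer of tau past updates, keeping just enough components to explain a nu fraction of their variance. This is only a hypothetical aid for reading Algorithm 1; the exact projection rule is the one defined in the paper, and all names here are placeholders:

```python
import numpy as np

def principal_subspace(history, nu=0.95):
    """From a (tau x d) buffer of past parameter updates, return an
    orthonormal basis capturing at least a `nu` fraction of the variance.
    Illustrative only -- see Algorithm 1 in the paper for the exact rule."""
    H = np.asarray(history) - np.mean(history, axis=0)
    # SVD of the centered buffer; squared singular values = explained variance
    _, s, vt = np.linalg.svd(H, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(ratio, nu)) + 1   # smallest k reaching nu
    return vt[:k]                             # (k, d) orthonormal basis rows

# Toy buffer: tau updates mostly along one direction plus small noise
tau, d = 8, 16
rng = np.random.default_rng(0)
direction = rng.standard_normal(d)
buffer = [c * direction + 0.01 * rng.standard_normal(d)
          for c in rng.standard_normal(tau)]
basis = principal_subspace(buffer, nu=0.9)
```

A larger tau gives a more stable estimate of the update subspace, while nu trades off how aggressively low-variance directions are discarded.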
  4. Run the Example

    Execute the main script train_zoodip.sh:

    sh train_zoodip.sh

    The learned embeddings will be saved in the ./results/ directory.

📸 Example Outputs

If the setup is configured correctly and training completes successfully, you can obtain images similar to those shown in ./inference.ipynb.

🙏 Acknowledgments

This code is based on the textual inversion implementation provided by Diffusers.
