Project page | Arxiv | Demo
Install dependencies using:
python -m pip install -r requirements.txt
- 10.08.25: Code release for compressed posterior sampling, compressed blind face image restoration & compressed text-based image editing, along with the extra experiments shown in the paper.
- 27.07.25: Initial release with code for latent compression.
Our code supports compressing images of size
Run compression / decompression / roundtrip:
python latent_compression.py compress|decompress|roundtrip [OPTIONS]
You should specify the following arguments:
- --model_id: HuggingFace model ID. Choose between stabilityai/stable-diffusion-2-1 (for images of size $768^2$) and stabilityai/stable-diffusion-2-1-base or CompVis/stable-diffusion-v1-4 (for images of size $512^2$).
- --timesteps: Number of denoising steps ($T$ in our paper).
- --num_noises: Size of each codebook ($K$ in our paper).
- --input_dir: Input directory (images to compress or binary files to decompress).
See the --help flag for more options and details.
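To get a feel for how --timesteps and --num_noises affect the file size, the following is a rough back-of-the-envelope sketch, not the repository's own accounting. It assumes that each denoising step stores one codebook index of $\log_2 K$ bits, so the payload is about $T \log_2 K$ bits; the actual bitstream may differ slightly (for example, steps that add no noise, or any header). The function names and the `MODEL_RESOLUTION` table are hypothetical helpers; the resolutions are taken from the --model_id description above.

```python
import math

# Native image side length per model, as listed for --model_id above.
MODEL_RESOLUTION = {
    "stabilityai/stable-diffusion-2-1": 768,
    "stabilityai/stable-diffusion-2-1-base": 512,
    "CompVis/stable-diffusion-v1-4": 512,
}


def estimate_bits(timesteps: int, num_noises: int) -> float:
    """ASSUMPTION: one codebook index of log2(K) bits per denoising step."""
    return timesteps * math.log2(num_noises)


def estimate_bpp(model_id: str, timesteps: int, num_noises: int) -> float:
    """Approximate bits per pixel for a square image at the model's resolution."""
    side = MODEL_RESOLUTION[model_id]
    return estimate_bits(timesteps, num_noises) / (side * side)


if __name__ == "__main__":
    bits = estimate_bits(1000, 256)
    bpp = estimate_bpp("stabilityai/stable-diffusion-2-1-base", 1000, 256)
    print(f"{bits:.0f} bits, {bpp:.4f} bpp")  # prints "8000 bits, 0.0305 bpp"
```

Under this assumption, doubling $K$ adds only one bit per step, while doubling $T$ doubles the payload.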
The generated binary file name includes the compression metadata (e.g.,
Compression example:
python latent_compression.py compress \
--float16 \
--input_dir ./test_imgs \
--output_dir ./outputs \
--model_id "stabilityai/stable-diffusion-2-1-base" \
--num_noises 256 \
--timesteps 1000
Decompression example:
python latent_compression.py decompress \
--float16 \
--input_dir ./compressed_binary_files/ \
--output_dir ./output_imgs_decompressed
This module can compress & perform posterior sampling for linear inverse problems (simultaneously), such as image super-resolution, colorization, and Gaussian blur. Internally, we use a pre-trained ImageNet diffusion model (see guided-diffusion).
Run restoration & compression / decompression:
python compressed_posterior_sampling.py restore|decompress [OPTIONS]
You should specify the following arguments:
- --input_path: Path to an image or a binary .bin file.
- --output_dir: Directory to save the output images or binary files.
- --task_config: Configuration file for the task (e.g., colorization.yaml, gaussian_blur.yaml, super_resolution.yaml).
- --timesteps: Number of denoising steps ($T$ in our paper).
- --num_noises: Size of each codebook ($K$ in our paper).
- --eta: Denoising parameter ($\eta$ in our paper).
See the --help flag for more options and details.
The generated binary file name includes the compression metadata (e.g.,
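Since the metadata lives in the file name, it can be recovered without opening the file. The sketch below assumes the `_T=<int>_K=<int>` pattern seen in the decompression example of this section (`image_T=1000_K=256.bin`); other modules or options may encode additional fields, so treat the pattern and the `parse_metadata` helper as hypothetical.

```python
import re
from pathlib import Path

# ASSUMPTION: the file stem contains "_T=<timesteps>_K=<num_noises>",
# as in "image_T=1000_K=256.bin". Other naming schemes are not handled.
_METADATA_PATTERN = re.compile(r"_T=(?P<T>\d+)_K=(?P<K>\d+)")


def parse_metadata(path: str) -> dict:
    """Extract timesteps (T) and codebook size (K) from a binary file name."""
    match = _METADATA_PATTERN.search(Path(path).stem)
    if match is None:
        raise ValueError(f"no T/K metadata found in {path!r}")
    return {"timesteps": int(match["T"]), "num_noises": int(match["K"])}


if __name__ == "__main__":
    print(parse_metadata("./compressed_binary_files/image_T=1000_K=256.bin"))
    # prints {'timesteps': 1000, 'num_noises': 256}
```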
Restoration & compression example for super-resolution:
python compressed_posterior_sampling.py restore \
--task_config super_resolution.yaml \
--input_path ./test_imgs \
--output_dir ./outputs \
--timesteps 1000 \
--num_noises 256 \
--eta 1.0
Decompression example:
python compressed_posterior_sampling.py decompress \
--eta 1.0 \
--input_path "./compressed_binary_files/image_T=1000_K=256.bin" \
--output_dir ./output_imgs_decompressed
This module can restore & compress (simultaneously) real-world degraded face images. It supports both aligned (recommended) and unaligned face images of size
Run restoration & compression / decompression:
python compressed_blind_face_restoration.py restore|decompress [OPTIONS]
You should specify the following arguments:
- --input_path: Path to an image or a binary .bin file.
- --output_path: Path prefix for saving the output.
- --num_noises: Size of each codebook ($K$ in our paper).
- --timesteps: Number of denoising steps ($T$ in our paper).
- --iqa_metric: IQA metric to optimize (niqe, clipiqa+, topiq_nr-face).
- --aligned: Flag indicating that the input face is aligned.
See the --help flag for more options and details.
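Because this script takes a single --input_path and an --output_path prefix, processing a whole folder means issuing one command per image. The following is a minimal sketch that only builds and prints those commands, using the flags listed above. The `build_restore_command` helper, the `./degraded_faces` and `./outputs` directories, and the `_restored` suffix are illustrative assumptions; the default values mirror the restoration example in this section.

```python
import shlex
from pathlib import Path
from typing import List


def build_restore_command(image: Path, out_dir: Path, num_noises: int = 4096,
                          timesteps: int = 1000, iqa_metric: str = "niqe",
                          aligned: bool = True) -> List[str]:
    """Build one `restore` invocation for a single face image."""
    cmd = [
        "python", "compressed_blind_face_restoration.py", "restore",
        "--input_path", str(image),
        # Output prefix: hypothetical "<name>_restored" inside out_dir.
        "--output_path", str(out_dir / f"{image.stem}_restored"),
        "--num_noises", str(num_noises),
        "--timesteps", str(timesteps),
        "--iqa_metric", iqa_metric,
    ]
    if aligned:
        cmd.append("--aligned")  # pass only when the faces are aligned
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for image in sorted(Path("./degraded_faces").glob("*.jpg")):
        print(shlex.join(build_restore_command(image, Path("./outputs"))))
```

To actually execute the commands, each list can be passed to `subprocess.run(cmd, check=True)`.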
The generated binary file name includes the compression metadata (e.g.,
Restoration & compression example:
python compressed_blind_face_restoration.py restore \
--input_path ./aligned_degraded_face_img.jpg \
--output_path ./aligned_degraded_face_img_restored \
--aligned \
--num_noises 4096 \
--timesteps 1000 \
--iqa_metric "niqe"
Decompression example:
python compressed_blind_face_restoration.py decompress \
--input_path ./aligned_degraded_face_img_restored.bin \
--output_path ./aligned_degraded_face_img_restored
Coming soon.
This module enables text-guided image editing, where the resulting image is automatically compressed.
Run editing & compression / re-editing from a previously compressed image:
python compressed_textbased_editing.py edit|reedit [OPTIONS]
You should specify the following arguments:
- --input_dir: Input directory containing images to edit or binary files to re-edit.
- --output_dir: Output directory for saving edited images or binary files.
- --model_id: HuggingFace model ID for the diffusion model (e.g., stabilityai/stable-diffusion-2-1-base).
- --num_noises: Size of each codebook ($K$ in our paper).
- --timesteps: Number of denoising steps ($T$ in our paper).
- --num_pursuit_noises: Number of matching-pursuit noises ($M$ in our paper).
- --num_pursuit_coef_bits: Number of bits for matching-pursuit coefficients ($C$ in our paper).
- --guidance_scale: Guidance scale for classifier-free guidance.
- --src_prompt: Source prompt describing the original input image.
- --dst_prompts: Target prompts describing the desired edited images.
- --tskips: How many timesteps-selection steps to skip (e.g., 200 500 700). Smaller values yield stronger edits. Note that suitable values depend on the number of timesteps used.
See the --help flag for more options and details.
The generated binary file name includes the compression metadata (e.g.,
Editing & compressing example:
python compressed_textbased_editing.py edit \
--float16 \
--input_dir ./images_to_edit \
--output_dir ./outputs \
--model_id "stabilityai/stable-diffusion-2-1-base" \
--num_noises 8192 \
--num_pursuit_noises 6 \
--num_pursuit_coef_bits 1 \
--timesteps 1000 \
--guidance_scale 6.0 \
--src_prompt "a photo of a cat" \
--dst_prompts "a photo of a dog" "a photo of a lion" \
--tskips 500 650 800
Re-editing & compressing example:
python compressed_textbased_editing.py reedit \
--float16 \
--input_dir ./compressed_binary_files \
--output_dir ./outputs \
--src_prompt "a photo of a cat" \
--dst_prompts "a photo of a tiger" \
--tskips 500 650 800
We provide the code for the additional experiments from the paper in the extras folder.
If you use this code for your research, please cite our paper:
@inproceedings{
ohayon2025compressed,
title = {Compressed Image Generation with Denoising Diffusion Codebook Models},
author = {Guy Ohayon and Hila Manor and Tomer Michaeli and Michael Elad},
booktitle = {Forty-second International Conference on Machine Learning},
year = {2025},
url = {https://openreview.net/forum?id=cQHwUckohW}
}
This project is released under the MIT license.
We borrowed code from huggingface, guided diffusion, DPS, SwinIR, BasicSR, and DifFace. We thank the authors of these repositories for their useful implementations.
