📝 Paper | 🏆 ACM Multimedia 2025
Official PyTorch implementation of "Enhancing Diffusion Model Stability for Image Restoration via Gradient Management".
- **Novel Gradient-Centric Analysis**: The first work to identify and analyze the dual problems of gradient conflict and gradient fluctuation as key sources of instability in diffusion-based image restoration.
- **Innovative Gradient Management (SPGD)**: A new technique that directly mitigates these instabilities through two synergistic components:
  - Progressive Likelihood Warm-Up to resolve conflicts before denoising.
  - Adaptive Directional Momentum (ADM) to dampen erratic gradient fluctuations.
- **Plug-and-Play Compatibility**: Easily integrated into existing diffusion posterior sampling frameworks without requiring any changes to the underlying model architecture or retraining.
Prominent diffusion-based image-restoration methods typically frame conditional generation within a Bayesian inference framework, iteratively combining a denoising step with a likelihood guidance step. However, the interaction between these two components during generation remains underexplored.
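As a deliberately simplified picture of this framework, the sketch below alternates a denoising estimate with a likelihood-gradient step on a toy linear inverse problem. The names `denoise`, `A`, and the update rule are illustrative stand-ins, not this repository's actual sampler.

```python
import numpy as np

def guided_reverse_step(x_t, denoise, A, y, step_size=0.1):
    """One schematic reverse step: a denoising (prior) step followed by a
    likelihood guidance step, for a linear measurement y = A x."""
    x0_hat = denoise(x_t)            # denoising (prior) step
    residual = y - A @ x0_hat        # data-fidelity residual
    g_l = -2.0 * A.T @ residual      # gradient of ||y - A x||^2 at x0_hat
    return x0_hat - step_size * g_l  # likelihood guidance step
```

With the identity as both denoiser and operator, each step simply pulls the iterate toward the measurement `y`; the instabilities analyzed below arise when the two gradient directions disagree.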
In this study, we analyze the underlying gradient dynamics of these components and identify significant instabilities. Specifically, we demonstrate conflicts between the prior and likelihood gradient directions, alongside temporal fluctuations in the likelihood gradient itself.
Figure 1: Angular relationships of gradients during the reverse process across different tasks, illustrating the dynamics of the gradient angles. Blue lines: angle between the likelihood gradient g_l and the denoising gradient g_d. Orange lines: angle between g_l(x_t) and g_l(x_{t-1}). Green lines: angle between g_d(x_t) and g_d(x_{t-1}).
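For reference, the angles tracked in Figure 1 can be computed from the cosine similarity of two flattened gradients. This helper is an illustrative reconstruction, not code from the repository:

```python
import numpy as np

def gradient_angle(g1, g2):
    """Angle in degrees between two gradient vectors (as plotted in Figure 1)."""
    g1 = np.asarray(g1, dtype=float).ravel()
    g2 = np.asarray(g2, dtype=float).ravel()
    cos = (g1 @ g2) / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```

An angle above 90 degrees indicates that the two gradients conflict, i.e. one update partially undoes the other.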
We show that these instabilities disrupt the generative process and compromise restoration performance. To address these issues, we propose Stabilized Progressive Gradient Diffusion (SPGD), a novel gradient management technique. SPGD integrates two synergistic components:
- Progressive likelihood warm-up strategy to mitigate gradient conflicts
- Adaptive directional momentum (ADM) smoothing to reduce fluctuations in the likelihood gradient
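A minimal sketch of how these two components could interact, with illustrative names and a hand-rolled exponential moving average standing in for ADM (the repository's actual update rule may differ):

```python
import numpy as np

def spgd_warmup(x, likelihood_grad, momentum, num_steps=5, lr=0.5, beta=0.9):
    """Schematic SPGD warm-up: several small, momentum-smoothed likelihood
    steps are applied before the denoising step, instead of one large step.

    `likelihood_grad` maps x to the likelihood gradient g_l(x); `momentum`
    carries the smoothed direction across reverse-process iterations."""
    for _ in range(num_steps):
        g = likelihood_grad(x)
        momentum = beta * momentum + (1.0 - beta) * g  # damp fluctuations
        x = x - (lr / num_steps) * momentum            # small warm-up step
    return x, momentum
```

Splitting one guidance step of size `lr` into `num_steps` smaller, smoothed steps keeps the likelihood correction from overshooting and conflicting with the subsequent denoising update.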
Figure 2: High-level illustration of our proposed SPGD. (a) the standard reverse process, and (b) our proposed SPGD, showing the warm-up phase with smoothed gradient to enhance restoration stability.
- Python 3.8
- PyTorch 2.3
- CUDA 12.1
```bash
# Clone the repository
git clone https://github.com/74587887/SPGD.git
cd SPGD

# Create conda environment
conda create -n SPGD python=3.8
conda activate SPGD

# Install PyTorch (adjust CUDA version as needed)
conda install pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=12.1 -c pytorch -c nvidia

# Install other dependencies
pip install -r requirements.txt
```

Note: Lower PyTorch versions with proper CUDA should work but are not fully tested.
Download the pretrained diffusion models from the DPS repository:

```bash
mkdir checkpoints

# Download and place checkpoints
# ffhq_10m.pt    -> checkpoints/ffhq256.pt
# imagenet256.pt -> checkpoints/imagenet256.pt
```

Optional: For nonlinear deblur tasks, download the BKSE model from here:
```bash
mkdir -p forward_operator/bkse/experiments/pretrained
# Place GOPRO_wVAE.pth in the above directory
```

Create a results directory:

```bash
mkdir results
```
```bash
# Quick test with demo images (included in repo)
python posterior_sample.py \
    +data=demo \
    +model=ffhq256ddpm \
    +task=inpainting_rand \
    +sampler=edm_spgd \
    save_dir=results \
    num_runs=1 \
    sampler.annealing_scheduler_config.num_steps=30 \
    batch_size=5 \
    data.start_id=0 data.end_id=10 \
    name=quick_test \
    gpu=0
```

This should quickly generate restored images in `results/quick_test/`.
```bash
python posterior_sample.py \
    +data={DATASET_CONFIG} \
    +model={MODEL_CONFIG} \
    +task={TASK_CONFIG} \
    +sampler=edm_spgd \
    save_dir=results \
    num_runs=1 \
    sampler.annealing_scheduler_config.num_steps=100 \
    batch_size={BATCH_SIZE} \
    data.start_id=0 data.end_id=100 \
    name={EXPERIMENT_NAME} \
    gpu={GPU_ID}
```

Model Configs ({MODEL_CONFIG}):
- `ffhq256ddpm` - For the FFHQ dataset
- `imagenet256ddpm` - For the ImageNet dataset
Task Configs ({TASK_CONFIG}):
- `down_sampling` - Super-resolution (4× upscaling)
- `inpainting` - 128×128 box inpainting
- `inpainting_rand` - 80% random pixel inpainting
- `gaussian_blur` - Gaussian deblur (kernel size 61, intensity 3.0)
- `motion_blur` - Motion deblur (kernel size 61, intensity 0.5), from the MotionBlur codebase
- `phase_retrieval` - Phase retrieval (oversample ratio 2.0)
- `nonlinear_blur` - Nonlinear deblur (requires the BKSE model)
- `hdr` - High dynamic range reconstruction (factor 2)
Dataset Configs ({DATASET_CONFIG}):
- `demo` - Use provided demo images
- `ffhq256` - FFHQ-256 dataset (requires manual download)
- `imagenet256` - ImageNet-256 dataset (requires manual download)
You can modify SPGD's hyperparameters by editing the task configuration files:
```yaml
# Edit configs/task/{TASK_CONFIG}
config:
  lr: {LR}        # Learning rate for likelihood guidance (typically 0.5-10)
  num_steps: {N}  # Number of warm-up steps (typically 1-10)
```
Common Issues:
- **CUDA out of memory**: Reduce `batch_size` or use a GPU with more VRAM
- **ValueError raised**: Consider tuning (most likely lowering) the learning rate

Performance Tips:
- Use `batch_size=1` for testing and larger batches for production runs
- Start with a small `lr`, then consider increasing it
Our implementation is based on and inspired by:
- DAPS: https://github.com/zhangbingliang2019/DAPS
- DPS: https://github.com/DPS2022/diffusion-posterior-sampling
- BKSE for nonlinear blur operators
- MotionBlur for motion blur synthesis
We gratefully acknowledge their contributions to the field.
If you find our work useful, please kindly consider citing:
```bibtex
@inproceedings{wu2025enhancing,
  title={Enhancing Diffusion Model Stability for Image Restoration via Gradient Management},
  author={Wu, Hongjie and Zhang, Mingqin and He, Linchao and Zhou, Ji-Zhe and Lv, Jiancheng},
  booktitle={Proceedings of the 33rd ACM International Conference on Multimedia},
  pages={10768--10777},
  year={2025}
}
```