🌟 [CVPR 2026 Highlight🔥] Training-free Mixed-Resolution Latent Upsampling for Spatially Accelerated Diffusion Transformers
Wongi Jeong*, Kyungryeol Lee*, Hoigi Seo, Se Young Chun (*co-first)
This paper proposes Region-Adaptive Latent Upsampling (RALU), a training-free framework for accelerating Diffusion Transformers along the spatial dimension. RALU selectively upsamples only edge-sensitive regions during denoising to suppress artifacts, while preserving the model’s semantic fidelity and visual quality. It further introduces a noise-timestep rescheduling strategy to ensure stable generation across resolution transitions, making it compatible with temporal acceleration methods.
- [2026.04.09] RALU is selected as a Highlight paper at CVPR 2026 !! 🔥🔥
- [2026.02.20] RALU is accepted at CVPR 2026 !!
- [2025.08.07] RALU code has been released.
- [2025.07.11] RALU is on arXiv.
Environment Setup
Make sure your environment is capable of running FLUX. Only a few additional packages need to be installed.
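The repository does not list the exact packages here; as a rough sketch, a typical FLUX-ready environment built on Diffusers might be prepared as follows (the package names are assumptions, not the repository's pinned requirements):

```shell
# Assumed dependencies for running FLUX via Diffusers; adjust versions
# to match your CUDA/PyTorch setup and this repository's requirements.
pip install torch torchvision
pip install diffusers transformers accelerate sentencepiece protobuf
```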
Configure Parameters
- `use_RALU_default`: Use the predefined configurations (4× or 7× speedup) as described in the RALU paper.
- `level`: When using `--use_RALU_default`, specify the desired acceleration level (either 4 or 7).
- `N`: A list of denoising step counts for each of the three stages.
- `e`: A list of end timesteps for each stage. The last value must be `1.0`, as it denotes the final timestep.
- `up_ratio`: The ratio of tokens to be early upsampled in Stage 2.
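To make the roles of `N` and `e` concrete, here is a hypothetical sketch (not the repository's code) of how per-stage timestep schedules could be derived from these two parameters, assuming timesteps are normalized to [0, 1] and each stage `i` runs `N[i]` evenly spaced steps ending at `e[i]`:

```python
def stage_schedule(N, e):
    """Illustrative only: split [0, 1] into stages ending at e[i],
    with N[i] evenly spaced timesteps per stage."""
    assert len(N) == len(e) and e[-1] == 1.0, "last end timestep must be 1.0"
    starts = [0.0] + e[:-1]  # each stage begins where the previous one ended
    schedule = []
    for n, start, end in zip(N, starts, e):
        step = (end - start) / n
        # n timesteps in (start, end], the last landing exactly on the stage end
        schedule.append([round(start + step * (i + 1), 6) for i in range(n)])
    return schedule

# Example with the custom setting from the usage section:
# stage_schedule([4, 5, 6], [0.3, 0.45, 1.0])
```

This is only meant to illustrate why the last entry of `e` must be 1.0: it pins the final stage's last timestep to the end of the diffusion process.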
Run the Example
Execute the RALU_inference.py script.
Option 1: Using the default RALU setting (4× or 7× speedup)
python RALU_inference.py --use_RALU_default --level 4
Option 2: Using custom `N` and `e` values

python RALU_inference.py --N 4 5 6 --e 0.3 0.45 1.0  # for N=[4, 5, 6], e=[0.3, 0.45, 1.0]

Note: The last value of `e` must always be 1.0, indicating the end of the diffusion process.
The images below compare 4× and 7× acceleration achieved by naively reducing `num_inference_steps` in FLUX.1-dev vs. using RALU at the same speedup factors.
This code is based on the FLUX pipeline implementation provided by Diffusers. The referenced works are as follows:

