[Issue]: [MIOpen - gfx1200/Windows] First SD generation at VAE stage is extremely slow and crashes GPU driver - even with AOTriton enabled #1542

@Nem404

Note
Also reported at ROCm/rocm-libraries#1860

Problem Description

Using a gfx1200 GPU, the first image generation in Stable Diffusion samples quickly (about 2.5 it/s), but at the end, during the VAE decode stage, it becomes extremely slow and crashes the GPU driver, occasionally printing OOM errors in the console.

I’m using default generation parameters, basically every UI’s base values (1024×1024 resolution, 20 steps, Euler a or DPM++ 2M Karras, etc.).

I’ve tried the well-known UIs:
ComfyUI, SD.Next, Stable Diffusion WebUI reForge, etc., and they all behave the same.

Subsequent generations usually work, but if I change the resolution to anything else, the problem repeats.
For example, using the krita-ai-diffusion plugin with Krita triggers the same issue every single time, because there the resolution and other parameters often change. This doesn’t seem reasonable.

I’ve tried every flag I could think of, for example in Comfy:
--use-pytorch-cross-attention, --disable-smart-memory, --reserve-vram 8, --fp16-vae, --bf16-vae, tiled VAE node, etc., but nothing helps.

Along with other users, I figured out that disabling MIOpen entirely by hard-coding torch.backends.cudnn.enabled = False in the script generally prevents the driver crashes and OOM errors, but that is just a workaround, not a real solution.
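For anyone who wants to try the workaround, here is a minimal sketch of what it looks like. The helper name `disable_miopen` is made up for illustration; the only real setting involved is `torch.backends.cudnn.enabled`, which is the switch that covers MIOpen on ROCm builds of PyTorch. The import check is there only so the snippet does not fail on a machine without torch.

```python
import importlib.util


def disable_miopen() -> bool:
    """Turn off the cuDNN/MIOpen backend globally (workaround, not a fix).

    Returns True if the flag was set, False if torch is not installed.
    `disable_miopen` is a hypothetical helper name, not part of any UI.
    """
    if importlib.util.find_spec("torch") is None:
        return False

    import torch

    # On ROCm builds this flag gates MIOpen, so convolutions (including the
    # VAE decoder's) fall back to PyTorch's non-MIOpen kernels.
    torch.backends.cudnn.enabled = False
    return True


if __name__ == "__main__":
    print("MIOpen disabled:", disable_miopen())
```

The call has to run before the first generation, for example near the top of the UI's launch script. It affects every convolution in the process, not just the VAE, so it may reduce performance elsewhere.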

Operating System

Windows 11

CPU

Intel Core i5

GPU

AMD Radeon RX 9060 XT 16 GB

ROCm Version

7.0.0rc20250917

Steps to Reproduce

Install the latest wheels:
python -m pip install --index-url https://rocm.nightlies.amd.com/v2/gfx120X-all/ torch torchvision torchaudio

Then open any SD UI, and generate an image.

(I'm using the TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1 environment variable every time to enable AOTriton on the gfx1200.)
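If setting the variable in the shell is inconvenient, it can also be set from Python at the top of the launch script. This is a rough sketch: the helper name `enable_experimental_aotriton` is hypothetical, and setting the variable before `import torch` is an assumption about the safe ordering rather than something confirmed here.

```python
import os

AOTRITON_FLAG = "TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL"


def enable_experimental_aotriton(env=None) -> str:
    """Set the AOTriton opt-in variable and return its value.

    `env` defaults to the real process environment; passing a dict
    makes the helper easy to try without touching os.environ.
    """
    env = os.environ if env is None else env
    env[AOTRITON_FLAG] = "1"
    return env[AOTRITON_FLAG]


if __name__ == "__main__":
    # Assumed ordering: set the variable first, import torch afterwards.
    enable_experimental_aotriton()
    print(AOTRITON_FLAG, "=", os.environ[AOTRITON_FLAG])
```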

Additional Information

I’ve seen other AMD users mention this VAE issue in several other places online.
