Skipping import of cpp extensions due to incompatible torch version 2.12.0.dev20260315+cu128 for torchao version 0.17.0.dev20260316+cu128 Please see https://github.com/pytorch/ao/issues/2919 for more info
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/dtypes/utils.py:89: UserWarning: Deprecation: PlainLayout is deprecated and will be removed in a future release of torchao, see https://github.com/pytorch/ao/issues/2752 for more details
warnings.warn(
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/float8/float8_training_tensor.py:122: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/float8/float8_training_tensor.py:195: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/float8/float8_scaling_utils.py:90: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/float8/float8_linear.py:28: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/dtypes/nf4tensor.py:1176: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention_processor.py:51: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
2026-03-16 12:13:54.540875: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/keras/src/export/tf2onnx_lib.py:8: FutureWarning: In the future `np.object` will be defined as the corresponding NumPy scalar.
if not hasattr(np, "object"):
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/prototype/mx_formats/mx_tensor.py:546: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/prototype/mx_formats/mx_tensor.py:604: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/transformers/modeling_utils.py:1984: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@torch._dynamo.allow_in_graph
/fsx/sayak/diffusers/src/diffusers/hooks/context_parallel.py:340: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention.py:537: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention.py:579: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention.py:751: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention.py:1132: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention.py:1334: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/auraflow_transformer_2d.py:146: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/auraflow_transformer_2d.py:196: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/cogvideox_transformer_3d.py:37: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/consisid_transformer_3d.py:231: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/hunyuan_transformer_2d.py:56: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/stable_audio_transformer.py:64: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_allegro.py:35: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention_dispatch.py:1770: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/attention_dispatch.py:1804: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_bria.py:365: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_bria.py:450: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_bria_fibo.py:243: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_bria_fibo.py:306: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_flux.py:355: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_flux.py:409: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_chroma.py:203: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_chroma.py:275: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_chronoedit.py:433: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_cogview4.py:455: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_easyanimate.py:208: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_glm_image.py:350: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_helios.py:375: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hidream_image.py:139: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hidream_image.py:420: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hidream_image.py:489: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hunyuanimage.py:266: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hunyuanimage.py:460: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hunyuanimage.py:537: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_longcat_image.py:214: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_longcat_image.py:268: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_ltx.py:281: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_ltx.py:384: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_mochi.py:118: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_mochi.py:308: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_ovis_image.py:214: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_ovis_image.py:271: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_qwenimage.py:577: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_sd3.py:38: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_skyreels_v2.py:438: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_wan.py:419: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_z_image.py:183: FutureWarning: torch._dynamo.allow_in_graph is deprecated and will be removed in a future version. Use torch._dynamo.nonstrict_trace instead.
@maybe_allow_in_graph
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]
Loading pipeline components...: 29%|██▊ | 2/7 [00:00<00:01, 3.16it/s]
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 80.37it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 86%|████████▌ | 6/7 [00:00<00:00, 7.71it/s]
Loading checkpoint shards: 0%| | 0/3 [00:00<?, ?it/s]
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/dtypes/utils.py:89: UserWarning: Deprecation: PlainLayout is deprecated and will be removed in a future release of torchao, see https://github.com/pytorch/ao/issues/2752 for more details
warnings.warn(
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/dtypes/uintx/semi_sparse_layout.py:84: UserWarning: Deprecation: SemiSparseLayout is deprecated and will be removed in a future release of torchao, see https://github.com/pytorch/ao/issues/2752 for more details
warnings.warn(
/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/dtypes/uintx/tensor_core_tiled_layout.py:193: UserWarning: Deprecation: TensorCoreTiledLayout is deprecated and will be removed in a future release of torchao, see https://github.com/pytorch/ao/issues/2752 for more details
warnings.warn(
Loading checkpoint shards: 33%|███▎ | 1/3 [00:03<00:07, 3.59s/it]
Loading checkpoint shards: 67%|██████▋ | 2/3 [00:07<00:03, 3.60s/it]
Loading checkpoint shards: 100%|██████████| 3/3 [00:08<00:00, 2.60s/it]
Loading checkpoint shards: 100%|██████████| 3/3 [00:08<00:00, 2.87s/it]
Loading pipeline components...: 100%|██████████| 7/7 [00:09<00:00, 1.41s/it]
0%| | 0/4 [00:00<?, ?it/s]
0%| | 0/4 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/fsx/sayak/diffusers/torchao/check.py", line 64, in <module>
_ = pipe("a dog", num_inference_steps=4)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 124, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/pipelines/flux/pipeline_flux.py", line 949, in __call__
noise_pred = self.transformer(
^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 189, in new_forward
output = function_reference.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 189, in new_forward
output = function_reference.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/utils/peft_utils.py", line 315, in wrapper
result = forward_fn(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_flux.py", line 680, in forward
hidden_states = self.x_embedder(hidden_states)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1778, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1789, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 189, in new_forward
output = function_reference.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 189, in new_forward
output = function_reference.forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torch/nn/modules/linear.py", line 134, in forward
return F.linear(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/utils.py", line 662, in _dispatch__torch_function__
return cls._TORCH_FN_TABLE[cls][func](func, types, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/utils.py", line 465, in wrapper
return _func(f, types, args, kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.12/site-packages/torchao/quantization/quantize_/workflows/int8/int8_tensor.py", line 318, in _
m = torch.mm(
^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but got mat2 is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA_mm)
Overlapped offloading means overlapping data transfer with compute. In the context of a model, say you offloaded the parameter dict to CPU. While the current layer is computing, you fetch the params for the next layer on a separate CUDA stream, so the transfer (hopefully) introduces no overhead.
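To make the schedule concrete, here is a minimal CPU-only sketch of the prefetch pattern described above. It is an analogy, not the diffusers or torchao implementation: a single-worker thread pool stands in for the separate CUDA stream, and `fetch`, `compute`, and `run_overlapped` are hypothetical names invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch(layer_params):
    """Stand-in for a host-to-device copy: returns a fresh copy of the params."""
    return list(layer_params)


def compute(x, params):
    """Stand-in for a layer's forward pass: adds the sum of the params to x."""
    return x + sum(params)


def run_overlapped(offloaded_layers, x):
    """Run layers in order, prefetching layer i+1 while layer i computes."""
    # The single worker plays the role of the side stream used for transfers.
    with ThreadPoolExecutor(max_workers=1) as copy_stream:
        pending = copy_stream.submit(fetch, offloaded_layers[0])
        for i in range(len(offloaded_layers)):
            # Synchronize: this layer's params must have arrived before compute.
            params = pending.result()
            if i + 1 < len(offloaded_layers):
                # Kick off the next transfer before starting this layer's compute.
                pending = copy_stream.submit(fetch, offloaded_layers[i + 1])
            x = compute(x, params)
    return x


layers = [[1, 2], [3, 4], [5, 6]]  # toy "offloaded" parameters, one list per layer
result = run_overlapped(layers, 0)
```

The step that matters is the wait before `compute`: every piece of a layer's state has to be on the compute device by then. In real PyTorch code the same role would be played by a side `torch.cuda.Stream` and a stream-level wait rather than a future.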
The code here (has to be run with `--group-offload` and `--quantize`) is failing with the following error. It doesn't happen when quantization is disabled. Is this expected?

Error trace:
Versions:
Cc: @jerryzh168