[Inductor] Inconsistent handling of negative indices in index_select: CPU/CUDA/MPS #169779

@lingebeng

🐛 Describe the bug

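`torch.index_select` with a negative index (`-1`) behaves differently across devices and between eager mode and Inductor:

| Device | Eager Mode | Inductor Mode |
| --- | --- | --- |
| CPU | Error (RuntimeError) | ⚠️ Silent Pass (Dangerous) |
| CUDA | Error (Device-side Assert) | ⚠️ Silent Pass (Dangerous) |
| MPS | ⚠️ Silent Pass (Dangerous) | ⚠️ Silent Pass (Dangerous) |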
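One way to read the behaviours above is as two competing index policies: a strict one that rejects anything outside `[0, size)`, and a permissive one that wraps negative values Python-style. The sketch below is a pure-Python model of that distinction, not PyTorch or Inductor code; `resolve_index` and `wrap_negative` are hypothetical names. Whether the silently passing paths actually wrap `-1` to the last row is an assumption here, since the report does not show the returned values.

```python
def resolve_index(idx: int, size: int, wrap_negative: bool) -> int:
    """Hypothetical model of two index-handling policies for a dimension of length `size`."""
    if wrap_negative and -size <= idx < 0:
        # Permissive policy: Python-style wrap-around, so -1 refers to the last element.
        idx += size
    if not 0 <= idx < size:
        # Strict policy, or an index that is still out of range after wrapping.
        raise IndexError(f"index {idx} out of range for size {size}")
    return idx


if __name__ == "__main__":
    # Permissive policy maps -1 to the last row of a 5-row tensor.
    print(resolve_index(-1, 5, wrap_negative=True))  # 4
    # Strict policy rejects the same index.
    try:
        resolve_index(-1, 5, wrap_negative=False)
    except IndexError as e:
        print(f"strict policy rejected it: {e}")
```

Under this model, the same input either raises or quietly returns a row, depending on which policy a backend applies.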

Reproduce script

import os

# os.environ["TORCH_LOGS"] = "output_code"
import torch

device = "cpu"

# device = "cuda"
# device = "mps"


def fn(x):
    indices = torch.tensor([-1], dtype=torch.int64).to(device)
    return torch.index_select(x, 0, indices)


x = torch.randn(5, 10).to(device)


print("--- Eager Run ---")
try:
    res = fn(x)
    res = res.cpu()
    print("Eager executed successfully")
except Exception as e:
    print(f"Eager Failed: {e}")

print("--- Inductor Run ---")
opt_fn = torch.compile(fn, backend="inductor", dynamic=True)
try:
    res = opt_fn(x)
    res = res.cpu()
    print("Inductor executed successfully")
except Exception as e:
    print(f"Inductor Failed: {e}")

On CPU

--- Eager Run ---
Eager Failed: index out of range in self
--- Inductor Run ---
Inductor executed successfully

On CUDA

--- Eager Run ---
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [0,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [1,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [2,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [3,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [4,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [5,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [6,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [7,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [8,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
/pytorch/aten/src/ATen/native/cuda/Indexing.cu:1478: indexSelectSmallIndex: block: [0,0,0], thread: [9,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Eager Failed: CUDA error: device-side assert triggered
Search for `cudaErrorAssert' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Single run (Inductor only)

--- Inductor Run ---
Inductor executed successfully

On MPS

--- Eager Run ---
Eager executed successfully
--- Inductor Run ---
Inductor executed successfully
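Until the behaviours agree, a caller who wants the strict eager-CPU semantics everywhere could validate the indices explicitly before calling `index_select`. The following is a minimal sketch of such a guard, not a proposed fix for Inductor; `checked_index_select` is a hypothetical helper, and it assumes valid indices lie in `[0, size)`, matching the eager CPU check.

```python
import torch


def checked_index_select(x: torch.Tensor, dim: int, index: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: reject out-of-range indices, then call torch.index_select."""
    size = x.size(dim)
    # Assumption: only indices in [0, size) are valid, as enforced by eager mode on CPU.
    out_of_range = (index < 0) | (index >= size)
    # Converting to bool reads the result back to the host, which forces a
    # device sync on GPU backends; acceptable for a debugging guard.
    if bool(out_of_range.any()):
        bad = index[out_of_range].tolist()
        raise IndexError(
            f"index_select: indices {bad} out of range for dim {dim} with size {size}"
        )
    return torch.index_select(x, dim, index)
```

Because the check depends on tensor data, it is probably best placed outside the compiled function, for example on the indices before they are passed to `opt_fn`.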

Versions

PyTorch version: 2.10.0.dev20251205
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 26.1 (arm64)
GCC version: Could not collect
Clang version: 17.0.0 (clang-1700.4.4.1)
CMake version: Could not collect
Libc version: N/A

Python version: 3.12.12 (main, Oct 28 2025, 11:52:25) [Clang 20.1.4 ] (64-bit runtime)
Python platform: macOS-26.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M4

Versions of relevant libraries:
[pip3] Could not collect
[conda] Could not collect

cc @albanD @chauhang @penguinwu @malfet

    Labels

    module: decompositions (Topics related to decomposition, excluding PrimTorch)
    module: python frontend (For issues relating to PyTorch's Python frontend)
    oncall: pt2
    triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
