
Variables are assigned different names in different environments #1780

@zasdfgbnm

Description

🐛 Describe the bug

Thanks @jjsjann123 for reporting.

TEST_F(NVFuserTest, TMP) {
  auto fusion = std::make_unique<Fusion>();
  FusionGuard fg(fusion.get());

  // 3-D input tensor with symbolic extents
  auto tv0 = makeSymbolicTensor(3);
  fusion->addInput(tv0);

  // Reduce over the outermost dimension, then negate the result
  auto tv1 = sum(tv0, {0});
  auto tv2 = neg(tv1);

  fusion->addOutput(tv2);
  fusion->print();
}

The above code gives

%kernel {
T1_l[ rS3{i0}, iS4{i2}, iS5{i3} ]
   = reduction( T0_g[ iS0{i0}, iS1{i2}, iS2{i3} ], op = add, initial value = double(0), allreduce = 0 )
T2_g[ iS6{i2}, iS7{i3} ]
   = -T1_l[ rS3{i0}, iS4{i2}, iS5{i3} ];

TransformPrinter : 
T0_g[ iS0{i0}, iS1{i2}, iS2{i3} ]
 root domain : (iS0{i0},iS1{i2},iS2{i3})
T1_l[ rS3{i0}, iS4{i2}, iS5{i3} ]
 root domain : (rS3{i0},iS4{i2},iS5{i3})
T2_g[ iS6{i2}, iS7{i3} ]
 root domain : (iS6{i2},iS7{i3})
}

in the container image gitlab-master.nvidia.com:5005/dl/pytorch/update-scripts:jit-cuda11-latest,
and it gives

%kernel {
T1_l[ rS3{i1}, iS4{i2}, iS5{i3} ]
   = reduction( T0_g[ iS0{i1}, iS1{i2}, iS2{i3} ], op = add, initial value = double(0), allreduce = 0 )
T2_g[ iS6{i2}, iS7{i3} ]
   = -T1_l[ rS3{i1}, iS4{i2}, iS5{i3} ];

TransformPrinter : 
T0_g[ iS0{i1}, iS1{i2}, iS2{i3} ]
 root domain : (iS0{i1},iS1{i2},iS2{i3})
T1_l[ rS3{i1}, iS4{i2}, iS5{i3} ]
 root domain : (rS3{i1},iS4{i2},iS5{i3})
T2_g[ iS6{i2}, iS7{i3} ]
 root domain : (iS6{i2},iS7{i3})
}

on my local machine.

Note the difference between iS0{i0} (container) and iS0{i1} (local).

Both environments are using the tip of tree (TOT) of the devel branch.

Although I don't see any real issue with having different variable names, it seems to me that if this behavior depends on the environment, we must be invoking undefined (or unspecified) behavior somewhere.

Versions

My local environment:

Collecting environment information...
PyTorch version: 1.13.0a0+gita054b3e
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Arch Linux (x86_64)
GCC version: (GCC) 12.1.0
Clang version: 13.0.1
CMake version: version 3.23.2
Libc version: glibc-2.35

Python version: 3.10.5 (main, Jun  6 2022, 18:49:26) [GCC 12.1.0] (64-bit runtime)
Python platform: Linux-5.18.5-arch1-1-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.7.64
GPU models and configuration: 
GPU 0: NVIDIA GeForce RTX 3090
GPU 1: NVIDIA GeForce RTX 2080 Ti

Nvidia driver version: 515.48.07
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.4.0
/usr/lib/libcudnn_adv_infer.so.8.4.0
/usr/lib/libcudnn_adv_train.so.8.4.0
/usr/lib/libcudnn_cnn_infer.so.8.4.0
/usr/lib/libcudnn_cnn_train.so.8.4.0
/usr/lib/libcudnn_ops_infer.so.8.4.0
/usr/lib/libcudnn_ops_train.so.8.4.0
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] torch==1.13.0a0+gitc270fd4
[pip3] torch-ucc==1.0.0
[pip3] torchani==2.2
[pip3] torchvision==0.2.2.post3
[conda] Could not collect

Note that

GCC version: (GCC) 12.1.0
Clang version: 13.0.1

are both very new and may not have been heavily exercised by the wider community yet, so I'm not sure whether the toolchain itself could be the source of the problem.
