[ppc64le pytorch] test_nn failure test_Softmax2d (test_nn.TestNN) ... pure virtual method called #7615

@avmgithub

Description

When running test_nn.py, test_Softmax2d fails:

test_Softmax2d (test_nn.TestNN) ... pure virtual method called
terminate called without an active exception
Traceback (most recent call last):
  File "run_test.py", line 342, in <module>
    main()
  File "run_test.py", line 334, in main
    raise RuntimeError(message)
RuntimeError: test_nn failed! Received signal: SIGIOT

When I run the test on its own multiple times using
$ python -m unittest test_nn.TestNN.test_Softmax2d -v

it sometimes succeeds, but most of the time it fails with the error above.
When I compile with DEBUG=1, it succeeds more often than it fails.
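
To put numbers on "most of the time", a small helper along these lines could repeat the command above and tally how each run ends. This is only a sketch: `describe` and `tally_runs` are hypothetical names that are not part of PyTorch or run_test.py, and the default of 20 runs is an arbitrary choice.

```python
import signal
import subprocess
import sys
from collections import Counter


def describe(returncode):
    """Turn a subprocess return code into a short label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by
        # that signal number. SIGIOT is an alias of SIGABRT on Linux, so
        # the abort reported above may show up under either name.
        try:
            return signal.Signals(-returncode).name
        except ValueError:
            return "signal %d" % -returncode
    return "exit %d" % returncode


def tally_runs(cmd, runs=20):
    """Run cmd `runs` times (assumed default: 20) and count the outcomes."""
    counts = Counter()
    for _ in range(runs):
        proc = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        counts[describe(proc.returncode)] += 1
    return counts


# Example usage, run from the directory that contains test_nn.py:
#   cmd = [sys.executable, "-m", "unittest",
#          "test_nn.TestNN.test_Softmax2d", "-v"]
#   print(dict(tally_runs(cmd)))
```

Running the same tally against a normal build and a DEBUG=1 build would give comparable failure rates for the two configurations.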

PyTorch version: 0.5.0a0+6eec411
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Red Hat Enterprise Linux Server 7.4 (Maipo) ppc64le Power8
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
CMake version: 3.6.3

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration:
GPU 0: Tesla P100-SXM2-16GB
GPU 1: Tesla P100-SXM2-16GB

Nvidia driver version: 384.81
cuDNN version: Probably one of the following:
/usr/local/cuda-9.0/targets/ppc64le-linux/lib/libcudnn.so
/usr/local/cuda-9.0/targets/ppc64le-linux/lib/libcudnn.so.7
/usr/local/cuda-9.0/targets/ppc64le-linux/lib/libcudnn.so.7.0.4
/usr/local/cuda-9.0/targets/ppc64le-linux/lib/libcudnn_static.a

Versions of relevant libraries:
[pip3] numpy (1.14.0)
[conda] torch 0.5.0a0+6eec411
