This repository was archived by the owner on Nov 17, 2023. It is now read-only.
Windows segmentation faults in GPU tests #17635
Open
Description
Windows GPU tests run in the updated environment from https://github.com/aiengines/ci fail with the following error:
======================================================================
ERROR: Failure: OSError (exception: access violation writing 0x0000000000000000)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\nose\failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "C:\Python37\lib\site-packages\nose\loader.py", line 418, in loadTestsFromName
    addr.filename, addr.module)
  File "C:\Python37\lib\site-packages\nose\importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "C:\Python37\lib\site-packages\nose\importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "C:\Python37\lib\imp.py", line 235, in load_module
    return load_source(name, filename, file)
  File "C:\Python37\lib\imp.py", line 172, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 696, in _load
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "C:\Users\Administrator\mxnet\tests\python\unittest\test_test_utils.py", line 21, in <module>
    import mxnet as mx
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\__init__.py", line 33, in <module>
    from . import contrib
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\contrib\__init__.py", line 27, in <module>
    from . import autograd
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\contrib\autograd.py", line 27, in <module>
    from ..ndarray import NDArray, zeros_like, _GRAD_REQ_MAP
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\ndarray\__init__.py", line 20, in <module>
    from . import _internal, contrib, linalg, op, random, sparse, utils, image, ndarray, numpy
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\ndarray\numpy\__init__.py", line 23, in <module>
    from . import _register
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\ndarray\numpy\_register.py", line 21, in <module>
    from ..register import _make_ndarray_function
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\ndarray\register.py", line 277, in <module>
    _init_op_module('mxnet', 'ndarray', _make_ndarray_function)
  File "C:\Users\Administrator\mxnet\windows_package\python\mxnet\base.py", line 682, in _init_op_module
    ctypes.byref(plist)))
OSError: exception: access violation writing 0x0000000000000000
======================================================================
ERROR: Failure: OSError (exception: access violation writing 0x0000000000000000)
----------------------------------------------------------------------
Traceback (most recent call last):
Error Message
(Paste the complete error message. Please also include stack trace by setting environment variable DMLC_LOG_STACK_TRACE_DEPTH=10 before running your script.)
To Reproduce
Create an AMI with the provided scripts, compile and run GPU tests.
Steps to reproduce
(Paste the commands you ran that produced the error.)
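Since the traceback above fails at `import mxnet as mx`, before any test body runs, the failure can probably be narrowed down without nose. The following is a minimal sketch, not the commands actually used in CI: it runs a snippet in a child interpreter with `faulthandler` enabled and with `DMLC_LOG_STACK_TRACE_DEPTH=10` set, as the issue template requests. The helper name `run_isolated` is hypothetical, and it assumes the `mxnet` package from the build is importable by the interpreter running the script.

```python
import os
import subprocess
import sys


def run_isolated(snippet, extra_env=None):
    """Run a Python snippet in a child interpreter and capture its output.

    Running in a child process keeps a native fault from taking down
    the calling process. (run_isolated is a hypothetical helper name.)
    """
    env = dict(os.environ)
    # Requested by the issue template to get a deeper native stack trace.
    env["DMLC_LOG_STACK_TRACE_DEPTH"] = "10"
    if extra_env:
        env.update(extra_env)
    proc = subprocess.run(
        # -X faulthandler asks CPython to dump Python tracebacks on fatal faults.
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        env=env,
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Assumption: importing mxnet alone is enough to trigger the failure,
    # as the traceback in this issue suggests.
    code, out, err = run_isolated("import mxnet as mx; print(mx.__version__)")
    print("exit code:", code)
    print(out)
    print(err)
```

If the import alone reproduces the access violation, the captured stderr can be pasted here in place of the full nose output.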
What have you tried to solve it?
Environment
We recommend using our script to collect the diagnostic information. Run the following command and paste the output below:
curl --retry 10 -s https://raw.githubusercontent.com/dmlc/gluon-nlp/master/tools/diagnose.py | python
# paste outputs here