#8552 migrated all Python 3.8 jobs to 3.9. The migration took a while and required a number of fixes. To avoid blocking it indefinitely, the PR was merged while the Windows CUDA unittests jobs were still failing (see the comment on #8552).
So the Windows CUDA unittests jobs are currently failing, and I don't know why.
logs: https://github.com/pytorch/vision/actions/runs/10699721178/job/29661914922?pr=8623
File "C:\actions-runner\_work\vision\vision\pytorch\vision\test\smoke_test.py", line 113, in <module>
main()
File "C:\actions-runner\_work\vision\vision\pytorch\vision\test\smoke_test.py", line 85, in main
print(f"{torch.ops.image._jpeg_version() = }")
File "C:\Jenkins\Miniconda3\envs\ci\lib\site-packages\torch\_ops.py", line 1225, in __getattr__
raise AttributeError(
AttributeError: '_OpNamespace' 'image' object has no attribute '_jpeg_version'
More detailed error (modifying the import code to be more verbose):
torchvision\io\__init__.py:24: in <module>
from .image import (
torchvision\io\image.py:11: in <module>
_load_library("image")
torchvision\extension.py:89: in _load_library
torch.ops.load_library(lib_path)
C:\Jenkins\Miniconda3\envs\ci\lib\site-packages\torch\_ops.py:1350: in load_library
ctypes.CDLL(path)
C:\Jenkins\Miniconda3\envs\ci\lib\ctypes\__init__.py:374: in __init__
self._handle = _dlopen(self._name, mode)
E FileNotFoundError: Could not find module 'C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.
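On Windows, this `FileNotFoundError` is raised both when the file itself is absent and when one of its dependencies cannot be resolved, so it helps to separate the two cases. The following is a minimal sketch of such a check, not the code actually used for the verbose output above; `diagnose_load` is a hypothetical helper name.

```python
import ctypes
import os


def diagnose_load(lib_path):
    """Try to load a shared library and classify the outcome.

    Hypothetical helper: returns "file-missing" if the path does not exist,
    "loaded" on success, or a "load-failed: ..." string otherwise.
    """
    if not os.path.isfile(lib_path):
        return "file-missing"
    try:
        ctypes.CDLL(lib_path)
    except OSError as exc:
        # FileNotFoundError is a subclass of OSError. If we get here, the
        # file exists, so the failure comes from the loader itself: typically
        # a dependency that cannot be resolved, or an invalid binary.
        return f"load-failed: {exc}"
    return "loaded"
```

If this reports `load-failed` for `image.pyd`, the file is on disk and the problem lies in resolving its dependencies, which is what the `cygcheck` run investigates.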
Checking the dependencies of image.pyd gives:
runneruser@EC2AMAZ-HAC74MP /c/actions-runner/_work/vision/vision/pytorch/vision ((efd36a4...))
$ cygcheck.exe torchvision/image.pyd
C:\actions-runner\_work\vision\vision\pytorch\vision\torchvision\image.pyd
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libpng16.dll
C:\Jenkins\Miniconda3\envs\ci\zlib.dll
C:\Windows\system32\VCRUNTIME140.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-runtime-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-heap-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-string-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-stdio-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-convert-l1-1-0.dll
C:\Windows\system32\KERNEL32.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-rtlsupport-l1-1-0.dll
C:\Windows\system32\ntdll.dll
C:\Windows\system32\KERNELBASE.dll
C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\api-ms-win-eventing-provider-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processthreads-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processthreads-l1-1-1.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-heap-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-memory-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-handle-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-synch-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-synch-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-namedpipe-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-datetime-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-sysinfo-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-sysinfo-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-timezone-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-localization-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-processenvironment-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-string-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-debug-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-errorhandling-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-fibers-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-util-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-profile-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-file-l2-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-console-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-console-l1-2-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-math-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-filesystem-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-time-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\jpeg8.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-environment-l1-1-0.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libwebp.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libsharpyuv.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-utility-l1-1-0.dll
cygcheck: track_down: could not find nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
cygcheck: track_down: could not find torch_cpu.dll
cygcheck: track_down: could not find cudart64_110.dll
cygcheck: track_down: could not find c10_cuda.dll
cygcheck: track_down: could not find torch_cuda.dll
C:\Windows\system32\MSVCP140.dll
C:\Windows\system32\VCRUNTIME140_1.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-locale-l1-1-0.dll
So it seems that nvjpeg and the other CUDA dependencies cannot be found. In b4c05786e6a7f8f6e1a01d3f9c7ccaf7de1c6830 I removed building with nvjpeg support, and confirmed that the import failure went away.
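The unresolved entries can be pulled out of a long `cygcheck` listing mechanically. This is a small sketch that assumes the `could not find <name>` line format shown in the output above; `missing_dlls` is a hypothetical helper, and the sample is a short excerpt of that output.

```python
def missing_dlls(cygcheck_output):
    """Return the DLL names that cygcheck reported as 'could not find'."""
    marker = "could not find "
    missing = []
    for line in cygcheck_output.splitlines():
        # partition() returns an empty separator when the marker is absent.
        _, found, name = line.partition(marker)
        if found:
            missing.append(name.strip())
    return missing


# Short excerpt of the cygcheck output shown above.
sample = r"""
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libsharpyuv.dll
cygcheck: track_down: could not find nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
C:\Windows\system32\MSVCP140.dll
"""

print(missing_dlls(sample))  # ['nvjpeg64_11.dll', 'c10.dll']
```

Note that `cygcheck` resolves names through PATH, so its view of what is missing is not necessarily the same as the view of the Python loader.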
This suggests that adding "/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.8/bin" to the PATH should fix the problem. But I tried that by ssh-ing into the machine, and I still get the same error. I confirmed the PATH was correct by running cygcheck again, which now finds nvjpeg64_11.dll:
...
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libwebp.dll
C:\Jenkins\Miniconda3\envs\ci\Library\bin\libsharpyuv.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-utility-l1-1-0.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\nvjpeg64_11.dll
cygcheck: track_down: could not find c10.dll
cygcheck: track_down: could not find torch_cpu.dll
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin\cudart64_110.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-core-interlocked-l1-1-0.dll
cygcheck: track_down: could not find c10_cuda.dll
cygcheck: track_down: could not find torch_cuda.dll
C:\Windows\system32\MSVCP140.dll
C:\Windows\system32\VCRUNTIME140_1.dll
C:\Jenkins\Miniconda3\envs\ci\api-ms-win-crt-locale-l1-1-0.dll
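One thing that may be worth ruling out (an assumption, not something confirmed in this issue): since Python 3.8, on Windows the dependencies of extension modules and of libraries loaded through `ctypes` are no longer resolved through PATH. Only the system paths, the directory containing the DLL or PYD file, and directories registered with `os.add_dll_directory()` are searched. `cygcheck` does use PATH, so it can report a DLL as found while the Python loader still fails. The sketch below registers candidate directories before the import; `register_dll_dirs` is a hypothetical helper, and whether the CUDA `bin` directory is the right one to register is a guess.

```python
import os
import sys


def register_dll_dirs(candidate_dirs):
    """Register existing directories for DLL resolution on Windows.

    Returns the absolute paths of the directories that exist. On other
    platforms nothing is registered, since os.add_dll_directory is
    Windows-only.
    """
    registered = []
    for d in candidate_dirs:
        d = os.path.abspath(d)  # add_dll_directory requires an absolute path
        if not os.path.isdir(d):
            continue
        if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
            os.add_dll_directory(d)
        registered.append(d)
    return registered


if __name__ == "__main__":
    # Path taken from the PATH experiment above; adjust to the machine.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"
    print(register_dll_dirs([cuda_bin]))
```

This would have to run before `torchvision.io` is imported, for example at the top of the smoke test, to have any effect on loading `image.pyd`.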