PyNVML import slowdown? #6247

@mrocklin

I was looking through a flaky test report and saw this:

--------------------------- Subprocess stdout/stderr---------------------------
Traceback (most recent call last):
  File "/Users/runner/miniconda3/envs/dask-distributed/bin/dask-worker", line 33, in <module>
    sys.exit(load_entry_point('distributed', 'console_scripts', 'dask-worker')())
  File "/Users/runner/miniconda3/envs/dask-distributed/bin/dask-worker", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/Users/runner/miniconda3/envs/dask-distributed/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/Users/runner/miniconda3/envs/dask-distributed/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 843, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/runner/work/distributed/distributed/distributed/__init__.py", line 10, in <module>
    from distributed.actor import Actor, ActorFuture, BaseActorFuture
  File "/Users/runner/work/distributed/distributed/distributed/actor.py", line 14, in <module>
    from distributed.client import Future
  File "/Users/runner/work/distributed/distributed/distributed/client.py", line 54, in <module>
    from distributed import cluster_dump, preloading
  File "/Users/runner/work/distributed/distributed/distributed/preloading.py", line 19, in <module>
    from distributed.core import Server
  File "/Users/runner/work/distributed/distributed/distributed/core.py", line 29, in <module>
    from distributed.comm import (
  File "/Users/runner/work/distributed/distributed/distributed/comm/__init__.py", line 46, in <module>
    _register_transports()
  File "/Users/runner/work/distributed/distributed/distributed/comm/__init__.py", line 41, in _register_transports
    from distributed.comm import ucx
  File "/Users/runner/work/distributed/distributed/distributed/comm/ucx.py", line 28, in <module>
    from distributed.diagnostics.nvml import has_cuda_context
  File "/Users/runner/work/distributed/distributed/distributed/diagnostics/nvml.py", line 7, in <module>
    import pynvml
  File "/Users/runner/miniconda3/envs/dask-distributed/lib/python3.8/site-packages/pynvml/__init__.py", line 1, in <module>
    from .nvml import *
  File "/Users/runner/miniconda3/envs/dask-distributed/lib/python3.8/site-packages/pynvml/nvml.py", line 33, in <module>
    from ctypes.util import find_library
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 914, in _find_spec
  File "<frozen importlib._bootstrap_external>", line 1407, in find_spec
  File "<frozen importlib._bootstrap_external>", line 1376, in _get_spec
  File "<frozen importlib._bootstrap_external>", line 1345, in _path_importer_cache
KeyboardInterrupt

This was in a test where the worker never came up. That happens in our test suite sometimes, for various reasons. NVML is weird/suspicious enough that I thought I'd raise this to see whether it could be an issue at all, and maybe a cause of some of the flakiness. cc @quasiben @jakirkham @pentschev
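
A rough way to check whether the import itself is slow is to time it in a fresh interpreter, so that nothing is already cached in `sys.modules`. This is only a sketch: the helper name `time_import_in_fresh_process` and its `timeout` parameter are made up for illustration and are not part of distributed or pynvml.

```python
import subprocess
import sys


def time_import_in_fresh_process(module_name, timeout=60.0):
    """Return the seconds spent importing `module_name` in a new interpreter.

    Returns None if the import fails (for example, the module is not installed).
    Hypothetical helper for illustration; not part of distributed or pynvml.
    """
    # The child process times only the import statement, not interpreter startup.
    code = (
        "import importlib, time\n"
        "start = time.perf_counter()\n"
        f"importlib.import_module({module_name!r})\n"
        "print(time.perf_counter() - start)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        return None
    # Use the last stdout line in case the module prints something on import.
    return float(result.stdout.strip().splitlines()[-1])


if __name__ == "__main__":
    for name in ("ctypes.util", "pynvml"):
        print(name, time_import_in_fresh_process(name))
```

Running this on the CI machine for `pynvml` and comparing it against `ctypes.util` would show whether the time goes to pynvml or to the standard-library import that the traceback was interrupted in. CPython's built-in `python -X importtime -c "import pynvml"` gives a per-module breakdown of the same thing.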
