Importing torch_tensorrt causes warning for implicitly cleaned up file #147744
Closed
Labels
oncall: distributed
Description
🐛 Describe the bug
A temporary directory is created at this line in torch.distributed.nn.jit.instantiator and is never explicitly cleaned up:
_TEMP_DIR = tempfile.TemporaryDirectory()
A warning is generated by tempfile itself when the program exits:
WARNING py.warnings /usr/lib/python3.12/tempfile.py:1075: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpxy_e0smt'> warnings.py:110
_warnings.warn(warn_message, ResourceWarning)
The generated file is _remote_module_non_scriptable.py.
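The behaviour can be reproduced without torch. The following is a minimal sketch, not the instantiator code: both helper functions are hypothetical names introduced for illustration. It assumes CPython, where dropping the last reference to a TemporaryDirectory runs its finalizer immediately, and it compares a directory that is dropped without cleanup() against one that is cleaned up explicitly.

```python
import gc
import os
import tempfile
import warnings

MESSAGE = "Implicitly cleaning up"


def implicit_cleanup_warnings():
    """Drop a TemporaryDirectory without cleanup() and return the warnings seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # ResourceWarning is ignored by default
        tmp = tempfile.TemporaryDirectory()
        del tmp        # no cleanup(): the finalizer removes the directory and warns
        gc.collect()   # only needed on interpreters without reference counting
    return [w for w in caught
            if issubclass(w.category, ResourceWarning) and MESSAGE in str(w.message)]


def explicit_cleanup_warnings():
    """Call cleanup() before dropping the object; return (warnings, path_exists)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        tmp = tempfile.TemporaryDirectory()
        path = tmp.name
        tmp.cleanup()  # detaches the finalizer, so no warning is issued later
        del tmp
        gc.collect()
    matching = [w for w in caught
                if issubclass(w.category, ResourceWarning) and MESSAGE in str(w.message)]
    return matching, os.path.exists(path)


if __name__ == "__main__":
    print(len(implicit_cleanup_warnings()))  # at least one warning expected
    print(explicit_cleanup_warnings())
```

For a module-level object such as _TEMP_DIR, one possible direction is to register its cleanup() method with atexit right after the directory is created. Whether that is appropriate for instantiator is a decision for the maintainers.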
For me, the warning is triggered by importing torch_tensorrt, which reaches instantiator.py through the following import chain:
-> import torch_tensorrt
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/__init__.py(125)<module>()
-> from torch_tensorrt.runtime import * # noqa: F403
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/runtime/__init__.py(1)<module>()
-> from torch_tensorrt.dynamo.runtime import ( # noqa: F401
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/__init__.py(10)<module>()
-> from ._compiler import compile, convert_exported_program_to_serialized_trt_engine
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/_compiler.py(14)<module>()
-> from torch_tensorrt.dynamo import _defaults, partitioning
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/partitioning/__init__.py(1)<module>()
-> from ._adjacency_partitioner import partition as fast_partition
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/partitioning/_adjacency_partitioner.py(20)<module>()
-> from torch_tensorrt.dynamo.conversion._ConverterRegistry import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/conversion/__init__.py(1)<module>()
-> from . import aten_ops_converters, ops_evaluators, prims_ops_converters
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/conversion/aten_ops_converters.py(12)<module>()
-> from torch_tensorrt.dynamo.conversion import impl
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/dynamo/conversion/impl/__init__.py(1)<module>()
-> from torch_tensorrt.fx.converters.impl import convolution
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/__init__.py(1)<module>()
-> from .converters import * # noqa: F403 F401
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/converters/__init__.py(5)<module>()
-> from .adaptive_avgpool import * # noqa: F401 F403
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/converters/adaptive_avgpool.py(7)<module>()
-> from .converter_utils import extend_mod_attr_to_tuple, mark_as_int8_layer
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/converters/converter_utils.py(23)<module>()
-> from ..utils import Frameworks, unified_dtype_converter
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/utils.py(12)<module>()
-> from torch_tensorrt.fx.passes.lower_basic_pass import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/passes/lower_basic_pass.py(14)<module>()
-> from ..tracer.acc_tracer import acc_ops
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch_tensorrt/fx/tracer/acc_tracer/acc_ops.py(891)<module>()
-> from torchvision.ops import stochastic_depth
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/__init__.py(10)<module>()
-> from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils # usort:skip
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/models/__init__.py(2)<module>()
-> from .convnext import *
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/models/convnext.py(8)<module>()
-> from ..ops.misc import Conv2dNormActivation, Permute
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/ops/__init__.py(23)<module>()
-> from .poolers import MultiScaleRoIAlign
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/ops/poolers.py(10)<module>()
-> from .roi_align import roi_align
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torchvision/ops/roi_align.py(7)<module>()
-> from torch._dynamo.utils import is_compile_supported
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/__init__.py(3)<module>()
-> from . import convert_frame, eval_frame, resume_execution
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/convert_frame.py(53)<module>()
-> from . import config, exc, trace_rules
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/trace_rules.py(46)<module>()
-> from .variables import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/variables/__init__.py(2)<module>()
-> from .builtin import BuiltinVariable
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/variables/builtin.py(47)<module>()
-> from .ctx_manager import EventVariable, StreamVariable
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/variables/ctx_manager.py(22)<module>()
-> from .functions import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/_dynamo/variables/functions.py(31)<module>()
-> from torch.distributed._composable.fsdp import _fsdp_param_group
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_composable/__init__.py(3)<module>()
-> from .fully_shard import fully_shard
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_composable/fully_shard.py(10)<module>()
-> from torch.distributed.fsdp._common_utils import _FSDPState
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/fsdp/__init__.py(1)<module>()
-> from ._flat_param import FlatParameter as FlatParameter
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/fsdp/_flat_param.py(47)<module>()
-> from ._fsdp_extensions import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/fsdp/_fsdp_extensions.py(6)<module>()
-> from torch.distributed._shard.sharded_tensor.api import ShardedTensor
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_shard/__init__.py(1)<module>()
-> from .api import _shard_tensor, load_with_process_group, shard_module, shard_parameter
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_shard/api.py(9)<module>()
-> from torch.distributed._shard.sharded_tensor import ShardedTensor
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_shard/sharded_tensor/__init__.py(8)<module>()
-> from .api import (
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_shard/sharded_tensor/api.py(31)<module>()
-> from .reshard import reshard_local_shard, reshuffle_local_shard
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/_shard/sharded_tensor/reshard.py(14)<module>()
-> from torch.distributed.nn.functional import all_to_all, all_to_all_single
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1310)_find_and_load_unlocked()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/nn/__init__.py(7)<module>()
-> from .api.remote_module import RemoteModule
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
/opt/uv/venv/lib/python3.12/site-packages/torch/distributed/nn/api/remote_module.py(26)<module>()
-> from torch.distributed.nn.jit import instantiator
<frozen importlib._bootstrap>(1415)_handle_fromlist()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
<frozen importlib._bootstrap>(1360)_find_and_load()
<frozen importlib._bootstrap>(1331)_find_and_load_unlocked()
<frozen importlib._bootstrap>(935)_load_unlocked()
<frozen importlib._bootstrap_external>(995)exec_module()
<frozen importlib._bootstrap>(488)_call_with_frames_removed()
> /opt/uv/venv/lib/python3.12/site-packages/torch/distributed/nn/jit/instantiator.py(17)<module>()
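An import chain like the one above can also be collected programmatically. The sketch below is one possible approach and not necessarily how this trace was produced: record_temporary_directories and suspicious_import are hypothetical names, and the technique temporarily wraps tempfile.TemporaryDirectory.__init__ to store the call stack of every directory created inside the with block.

```python
import contextlib
import tempfile
import traceback


@contextlib.contextmanager
def record_temporary_directories():
    """Record (name, stack) for each TemporaryDirectory created in the block."""
    records = []
    original_init = tempfile.TemporaryDirectory.__init__

    def recording_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        # Drop the last frame (this wrapper) so the stack ends at the creator.
        stack = traceback.format_stack()[:-1]
        records.append((self.name, "".join(stack)))

    tempfile.TemporaryDirectory.__init__ = recording_init
    try:
        yield records
    finally:
        # Always restore the original constructor.
        tempfile.TemporaryDirectory.__init__ = original_init


def suspicious_import():
    """Stand-in for the import under investigation (e.g. `import torch_tensorrt`)."""
    return tempfile.TemporaryDirectory()


if __name__ == "__main__":
    with record_temporary_directories() as records:
        tmp = suspicious_import()
    tmp.cleanup()
    for name, stack in records:
        print(name)
        print(stack)
```

In a real investigation the body of the with block would be the import statement itself, and the recorded stack would show which module created the directory.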
Versions
PyTorch version: 2.5.0+cu124
Is debug build: False
CUDA used to build PyTorch: 12.4
ROCM used to build PyTorch: N/A
OS: Ubuntu 24.04.1 LTS (x86_64)
GCC version: (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0
Clang version: Could not collect
CMake version: version 3.31.2
Libc version: glibc-2.39
Python version: 3.12.3 (main, Nov 6 2024, 18:32:19) [GCC 13.2.0] (64-bit runtime)
Python platform: Linux-6.8.0-51-generic-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4090
Nvidia driver version: 550.120
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 7900X 12-Core Processor
CPU family: 25
Model: 97
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 1
Stepping: 2
CPU(s) scaling MHz: 65%
CPU max MHz: 5733.0000
CPU min MHz: 400.0000
BogoMIPS: 9399.26
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good amd_lbr_v2 nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba perfmon_v2 ibrs ibpb stibp ibrs_enhanced vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local user_shstk avx512_bf16 clzero irperf xsaveerptr rdpru wbnoinvd cppc arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif x2avic v_spec_ctrl vnmi avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid overflow_recov succor smca fsrm flush_l1d
Virtualization: AMD-V
L1d cache: 384 KiB (12 instances)
L1i cache: 384 KiB (12 instances)
L2 cache: 12 MiB (12 instances)
L3 cache: 64 MiB (2 instances)
NUMA node(s): 1
NUMA node0 CPU(s): 0-23
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec rstack overflow: Vulnerable: Safe RET, no microcode
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] nvidia-cublas-cu12==12.4.5.8
[pip3] nvidia-cuda-cupti-cu12==12.4.127
[pip3] nvidia-cuda-nvrtc-cu12==12.4.127
[pip3] nvidia-cuda-runtime-cu12==12.4.127
[pip3] nvidia-cudnn-cu12==9.1.0.70
[pip3] nvidia-cufft-cu12==11.2.1.3
[pip3] nvidia-curand-cu12==10.3.5.147
[pip3] nvidia-cusolver-cu12==11.6.1.9
[pip3] nvidia-cusparse-cu12==12.3.1.170
[pip3] nvidia-nccl-cu12==2.21.5
[pip3] nvidia-nvjitlink-cu12==12.4.127
[pip3] nvidia-nvtx-cu12==12.4.127
[pip3] onnx==1.17.0
[pip3] onnx-graphsurgeon==0.5.2
[pip3] onnxconverter-common==1.14.0
[pip3] onnxmltools==1.12.0
[pip3] onnxruntime==1.18.1
[pip3] onnxruntime-gpu==1.18.1
[pip3] onnxscript==0.1.0.dev20241212
[pip3] torch==2.5.0+cu124
[pip3] torch_tensorrt==2.5.0+cu124
[pip3] torchinfo==1.8.0
[pip3] torchprofile==0.0.4
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.20.0+cu124
[pip3] triton==3.1.0
[conda] Could not collect
cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o