
Conflict between ROCR_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES environment variables causes Ray import error #53737

@2niuhe

What happened + What you expected to happen

Related PR: #51104

When both ROCR_VISIBLE_DEVICES=0 and HIP_VISIBLE_DEVICES=0 are set (as default settings added to .bashrc during ROCm installation), importing ray in Python 3.11.12 results in a RuntimeError indicating that HIP_VISIBLE_DEVICES should be used instead of ROCR_VISIBLE_DEVICES.

Expected Behavior:
Ray should:

  • Ignore ROCR_VISIBLE_DEVICES if HIP_VISIBLE_DEVICES is set, or
  • Log a warning (instead of raising an error) when both variables are present, indicating that ROCR_VISIBLE_DEVICES is deprecated.
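To make the expected behavior concrete, here is a minimal sketch of the resolution logic I have in mind. It is not Ray's actual implementation: `resolve_amd_visible_devices` is a hypothetical helper, and the warning text is only a suggestion. When only ROCR_VISIBLE_DEVICES is set, the sketch keeps the error message shown in the log below.

```python
import os
import warnings


def resolve_amd_visible_devices(env=None):
    """Hypothetical helper (not part of Ray): pick the visible AMD GPU IDs.

    Prefers HIP_VISIBLE_DEVICES and only warns when ROCR_VISIBLE_DEVICES
    is set alongside it, instead of raising.
    """
    env = os.environ if env is None else env
    hip = env.get("HIP_VISIBLE_DEVICES")
    rocr = env.get("ROCR_VISIBLE_DEVICES")

    if hip is not None and rocr is not None:
        # Both variables are present: ignore ROCR_VISIBLE_DEVICES with a warning.
        warnings.warn(
            "ROCR_VISIBLE_DEVICES is ignored because HIP_VISIBLE_DEVICES is set.",
            RuntimeWarning,
        )
    elif rocr is not None:
        # Only ROCR_VISIBLE_DEVICES is set: keep the existing error.
        raise RuntimeError(
            "Please use HIP_VISIBLE_DEVICES instead of ROCR_VISIBLE_DEVICES"
        )

    if hip is None:
        return None  # no restriction requested
    if hip == "":
        return []  # explicitly no visible devices
    return hip.split(",")
```

With both variables set to `0`, this returns `["0"]` and emits a single warning rather than failing the import.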

Error Log:

Python 3.11.12 (main, Apr  9 2025, 04:04:00) [Clang 20.1.0 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ray
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/__init__.py", line 114, in <module>
    from ray._private.worker import (  # noqa: E402,F401
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/_private/worker.py", line 1273, in <module>
    global_worker = Worker()
                    ^^^^^^^^
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/_private/worker.py", line 448, in __init__
    ray._private.utils.get_visible_accelerator_ids()
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/_private/utils.py", line 302, in get_visible_accelerator_ids
    return {
           ^
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/_private/utils.py", line 305, in <dictcomp>
    ).get_current_process_visible_accelerator_ids()
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nh/workshop/ray_demo/.venv/lib/python3.11/site-packages/ray/_private/accelerators/amd_gpu.py", line 40, in get_current_process_visible_accelerator_ids
    raise RuntimeError(
RuntimeError: Please use HIP_VISIBLE_DEVICES instead of ROCR_VISIBLE_DEVICES

Versions / Dependencies

Ray: version 2.46.0
OS: Ubuntu 24.04 LTS
Python: 3.11.12

ROCk module version 6.12.12 is loaded
=====================    
HSA System Attributes    
=====================    
Runtime Version:         1.15
Runtime Ext Version:     1.7
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE                              
System Endianness:       LITTLE                             
Mwaitx:                  DISABLED
XNACK enabled:           NO
DMAbuf Support:          YES
VMM Support:             YES

==========               
HSA Agents               
==========               
*******                  
Agent 1                  
*******                  
  Name:                    AMD Ryzen 7 PRO 8845HS w/ Radeon 780M Graphics
  Uuid:                    CPU-XX                             
  Marketing Name:          AMD Ryzen 7 PRO 8845HS w/ Radeon 780M Graphics

Reproduction script

export ROCR_VISIBLE_DEVICES=0
export HIP_VISIBLE_DEVICES=0
python -c "import ray"
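A possible workaround until this is changed, assuming the check only looks at whether ROCR_VISIBLE_DEVICES is present in the process environment at import time (the traceback suggests this, but I have not confirmed it in the source): drop the variable before importing ray. `drop_rocr_visible_devices` is a hypothetical helper name, not a Ray API.

```python
import os


def drop_rocr_visible_devices(env=None):
    """Hypothetical helper: remove ROCR_VISIBLE_DEVICES when HIP_VISIBLE_DEVICES is set.

    Returns the removed value, or None if nothing was removed.
    """
    env = os.environ if env is None else env
    if "HIP_VISIBLE_DEVICES" in env:
        return env.pop("ROCR_VISIBLE_DEVICES", None)
    return None


if __name__ == "__main__":
    drop_rocr_visible_devices()
    # Assumption: with the variable gone, the import no longer raises.
    # import ray
```

This only affects the current Python process. Removing the `export ROCR_VISIBLE_DEVICES=0` line from .bashrc would have the same effect for new shells.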

Issue Severity

Low: It annoys or frustrates me.

Labels: P2 (Important issue, but not time-critical), bug (Something that is supposed to be working; but isn't), core (Issues that should be addressed in Ray Core), stability, usability
