[RFC] Not setting CUDA_VISIBLE_DEVICES in the case where num_gpus is not set #54868

@jjyao

Currently, for a Ray task/actor that does not specify num_gpus, Ray sets the CUDA_VISIBLE_DEVICES env var to an empty string to indicate that the task/actor is not using GPUs. However, this has caused issues for frameworks like vLLM that treat CUDA_VISIBLE_DEVICES="" as "hardware support is disabled" and raise exceptions.

We are now proposing to change this behavior: when num_gpus is not specified, Ray won't explicitly set CUDA_VISIBLE_DEVICES to an empty string and will instead leave it untouched. This means zero-GPU tasks/actors will see whatever CUDA_VISIBLE_DEVICES value was present when ray start was called.
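
To make the difference concrete, here is a minimal sketch that models the two behaviors as a pure function. It is not Ray's implementation: `worker_env`, its parameters, and the `proposed` flag are hypothetical names used only for illustration, and treating an explicit `num_gpus=0` the same as an unset value is an assumption of this sketch.

```python
from typing import Dict, Optional, Sequence


def worker_env(
    start_env: Dict[str, str],
    num_gpus: Optional[float],
    assigned_gpu_ids: Sequence[int] = (),
    proposed: bool = True,
) -> Dict[str, str]:
    """Hypothetical model of the environment a task/actor would see.

    start_env: the environment present when `ray start` was called.
    proposed:  True models the proposed behavior, False the current one.
    """
    env = dict(start_env)  # copy so the caller's dict is not mutated
    if num_gpus:
        # GPUs requested: expose only the assigned device IDs.
        env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in assigned_gpu_ids)
    elif not proposed:
        # Current behavior: explicitly hide all GPUs with an empty string.
        env["CUDA_VISIBLE_DEVICES"] = ""
    # Proposed behavior with no GPUs requested: leave the variable untouched.
    return env


if __name__ == "__main__":
    start = {"CUDA_VISIBLE_DEVICES": "0,1"}
    print("current: ", worker_env(start, None, proposed=False))
    print("proposed:", worker_env(start, None, proposed=True))
```

Under this model, a task that does not request GPUs inherits the value from the `ray start` environment, or sees no CUDA_VISIBLE_DEVICES at all if none was set there.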

This is a behavior change, so we want to collect feedback before making it. Please comment if you disagree with the proposal.

Labels: P1 (issue that should be fixed within a few weeks), bug (something that is supposed to be working, but isn't), core (issues that should be addressed in Ray Core)
