"RuntimeError: makeDeviceForHostname(): unsupported gloo device" with nightly torch 2.8 #150381

@AznamirWoW

🐛 Describe the bug

The nightly torch 2.8 build raises an error when initializing a process group for distributed training:

import sys
import os
import torch.distributed as dist
from random import randint
import torch

os.environ["USE_LIBUV"] = "0" if sys.platform == "win32" else "1"

os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = str(randint(20000, 55555))

device = torch.device("cuda")
n_gpus = 1
rank = 0

dist.init_process_group(
	backend="gloo" if sys.platform == "win32" or device.type != "cuda" else "nccl",
	init_method="env://",
	world_size=n_gpus if device.type == "cuda" else 1,
	rank=rank if device.type == "cuda" else 0,
)

print("done")

Traceback (most recent call last):
  File "T:\test.py", line 16, in <module>
    dist.init_process_group(
  File "X:\torch\venv\Lib\site-packages\torch\distributed\c10d_logger.py", line 81, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "X:\torch\venv\Lib\site-packages\torch\distributed\c10d_logger.py", line 95, in wrapper
    func_return = func(*args, **kwargs)
                  ^^^^^^^^^^^^^^^^^^^^^
  File "X:\torch\venv\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1724, in init_process_group
    default_pg, _ = _new_process_group_helper(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "X:\torch\venv\Lib\site-packages\torch\distributed\distributed_c10d.py", line 1949, in _new_process_group_helper
    backend_class = ProcessGroupGloo(
                    ^^^^^^^^^^^^^^^^^
RuntimeError: makeDeviceForHostname(): unsupported gloo device
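
The traceback ends in ProcessGroupGloo even though the script sets the device to "cuda". The drive-letter paths suggest the script ran on Windows, where the backend expression in the repro always selects gloo. A minimal sketch of that selection logic, using only the standard library; the helper names `pick_backend` and `libuv_flag` are hypothetical and exist only to mirror the expressions from the script above:

```python
import sys


def pick_backend(platform: str, device_type: str) -> str:
    """Mirror the backend expression from the repro script (hypothetical helper)."""
    # gloo is chosen on Windows, or whenever the device is not CUDA;
    # nccl is chosen only for CUDA devices on non-Windows platforms.
    return "gloo" if platform == "win32" or device_type != "cuda" else "nccl"


def libuv_flag(platform: str) -> str:
    """Mirror the USE_LIBUV value the repro script exports (hypothetical helper)."""
    return "0" if platform == "win32" else "1"


if __name__ == "__main__":
    # Show what the repro would select on the current machine for a CUDA device.
    print(pick_backend(sys.platform, "cuda"), libuv_flag(sys.platform))
```

Under this reading, a Windows run of the repro uses gloo with USE_LIBUV set to "0", which is the configuration in which the error was reported.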

Versions

Installed torch versions:
torch-2.8.0.dev20250327+cu128 torchaudio-2.6.0.dev20250331+cu128 torchvision-0.22.0.dev20250331+cu128

cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @kwen2501 @c-p-i-o

Labels

high priority
module: regression (It used to work, and now it doesn't)
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
triage review
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)