Move backend-specific c10d files into per-backend subfolders (#187083) #187083

Open
d4l3k wants to merge 1 commit into pytorch:main from d4l3k:export-D108332288

d4l3k (Member) commented Jun 11, 2026

Summary:

Reorganizes torch/csrc/distributed/c10d by moving non-public, backend-specific implementation files and the TCPStore backend files into per-backend subfolders, while leaving the public-facing classes at the top level (the ProcessGroupGloo/NCCL/MPI/UCC backends and the Store/TCPStore/FileStore/HashStore/PrefixStore classes all stay put).

The moves are: store/ gets TCPStoreBackend.{cpp,hpp} and TCPStoreLibUvBackend.cpp; gloo/ gets ProcessGroupGlooCuda.cpp, ProcessGroupGlooDetail.hpp, and GlooDeviceFactory.{cpp,hpp}; ucc/ gets UCCTracing.{cpp,hpp} and UCCUtils.{cpp,hpp}; nccl/ gets NCCLXStub.hpp.
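For illustration only (this sketch is not part of the PR), the relocation listed above can be written as an old-path to new-path table. The file names and subfolders come from the summary; the helper name `relocation_table` and the `MOVES` layout are hypothetical.

```python
from pathlib import PurePosixPath

# Root directory named in the summary.
C10D = PurePosixPath("torch/csrc/distributed/c10d")

# Subfolder -> files moved into it, with the {cpp,hpp} pairs expanded.
MOVES = {
    "store": ["TCPStoreBackend.cpp", "TCPStoreBackend.hpp", "TCPStoreLibUvBackend.cpp"],
    "gloo": [
        "ProcessGroupGlooCuda.cpp",
        "ProcessGroupGlooDetail.hpp",
        "GlooDeviceFactory.cpp",
        "GlooDeviceFactory.hpp",
    ],
    "ucc": ["UCCTracing.cpp", "UCCTracing.hpp", "UCCUtils.cpp", "UCCUtils.hpp"],
    "nccl": ["NCCLXStub.hpp"],
}


def relocation_table(moves=MOVES, root=C10D):
    """Return {old_path: new_path} as POSIX-style strings (hypothetical helper)."""
    return {
        str(root / name): str(root / subdir / name)
        for subdir, names in moves.items()
        for name in names
    }


if __name__ == "__main__":
    for old, new in sorted(relocation_table().items()):
        print(f"{old} -> {new}")
```

Files that stay at the top level, such as `NCCLUtils.{cpp,hpp}`, simply do not appear in the table.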

NCCLUtils.{cpp,hpp} was deliberately kept at the top level even though it is backend-specific: it is included by several call sites outside caffe2 (in gen_ai, ads_mkl, and fbgemm_gpu), so relocating it would be a wider, riskier change better done on its own. As a result the new nccl/ folder currently holds only NCCLXStub.hpp.

All include sites were updated, covering both the canonical torch/csrc/distributed/c10d/... include form and the legacy short c10d/... form (used by fb/GlooDeviceFactory.cpp). Build wiring was updated in build_variables.bzl -- the canonical source list consumed by CMake (via append_filelist in cmake/Codegen.cmake), OSS Bazel, and OSS Buck -- and in the internal fb/fbcode/target_definitions.bzl for ProcessGroupGlooCuda.cpp. Headers are picked up by recursive globs, so no header-list edits were needed.
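A minimal sketch of the kind of include rewrite described above, covering both the canonical and the short `c10d/...` forms. This is not the tooling used for the PR: the function name `rewrite_include`, the regular expression, and the assumption that the short form maps to `c10d/<subfolder>/...` are all illustrative.

```python
import re

# Moved header -> new subfolder (headers named in the summary).
HEADER_DIRS = {
    "TCPStoreBackend.hpp": "store",
    "ProcessGroupGlooDetail.hpp": "gloo",
    "GlooDeviceFactory.hpp": "gloo",
    "UCCTracing.hpp": "ucc",
    "UCCUtils.hpp": "ucc",
    "NCCLXStub.hpp": "nccl",
}

# Matches an #include of a header that sits directly under c10d/, in either
# the canonical (torch/csrc/distributed/c10d/...) or the short (c10d/...) form.
_INCLUDE = re.compile(
    r'^(?P<head>\s*#\s*include\s*[<"])'
    r'(?P<prefix>(?:torch/csrc/distributed/)?c10d/)'
    r'(?P<name>[A-Za-z0-9_]+\.hpp)'
    r'(?P<tail>[>"].*)$'
)


def rewrite_include(line):
    """Rewrite one source line (without trailing newline); hypothetical helper.

    Lines that do not include a moved header are returned unchanged, so
    running the rewrite twice gives the same result as running it once.
    """
    m = _INCLUDE.match(line)
    if m is None or m["name"] not in HEADER_DIRS:
        return line
    subdir = HEADER_DIRS[m["name"]]
    return f'{m["head"]}{m["prefix"]}{subdir}/{m["name"]}{m["tail"]}'
```

Headers that stay at the top level (for example `NCCLUtils.hpp`) are absent from `HEADER_DIRS` and pass through untouched.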

This is a pure file move: contents are unchanged apart from the relocated #include paths, so correctness is established by a clean build rather than by behavioral tests.

Authored with the assistance of an AI coding assistant (Claude Code).

Test Plan:
Confirmed no references to the old paths remain anywhere in fbcode, then ran the fbcode lint and build tooling:

arc f
arc lint
arc lint --take AUTODEPS --apply-patches
buck2 build fbcode//caffe2:_libtorch fbcode//caffe2:_libtorch_cuda

arc f and arc lint reported no issues; AUTODEPS produced no dependency changes (the moves stayed within existing Buck targets); both the CPU (_libtorch) and CUDA (_libtorch_cuda) libraries built successfully (exit 0).
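The "no references to the old paths remain" check could be approximated outside fbcode with a small scan like the one below. It is a sketch under stated assumptions, not the check that was actually run: the function name `find_stale_includes`, the file-extension list, and the pattern are hypothetical.

```python
import os
import re


def find_stale_includes(root, moved_headers):
    """Return sorted (path, line_number, line) tuples for includes of old paths.

    An include counts as stale when a moved header is referenced directly
    under c10d/, in either the canonical or the short include form.
    """
    names = "|".join(re.escape(name) for name in sorted(moved_headers))
    stale = re.compile(
        r'#\s*include\s*[<"](?:torch/csrc/distributed/)?c10d/(?:' + names + r')[>"]'
    )
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Assumed set of C++/CUDA source extensions worth scanning.
            if not filename.endswith((".h", ".hpp", ".cpp", ".cu", ".cuh")):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if stale.search(line):
                        hits.append((path, lineno, line.strip()))
    return sorted(hits)
```

An empty result corresponds to the state described in the test plan; any hit points at an include site that still uses a pre-move path.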

Reviewed By: kapilsh

Differential Revision: D108332288

@d4l3k d4l3k requested review from fduwjj and kapilsh as code owners June 11, 2026 21:01
pytorch-bot Bot commented Jun 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/187083

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 73 Pending

As of commit dd9f5b7 with merge base 083e261:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added the release notes: distributed (c10d) release notes category label Jun 11, 2026
meta-codesync Bot commented Jun 11, 2026

@d4l3k has exported this pull request. If you are a Meta employee, you can view the originating Diff in D108332288.

kapilsh (Contributor) left a comment

LGTM!

@pytorch-bot pytorch-bot Bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jun 11, 2026
@meta-codesync meta-codesync Bot changed the title from "Move backend-specific c10d files into per-backend subfolders" to "Move backend-specific c10d files into per-backend subfolders (#187083)" Jun 11, 2026
d4l3k added a commit to d4l3k/pytorch that referenced this pull request Jun 11, 2026
@d4l3k d4l3k force-pushed the export-D108332288 branch from 91db222 to 6899b93 Compare June 11, 2026 22:16
d4l3k added a commit to d4l3k/pytorch that referenced this pull request Jun 11, 2026
@d4l3k d4l3k force-pushed the export-D108332288 branch from 6899b93 to 1da5c87 Compare June 11, 2026 22:23
@d4l3k d4l3k force-pushed the export-D108332288 branch from 1da5c87 to dd9f5b7 Compare June 12, 2026 01:52
Labels

ciflow/trunk, meta-exported, release notes: distributed (c10d)

2 participants