[1/2] Introduce at::accelerator::Graph as a unified Graph interface#171269

Closed
guangyey wants to merge 20 commits into gh/guangyey/265/base from gh/guangyey/265/head

Conversation

@guangyey
Collaborator

guangyey commented Dec 24, 2025

Stack from ghstack (oldest at bottom):

Motivation

The original goal was to generalize CUDAGraph and share implementations and logic across different backends, as mentioned in #158827. However, after further offline discussions, we decided to take a more incremental approach: start by defining a unified interface, while allowing each backend to maintain its own implementation. This avoids premature coupling and addresses backend-specific concerns.

This PR introduces GraphImplInterface, a lightweight, backend-agnostic interface that defines a unified API for graph capture and replay. Each backend (e.g., CUDA, XPU, PrivateUse1) provides its own implementation and registers it via REGISTER_GRAPH_IMPL.
On top of this interface, we provide a unified graph API, at::accelerator::Graph, which transparently maps to:

  • CUDAGraph on CUDA
  • XPUGraph on XPU
  • and corresponding implementations for other backends (including PrivateUse1)

This design establishes a common abstraction layer while preserving backend autonomy, and lays the groundwork for future sharing of logic once the interface and use cases have stabilized.

An additional benefit is that, for CUDA and XPU, the backend-specific graph types (e.g., cuda::CUDAGraph and xpu::XPUGraph) can share the same underlying implementation as accelerator::Graph on each backend, avoiding code duplication and ensuring consistent behavior.

For PrivateUse1, accelerator::Graph can be supported with minimal effort by reusing the existing PU1Graph implementation.

cc @albanD @eellison @EikanWang

@pytorch-bot

pytorch-bot bot commented Dec 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/171269

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 62ab0c3 with merge base f72a552:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

guangyey added a commit that referenced this pull request Dec 24, 2025
@guangyey guangyey marked this pull request as draft December 24, 2025 15:36
@guangyey changed the title from "Introduce torch.accelerator.Graph as a unified Graph interface" to "[WIP] Introduce torch.accelerator.Graph as a unified Graph interface" on Dec 24, 2025
@guangyey guangyey added the ciflow/trunk Trigger trunk jobs on your pull request label Dec 24, 2025
@guangyey changed the title from "[WIP] Introduce torch.accelerator.Graph as a unified Graph interface" to "[WIP][1/2] Introduce at::accelerator::Graph as a unified Graph interface" on Dec 25, 2025
@guangyey changed the title from "[WIP][1/2] Introduce at::accelerator::Graph as a unified Graph interface" to "[WIP] [1/2] Introduce at::accelerator::Graph as a unified Graph interface" on Dec 25, 2025
@guangyey changed the title from "[WIP] [1/2] Introduce at::accelerator::Graph as a unified Graph interface" to "[1/2] Introduce at::accelerator::Graph as a unified Graph interface" on Dec 30, 2025
@guangyey added the "release notes: cpp" and "module: accelerator" labels on Dec 30, 2025
@guangyey
Collaborator Author

guangyey commented Mar 4, 2026

Thanks @eellison
Try to land this PR.
@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

@huydhn
Contributor

huydhn commented Mar 11, 2026

@pytorchbot revert -m 'This is currently breaking some internal build, I need to revert and reland this' -c ghfirst

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Mar 11, 2026
Revert "[1/2] Introduce at::accelerator::Graph as a unified Graph interface (#171269)"

This reverts commit 2afd3c1.

Reverted #171269 on behalf of https://github.com/huydhn: "This is currently breaking some internal build, I need to revert and reland this"
@pytorchmergebot
Collaborator

@guangyey your PR has been successfully reverted.

@pytorchmergebot added the "Reverted" and "ci-no-td" labels on Mar 11, 2026
@huydhn
Contributor

huydhn commented Mar 11, 2026

@guangyey Please rebase and reland this change.

guangyey added a commit that referenced this pull request Mar 12, 2026
guangyey added a commit that referenced this pull request Mar 12, 2026
guangyey added a commit that referenced this pull request Mar 12, 2026
@guangyey
Collaborator Author

Try to reland.
@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

sandy-gags pushed a commit to sandy-gags/pytorch that referenced this pull request Mar 12, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
Pull Request resolved: pytorch#171269
Approved by: https://github.com/EikanWang, https://github.com/eellison
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026
EmanueleCoradin pushed a commit to EmanueleCoradin/pytorch that referenced this pull request Mar 30, 2026

Labels

  • ci-no-td (Do not run TD on this PR)
  • ciflow/trunk (Trigger trunk jobs on your pull request)
  • Merged
  • module: accelerator (Issues related to the shared accelerator API)
  • open source
  • release notes: cpp (release notes category)
  • Reverted

7 participants