Add functions to setup PrivateUse1 as a python backend device. #157859
qihqi wants to merge 5 commits into pytorch:main from
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/157859
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit 7464a8e with merge base c58e096:
FLAKY - The following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
albanD
left a comment
Thanks for looking into this!
|
Hmm... Specializing PrivateUse1HooksInterface and DeviceGuardImplInterface for PrivateUse1 directly is not a good idea. If you really want to implement this functionality in pure Python, why not consider exposing these two classes to the Python side, and then let users inherit from them and implement the specific details there? In addition, can |
This is indeed a better idea. I'll try to do that. Initially I picked the least intrusive approach (everything optional, doing as little as possible).
Indeed it does, basically (https://dev-discuss.pytorch.org/t/embrace-tensor-subclass-as-a-python-device-registration-api/2771); and it is what I am using right now. There are places where it interacts poorly with torch, including the two issues that this PR fixes, as well as AMP integration. |
Force-pushed from 28225fa to 44ce110 (Compare)
|
I have an idea; apologies if there are any problems. Can we directly export these two classes to the Python side? In this way, users can subclass them on the Python side to implement the corresponding interfaces, and can also easily know which interfaces need to be implemented. Of course, it is also possible to forward all operations from C++ back to Python, as the current PR does, but we'd better provide a base class consistent with C++ for users to inherit, so that users can easily understand which interfaces may need to be implemented. |
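For illustration, here is a minimal Python-side sketch of what such an inheritable interface could look like. All class and method names below are hypothetical stand-ins for discussion, not the actual `c10::impl::DeviceGuardImplInterface` surface or any real PyTorch binding:

```python
# Hypothetical sketch: a Python-side mirror of a C++ device-guard
# interface that users could subclass, as proposed above. The names
# here are illustrative stand-ins, not PyTorch's real API.
from abc import ABC, abstractmethod


class PyDeviceGuardImplInterface(ABC):
    @abstractmethod
    def type(self) -> str:
        """Return the backend's device type string."""

    @abstractmethod
    def get_device(self) -> int:
        """Return the current device index."""

    @abstractmethod
    def set_device(self, index: int) -> None:
        """Make `index` the current device."""


class MyBackendGuard(PyDeviceGuardImplInterface):
    """A trivial single-device implementation a user might write."""

    def __init__(self) -> None:
        self._index = 0

    def type(self) -> str:
        return "privateuseone"

    def get_device(self) -> int:
        return self._index

    def set_device(self, index: int) -> None:
        self._index = index
```

With pybind11, a similar shape is achievable by exposing the C++ class with a trampoline subclass, so that overrides written in Python are dispatched from C++ virtual calls.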
|
@pytorchbot label "release notes: python_frontend" |
|
Made a few changes as per the discussion.
Let me know how it looks, and please also start CI to see if any tests fail; I don't have permission to do that. Thanks! |
|
Thank you.
The upgrade of pybind11 can be discussed separately. Back to the problem itself: we do need to pay attention to variable life cycles across C++ and Python. In my opinion, using the |
Indeed.
I can also make a separate PR that just updates pybind11, to be careful. Do you think that is a good idea?
Would you describe what the life-cycle issue is? I thought the reason we need the header is that it is how we get the definitions of |
The definition of
Also, it seems to me that |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
|
@pytorchbot revert -m "introduce linting errors https://github.com/pytorch/pytorch/actions/runs/18123993236/job/51574878473" -c nosignal |
|
@pytorchbot successfully started a revert job. Check the current status here. |
Revert "Add functions to setup PrivateUse1 as a python backend device. (#157859)" This reverts commit 1310d6a. Reverted #157859 on behalf of https://github.com/jeanschmidt due to introduced linting errors ([comment](#157859 (comment)))
|
@qihqi your PR has been successfully reverted. |
|
Hi @jeanschmidt, would you point out what linting error you were seeing? I am looking at https://hud.pytorch.org/pr/157859 and couldn't find any. |
|
Relevant to point out: the issue introduced seems to come from the job lintrunner-noclang / linux-job. |
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Add functions to setup PrivateUse1 as a python backend device (pytorch#157859)

Fixes pytorch#156052 and pytorch#156444.

This PR sets up the PrivateUse1 key in Python to be used as a python backend for pytorch. Meaning that, after calling `setup_privateuseone_for_python_backend('npy')`, one can use a subclass with that device to hold arbitrary python data as "device data", and use `torch.library` to register ops that take that Tensor.

Changes done in this PR:
1. Register a vanilla Device Guard: extended NoOpDeviceGuard to allow a device index of 0 and to not raise errors when event-related functions are accessed. Without these changes, calling backward raises errors. (The CPU backend uses NoOpDeviceGuard just fine, although there seems to be special treatment of CPU in the autograd engine.)
2. Tensor subclasses are allowed to omit `__torch_dispatch__` if the device is not CUDA or CPU. The comment on the check suggests it was there to avoid a segfault when calling into ops that expect a storage. Here we have a different device, so we will not call into those ops.
3. A python function that invokes the other incantations to set up the PrivateUse1 backend.

This took inspiration from https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp; great thanks to @bdhirsh and @geohot.

Pull Request resolved: pytorch#157859
Approved by: https://github.com/albanD
|
FYI the test added here has been failing in periodic CI since it was added in October: https://hud.pytorch.org/failure?name=periodic%20%2F%20linux-jammy-cuda12.8-py3.10-gcc11-debug%20%2F%20test%20(default%2C%201%2C%207%2C%20linux.g6.4xlarge.experimental.nvidia.gpu%2C%20oncall%3Adebug-build)&jobName=undefined&failureCaptures=test%2Ftest_privateuseone_python_backend.py%3A%3APrivateUse1BackendTest%3A%3Atest_backend_simple We should fix this! (Not super urgent.) |
So sorry for this, I will fix it.


Fixes #156052 and #156444.
This PR setup the privateuseone key in Python to be used as a python backend for pytorch.
Meaning that, after calling
setup_privateuseone_for_python_backend('npy'), one can use a subclass to with that device to hold arbitrary python data as "device data" and usetorch.libraryto register ops that takes that Tensor.Changes done in this PR:
__torch_dispatch__if the device is not CUDA or CPU. The comment of the check suggests it was to avoid segfault when calling into ops that expects a storage. Here we have a different device so will not call into those ops.This took inspiration of https://github.com/bdhirsh/pytorch_open_registration_example and https://github.com/tinygrad/tinygrad/blob/master/extra/torch_backend/wrapped_tensor.cpp; great thanks to @bdhirsh and @geohot.