[OpenReg] Add Event&Stream Support for OpenReg Backend#160099

Closed
fffrog wants to merge 17 commits into gh/fffrog/131/base from gh/fffrog/131/head

Conversation

@fffrog
Collaborator

@fffrog fffrog commented Aug 7, 2025

Stack from ghstack (oldest at bottom):

Following the signatures and behavior of Stream and Event in CUDA, we use CPU multithreading
and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

Changes:

  • Add stream capabilities for OpenReg
  • Add event capabilities for OpenReg
  • Add kernel launch entrypoint for OpenReg
  • Add test cases for streams and events in OpenReg
  • Add an example for OpenReg
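
To illustrate the idea of emulating a stream with a CPU thread and a condition variable, here is a minimal sketch. It is not the code from this PR: the names `SketchStream`, `launch`, `synchronize`, `query`, and `run_stream_demo` are hypothetical, and the real OpenReg API may differ. The sketch assumes one worker thread per stream that drains a FIFO queue, so tasks on the same stream run in submission order.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: a "stream" is one worker thread draining a FIFO queue.
class SketchStream {
 public:
  SketchStream() : worker_([this] { run(); }) {}

  ~SketchStream() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    worker_.join();  // the worker drains any remaining tasks before exiting
  }

  SketchStream(const SketchStream&) = delete;
  SketchStream& operator=(const SketchStream&) = delete;

  // Asynchronous "kernel launch": enqueue the task and return immediately.
  void launch(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push_back(std::move(task));
    }
    cv_.notify_all();
  }

  // Block the caller until the queue is empty and no task is running.
  void synchronize() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return tasks_.empty() && !busy_; });
  }

  // Non-blocking check: true if all submitted work has finished.
  bool query() {
    std::lock_guard<std::mutex> lock(mutex_);
    return tasks_.empty() && !busy_;
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
      if (tasks_.empty()) return;  // stop requested and nothing left to run
      auto task = std::move(tasks_.front());
      tasks_.pop_front();
      busy_ = true;
      lock.unlock();  // run the task without holding the lock
      task();
      lock.lock();
      busy_ = false;
      cv_.notify_all();  // wake any thread blocked in synchronize()
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> tasks_;
  bool busy_ = false;
  bool stop_ = false;
  std::thread worker_;  // declared last so the other members exist first
};

// Launch four tasks on one stream and return the order in which they ran.
std::vector<int> run_stream_demo() {
  std::vector<int> order;
  SketchStream stream;
  for (int i = 0; i < 4; ++i) {
    stream.launch([&order, i] { order.push_back(i); });
  }
  stream.synchronize();
  assert(stream.query());
  return order;
}
```

Calling `run_stream_demo()` yields the task indices in submission order, because a single worker thread serializes everything queued on the stream.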

[ghstack-poisoned]
@pytorch-bot pytorch-bot bot added the topic: not user facing topic category label Aug 7, 2025
@pytorch-bot

pytorch-bot bot commented Aug 7, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/160099

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit 97c71da with merge base d8a0bdb (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

fffrog added 4 commits August 8, 2025 18:32
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@fffrog fffrog marked this pull request as ready for review August 12, 2025 11:47
@fffrog fffrog requested a review from albanD August 12, 2025 11:49
[ghstack-poisoned]
can-gaa-hou pushed a commit to can-gaa-hou/pytorch that referenced this pull request Aug 14, 2025
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading
and conditional variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

ghstack-source-id: 1c28226
Pull-Request-resolved: pytorch#160099
fffrog added 2 commits August 19, 2025 16:45
[ghstack-poisoned]
[ghstack-poisoned]
can-gaa-hou pushed a commit to can-gaa-hou/pytorch that referenced this pull request Aug 22, 2025
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading
and conditional variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

ghstack-source-id: 4f7e098
Pull-Request-resolved: pytorch#160099
[ghstack-poisoned]
Collaborator

@albanD albanD left a comment

This looks great at a high level.
It would be great to have some tests to make sure that the extension points behave as expected!

fffrog added 2 commits August 27, 2025 17:16
[ghstack-poisoned]
[ghstack-poisoned]
[ghstack-poisoned]
@fffrog
Collaborator Author

fffrog commented Aug 27, 2025

It would be great to have some tests to make sure that the extension points behave as expected!

Thank you very much, will add related tests right now.

[ghstack-poisoned]
Comment on lines +1 to +3
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

project(TORCH_OPENREG CXX C)
Collaborator Author

@fffrog fffrog Aug 29, 2025

These three lines were added to CMakeLists.txt because the file is used in two ways:

  • As a subdirectory of torch_openreg
  • As the entry point for a separate build, as described in the examples section of README.md

@fffrog
Collaborator Author

fffrog commented Aug 29, 2025

Hi @albanD, sorry to bother you again.

The new commit is ready. Please take a look when you have time. Thanks a lot!

Changes

  • Integrate googletest into OpenReg; the test cases now run by default when building OpenReg
  • Update README.md

Partial output:

  -- Added CUDA NVCC flags for: -gencode;arch=compute_75,code=sm_75
  -- Found Torch: /root/Git.d/pytorch/pytorch/torch/lib/libtorch.so
  -- Using GTest at /root/Git.d/pytorch/pytorch/third_party/googletest
  -- Configuring done (4.5s)
  -- Generating done (0.0s)
  -- Build files have been written to: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build
  [  3%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/device.cpp.o
  [  6%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/memory.cpp.o
  [ 13%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/stream.cpp.o
  [ 13%] Building CXX object googletest_build/googletest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
  [ 17%] Linking CXX shared library libopenreg.so
  [ 17%] Built target openreg
  [ 20%] Linking CXX static library ../../lib/libgtest_main.a
  [ 24%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/OpenRegExtra.cpp.o
  [ 31%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/OpenRegMinimal.cpp.o
  [ 31%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/native/Extra.cpp.o
  [ 34%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/native/Minimal.cpp.o
  [ 37%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegDeviceAllocator.cpp.o
  [ 41%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegException.cpp.o
  [ 44%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegFunctions.cpp.o
  [ 48%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegGenerator.cpp.o
  [ 51%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegGuard.cpp.o
  [ 55%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegHooks.cpp.o
  [ 62%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegSerialization.cpp.o
  [ 62%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegHostAllocator.cpp.o
  [ 65%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegStream.cpp.o
  [ 65%] Built target gtest_main
  [ 68%] Building CXX object googletest_build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
  [ 72%] Linking CXX static library ../../lib/libgtest.a
  [ 72%] Built target gtest
  [ 75%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/device_tests.cpp.o
  [ 82%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/event_tests.cpp.o
  [ 82%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/memory_tests.cpp.o
  [ 86%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/stream_tests.cpp.o
  [ 89%] Linking CXX executable ortests
  UpdateCTestConfiguration  from :/root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg/DartConfiguration.tcl
  Test project /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg
  Constructing a list of tests
  Done constructing a list of tests
  Updating test list for fixtures
  Added 0 tests to meet fixture requirements
  Checking test dependency graph...
  Checking test dependency graph end
  test 1
      Start 1: alltests

  1: Test command: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg/ortests
  1: Working Directory: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg
  1: Test timeout computed to be: 9999879
  1: Running main() from /root/Git.d/pytorch/pytorch/third_party/googletest/googletest/src/gtest_main.cc
  1: [==========] Running 23 tests from 4 test suites.
  1: [----------] Global test environment set-up.
  1: [----------] 4 tests from DeviceTestFixture
  1: [ RUN      ] DeviceTestFixture.GetDeviceCountValid
  1: [       OK ] DeviceTestFixture.GetDeviceCountValid (0 ms)
  1: [ RUN      ] DeviceTestFixture.GetDeviceValid
  1: [       OK ] DeviceTestFixture.GetDeviceValid (0 ms)
  1: [ RUN      ] DeviceTestFixture.SetDeviceValid
  1: [       OK ] DeviceTestFixture.SetDeviceValid (0 ms)
  1: [ RUN      ] DeviceTestFixture.SetDeviceInvalidNegative
  1: [       OK ] DeviceTestFixture.SetDeviceInvalidNegative (0 ms)
  1: [----------] 4 tests from DeviceTestFixture (0 ms total)
  1:
  1: [----------] 5 tests from EventTest
  1: [ RUN      ] EventTest.EventCreateAndDestroy
  1: [       OK ] EventTest.EventCreateAndDestroy (0 ms)
  1: [ RUN      ] EventTest.EventCreateWithFlagsTiming
  1: [       OK ] EventTest.EventCreateWithFlagsTiming (0 ms)
  1: [ RUN      ] EventTest.EventRecordAndSynchronize
  1: [       OK ] EventTest.EventRecordAndSynchronize (0 ms)
  1: [ RUN      ] EventTest.EventElapsedTime
  1: [       OK ] EventTest.EventElapsedTime (10 ms)
  1: [ RUN      ] EventTest.StreamWaitEvent
  1: [       OK ] EventTest.StreamWaitEvent (0 ms)
  1: [----------] 5 tests from EventTest (10 ms total)
  1:
  1: [----------] 9 tests from MemoryManagerTest
  1: [ RUN      ] MemoryManagerTest.AllocateAndFreeDevice
  1: [       OK ] MemoryManagerTest.AllocateAndFreeDevice (0 ms)
  1: [ RUN      ] MemoryManagerTest.AllocateAndFreeHost
  1: [       OK ] MemoryManagerTest.AllocateAndFreeHost (0 ms)
  1: [ RUN      ] MemoryManagerTest.AllocateNullptr
  1: [       OK ] MemoryManagerTest.AllocateNullptr (0 ms)
  1: [ RUN      ] MemoryManagerTest.AllocateZeroSize
  1: [       OK ] MemoryManagerTest.AllocateZeroSize (0 ms)
  1: [ RUN      ] MemoryManagerTest.MemcpyHostToDevice
  1: [       OK ] MemoryManagerTest.MemcpyHostToDevice (0 ms)
  1: [ RUN      ] MemoryManagerTest.MemcpyDeviceToDevice
  1: [       OK ] MemoryManagerTest.MemcpyDeviceToDevice (0 ms)
  1: [ RUN      ] MemoryManagerTest.MemcpyInvalidKind
  1: [       OK ] MemoryManagerTest.MemcpyInvalidKind (0 ms)
  1: [ RUN      ] MemoryManagerTest.PointerAttributes
  1: [       OK ] MemoryManagerTest.PointerAttributes (0 ms)
  1: [ RUN      ] MemoryManagerTest.ProtectUnprotectDevice
  1: [       OK ] MemoryManagerTest.ProtectUnprotectDevice (0 ms)
  1: [----------] 9 tests from MemoryManagerTest (0 ms total)
  1:
  1: [----------] 5 tests from StreamTest
  1: [ RUN      ] StreamTest.StreamCreateAndDestroy
  1: [       OK ] StreamTest.StreamCreateAndDestroy (0 ms)
  1: [ RUN      ] StreamTest.StreamCreateWithInvalidPriority
  1: [       OK ] StreamTest.StreamCreateWithInvalidPriority (0 ms)
  1: [ RUN      ] StreamTest.StreamTaskExecution
  1: [       OK ] StreamTest.StreamTaskExecution (0 ms)
  1: [ RUN      ] StreamTest.StreamQuery
  1: [       OK ] StreamTest.StreamQuery (0 ms)
  1: [ RUN      ] StreamTest.DeviceSynchronize
  1: [       OK ] StreamTest.DeviceSynchronize (0 ms)
  1: [----------] 5 tests from StreamTest (0 ms total)
  1:
  1: [----------] Global test environment tear-down
  1: [==========] 23 tests from 4 test suites ran. (11 ms total)
  1: [  PASSED  ] 23 tests.
  1/1 Test #1: alltests .........................   Passed    0.01 sec

  100% tests passed, 0 tests failed out of 1

  Total Test time (real) =   0.01 sec
  [ 89%] Built target ortests
  [ 93%] Linking CXX shared library libtorch_openreg.so
  [ 93%] Built target torch_openreg
  [ 96%] Building CXX object torch_openreg/csrc/CMakeFiles/torch_bindings.dir/Module.cpp.o
  [100%] Linking CXX shared library libtorch_bindings.so
  [100%] Built target torch_bindings
  Install the project...
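
The EventTest cases in the log above (for example EventRecordAndSynchronize and StreamWaitEvent) exercise event semantics. As a rough illustration of how an event can be emulated with a condition variable, here is a minimal sketch. It is not the code from this PR: `SketchEvent`, `complete`, `synchronize`, `query`, and `run_event_demo` are hypothetical names, and two plain `std::thread` objects stand in for the producer and consumer streams.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical sketch: an "event" is a one-shot flag guarded by a mutex.
class SketchEvent {
 public:
  // Called by the producing side once it reaches the recorded point.
  void complete() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_all();
  }

  // Block the caller until the event has completed.
  void synchronize() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return done_; });
  }

  // Non-blocking check: true once the event has completed.
  bool query() {
    std::lock_guard<std::mutex> lock(mutex_);
    return done_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool done_ = false;
};

// The consumer waits on the event before reading what the producer wrote.
int run_event_demo() {
  int value = 0;
  int seen = 0;
  SketchEvent event;

  // Producer "stream": write the result, then mark the event complete.
  std::thread producer([&] {
    value = 41;
    event.complete();
  });

  // Consumer "stream": wait for the event, then use the result.
  std::thread consumer([&] {
    event.synchronize();
    seen = value + 1;
  });

  producer.join();
  consumer.join();
  assert(event.query());
  return seen;
}
```

The mutex acquired in `synchronize()` orders the consumer's read after the producer's write, which is the cross-stream dependency that a stream-wait-event call expresses.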

[ghstack-poisoned]
@fffrog fffrog requested a review from albanD August 29, 2025 14:48
Collaborator

@albanD albanD left a comment

Thanks for the updates. Looks great!

@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #160100

fffrog added 2 commits August 30, 2025 12:02
[ghstack-poisoned]
[ghstack-poisoned]
fffrog added a commit that referenced this pull request Aug 30, 2025
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading
and conditional variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

**Changes:**

- Add stream capabilities for OpenReg
- Add event capabilities for OpenReg
- Add kernel launch entrypoint for OpenReg
- Add testcases about stream and event for OpenReg
- Add example for OpenReg

ghstack-source-id: 3c06c94
Pull-Request-resolved: #160099
@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #160100

[ghstack-poisoned]
@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #160100

pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
As the title stated.
Pull Request resolved: #161773
Approved by: https://github.com/albanD
ghstack dependencies: #161603, #160099
pytorchmergebot pushed a commit that referenced this pull request Aug 30, 2025
…160100)

We integrated the openreg backend’s `Stream` and `Event` into PyTorch, all of which are similar
to other accelerators like `CUDA`, `XPUs`, etc.
Pull Request resolved: #160100
Approved by: https://github.com/albanD
ghstack dependencies: #161603, #160099, #161773
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading
and conditional variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

**Changes:**

- Add stream capabilities for OpenReg
- Add event capabilities for OpenReg
- Add kernel launch entrypoint for OpenReg
- Add testcases about stream and event for OpenReg
- Add example for OpenReg
Pull Request resolved: pytorch#160099
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#161603
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ytorch#160100)

We integrated the openreg backend’s `Stream` and `Event` into PyTorch, all of which are similar
to other accelerators like `CUDA`, `XPUs`, etc.
Pull Request resolved: pytorch#160100
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#161603, pytorch#160099, pytorch#161773
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading
and conditional variables to implement equivalent capabilities as the underlying foundation of torch_openreg.

**Changes:**

- Add stream capabilities for OpenReg
- Add event capabilities for OpenReg
- Add kernel launch entrypoint for OpenReg
- Add testcases about stream and event for OpenReg
- Add example for OpenReg
Pull Request resolved: pytorch#160099
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#161603
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
…ytorch#160100)

We integrated the openreg backend’s `Stream` and `Event` into PyTorch, all of which are similar
to other accelerators like `CUDA`, `XPUs`, etc.
Pull Request resolved: pytorch#160100
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#161603, pytorch#160099, pytorch#161773
@github-actions github-actions bot deleted the gh/fffrog/131/head branch September 30, 2025 02:08