[OpenReg] Add Event&Stream Support for OpenReg Backend #160099
fffrog wants to merge 17 commits into gh/fffrog/131/base
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.
ghstack-source-id: 1c28226
Pull-Request-resolved: pytorch#160099
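To make the idea in this commit message concrete, here is a minimal sketch of how a stream can be emulated with one CPU worker thread and a condition variable. It is an illustration under stated assumptions, not the code from this PR: `ToyStream`, `enqueue`, `synchronize`, `query`, and `run_in_order` are hypothetical names chosen for this example, and the real OpenReg implementation may be structured differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy stream: a single worker thread drains a FIFO task queue,
// so tasks run asynchronously with respect to the caller but in order.
class ToyStream {
 public:
  ToyStream() : worker_([this] { run(); }) {}

  ~ToyStream() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    cv_.notify_all();
    worker_.join();  // the worker drains remaining tasks before exiting
  }

  ToyStream(const ToyStream&) = delete;
  ToyStream& operator=(const ToyStream&) = delete;

  // Asynchronous "launch": queues the task and returns immediately.
  void enqueue(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
      ++pending_;
    }
    cv_.notify_all();
  }

  // Blocks until every task enqueued so far has finished.
  void synchronize() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return pending_ == 0; });
  }

  // Non-blocking check: true when no task is queued or running.
  bool query() {
    std::lock_guard<std::mutex> lock(mutex_);
    return pending_ == 0;
  }

 private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stop_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // stop requested and queue is empty
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so callers can keep enqueuing
      {
        std::lock_guard<std::mutex> lock(mutex_);
        --pending_;
      }
      cv_.notify_all();  // wake any thread blocked in synchronize()
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  int pending_ = 0;
  bool stop_ = false;
  std::thread worker_;  // declared last so the other members exist first
};

// Enqueues n tasks and returns the order in which they executed.
std::vector<int> run_in_order(int n) {
  std::vector<int> out;
  ToyStream stream;
  for (int i = 0; i < n; ++i) {
    stream.enqueue([&out, i] { out.push_back(i); });
  }
  stream.synchronize();
  assert(stream.query());
  return out;
}
```

Calling `run_in_order(5)` enqueues five tasks and waits for them. Because a single worker drains the queue, the tasks complete in submission order, which mirrors the in-order guarantee of a CUDA stream.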
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.
ghstack-source-id: 4f7e098
Pull-Request-resolved: pytorch#160099
albanD
left a comment
This looks great at a high level.
It would be great to have some tests to make sure that the extension points behave as expected!
Thank you very much. I will add the related tests right away.
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

project(TORCH_OPENREG CXX C)
These three lines are added to the CMakeLists file because it serves two purposes:
- Part 1: as a subdirectory of `torch_openreg`.
- Part 2: as an entry point for a separate build, as described in the examples section of the README.md file.
Hi @albanD, sorry to bother you again. The new commit is ready; please take a look when you are free. Thank you very much.
Changes
Partial Outputs:
-- Added CUDA NVCC flags for: -gencode;arch=compute_75,code=sm_75
-- Found Torch: /root/Git.d/pytorch/pytorch/torch/lib/libtorch.so
-- Using GTest at /root/Git.d/pytorch/pytorch/third_party/googletest
-- Configuring done (4.5s)
-- Generating done (0.0s)
-- Build files have been written to: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build
[ 3%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/device.cpp.o
[ 6%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/memory.cpp.o
[ 13%] Building CXX object third_party/openreg/CMakeFiles/openreg.dir/csrc/stream.cpp.o
[ 13%] Building CXX object googletest_build/googletest/CMakeFiles/gtest_main.dir/src/gtest_main.cc.o
[ 17%] Linking CXX shared library libopenreg.so
[ 17%] Built target openreg
[ 20%] Linking CXX static library ../../lib/libgtest_main.a
[ 24%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/OpenRegExtra.cpp.o
[ 31%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/OpenRegMinimal.cpp.o
[ 31%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/native/Extra.cpp.o
[ 34%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/aten/native/Minimal.cpp.o
[ 37%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegDeviceAllocator.cpp.o
[ 41%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegException.cpp.o
[ 44%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegFunctions.cpp.o
[ 48%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegGenerator.cpp.o
[ 51%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegGuard.cpp.o
[ 55%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegHooks.cpp.o
[ 62%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegSerialization.cpp.o
[ 62%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegHostAllocator.cpp.o
[ 65%] Building CXX object csrc/CMakeFiles/torch_openreg.dir/runtime/OpenRegStream.cpp.o
[ 65%] Built target gtest_main
[ 68%] Building CXX object googletest_build/googletest/CMakeFiles/gtest.dir/src/gtest-all.cc.o
[ 72%] Linking CXX static library ../../lib/libgtest.a
[ 72%] Built target gtest
[ 75%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/device_tests.cpp.o
[ 82%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/event_tests.cpp.o
[ 82%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/memory_tests.cpp.o
[ 86%] Building CXX object third_party/openreg/CMakeFiles/ortests.dir/tests/stream_tests.cpp.o
[ 89%] Linking CXX executable ortests
UpdateCTestConfiguration from :/root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg/DartConfiguration.tcl
Test project /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 1
Start 1: alltests
1: Test command: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg/ortests
1: Working Directory: /root/Git.d/pytorch/pytorch/test/cpp_extensions/open_registration_extension/torch_openreg/build/third_party/openreg
1: Test timeout computed to be: 9999879
1: Running main() from /root/Git.d/pytorch/pytorch/third_party/googletest/googletest/src/gtest_main.cc
1: [==========] Running 23 tests from 4 test suites.
1: [----------] Global test environment set-up.
1: [----------] 4 tests from DeviceTestFixture
1: [ RUN ] DeviceTestFixture.GetDeviceCountValid
1: [ OK ] DeviceTestFixture.GetDeviceCountValid (0 ms)
1: [ RUN ] DeviceTestFixture.GetDeviceValid
1: [ OK ] DeviceTestFixture.GetDeviceValid (0 ms)
1: [ RUN ] DeviceTestFixture.SetDeviceValid
1: [ OK ] DeviceTestFixture.SetDeviceValid (0 ms)
1: [ RUN ] DeviceTestFixture.SetDeviceInvalidNegative
1: [ OK ] DeviceTestFixture.SetDeviceInvalidNegative (0 ms)
1: [----------] 4 tests from DeviceTestFixture (0 ms total)
1:
1: [----------] 5 tests from EventTest
1: [ RUN ] EventTest.EventCreateAndDestroy
1: [ OK ] EventTest.EventCreateAndDestroy (0 ms)
1: [ RUN ] EventTest.EventCreateWithFlagsTiming
1: [ OK ] EventTest.EventCreateWithFlagsTiming (0 ms)
1: [ RUN ] EventTest.EventRecordAndSynchronize
1: [ OK ] EventTest.EventRecordAndSynchronize (0 ms)
1: [ RUN ] EventTest.EventElapsedTime
1: [ OK ] EventTest.EventElapsedTime (10 ms)
1: [ RUN ] EventTest.StreamWaitEvent
1: [ OK ] EventTest.StreamWaitEvent (0 ms)
1: [----------] 5 tests from EventTest (10 ms total)
1:
1: [----------] 9 tests from MemoryManagerTest
1: [ RUN ] MemoryManagerTest.AllocateAndFreeDevice
1: [ OK ] MemoryManagerTest.AllocateAndFreeDevice (0 ms)
1: [ RUN ] MemoryManagerTest.AllocateAndFreeHost
1: [ OK ] MemoryManagerTest.AllocateAndFreeHost (0 ms)
1: [ RUN ] MemoryManagerTest.AllocateNullptr
1: [ OK ] MemoryManagerTest.AllocateNullptr (0 ms)
1: [ RUN ] MemoryManagerTest.AllocateZeroSize
1: [ OK ] MemoryManagerTest.AllocateZeroSize (0 ms)
1: [ RUN ] MemoryManagerTest.MemcpyHostToDevice
1: [ OK ] MemoryManagerTest.MemcpyHostToDevice (0 ms)
1: [ RUN ] MemoryManagerTest.MemcpyDeviceToDevice
1: [ OK ] MemoryManagerTest.MemcpyDeviceToDevice (0 ms)
1: [ RUN ] MemoryManagerTest.MemcpyInvalidKind
1: [ OK ] MemoryManagerTest.MemcpyInvalidKind (0 ms)
1: [ RUN ] MemoryManagerTest.PointerAttributes
1: [ OK ] MemoryManagerTest.PointerAttributes (0 ms)
1: [ RUN ] MemoryManagerTest.ProtectUnprotectDevice
1: [ OK ] MemoryManagerTest.ProtectUnprotectDevice (0 ms)
1: [----------] 9 tests from MemoryManagerTest (0 ms total)
1:
1: [----------] 5 tests from StreamTest
1: [ RUN ] StreamTest.StreamCreateAndDestroy
1: [ OK ] StreamTest.StreamCreateAndDestroy (0 ms)
1: [ RUN ] StreamTest.StreamCreateWithInvalidPriority
1: [ OK ] StreamTest.StreamCreateWithInvalidPriority (0 ms)
1: [ RUN ] StreamTest.StreamTaskExecution
1: [ OK ] StreamTest.StreamTaskExecution (0 ms)
1: [ RUN ] StreamTest.StreamQuery
1: [ OK ] StreamTest.StreamQuery (0 ms)
1: [ RUN ] StreamTest.DeviceSynchronize
1: [ OK ] StreamTest.DeviceSynchronize (0 ms)
1: [----------] 5 tests from StreamTest (0 ms total)
1:
1: [----------] Global test environment tear-down
1: [==========] 23 tests from 4 test suites ran. (11 ms total)
1: [ PASSED ] 23 tests.
1/1 Test #1: alltests ......................... Passed 0.01 sec
100% tests passed, 0 tests failed out of 1
Total Test time (real) = 0.01 sec
[ 89%] Built target ortests
[ 93%] Linking CXX shared library libtorch_openreg.so
[ 93%] Built target torch_openreg
[ 96%] Building CXX object torch_openreg/csrc/CMakeFiles/torch_bindings.dir/Module.cpp.o
[100%] Linking CXX shared library libtorch_bindings.so
[100%] Built target torch_bindings
Install the project...
albanD
left a comment
Thanks for the updates. Looks great!
Starting merge as part of PR stack under #160100
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.
**Changes:**
- Add stream capabilities for OpenReg
- Add event capabilities for OpenReg
- Add kernel launch entrypoint for OpenReg
- Add test cases for stream and event in OpenReg
- Add example for OpenReg
ghstack-source-id: 3c06c94
Pull-Request-resolved: #160099
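As a rough illustration of the event capability listed above, the sketch below shows one way an event could be built on a mutex and a condition variable: a worker marks the event complete and stamps the time, and other threads can block on it or compute the elapsed time between two events. All names here (`ToyEvent`, `complete`, `stamp`, `elapsed_ms`, `timed_sleep_ms`) are hypothetical and are not taken from the PR; a plain `std::thread` stands in for a stream.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical toy event: completion flag plus timestamp, guarded by a mutex.
class ToyEvent {
 public:
  using Clock = std::chrono::steady_clock;

  // Called by the worker when execution reaches the recorded point.
  void complete() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
      stamp_ = Clock::now();
    }
    cv_.notify_all();
  }

  // Blocks the calling thread until the event has completed.
  void synchronize() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return done_; });
  }

  // Non-blocking check of the completion flag.
  bool query() {
    std::lock_guard<std::mutex> lock(mutex_);
    return done_;
  }

  // Waits for completion, then returns the time at which it happened.
  Clock::time_point stamp() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return done_; });
    return stamp_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool done_ = false;
  Clock::time_point stamp_{};
};

// Milliseconds between the completion of two events.
double elapsed_ms(ToyEvent& start, ToyEvent& end) {
  return std::chrono::duration<double, std::milli>(end.stamp() - start.stamp())
      .count();
}

// Times a sleep on a worker thread using a pair of events.
double timed_sleep_ms(int sleep_ms) {
  ToyEvent start;
  ToyEvent end;
  std::thread worker([&start, &end, sleep_ms] {
    start.complete();
    std::this_thread::sleep_for(std::chrono::milliseconds(sleep_ms));
    end.complete();
  });
  end.synchronize();  // the host blocks here, like an event synchronize call
  assert(end.query());
  worker.join();
  return elapsed_ms(start, end);
}
```

In this sketch the timestamp is taken when the worker reaches the event, not when the host requests the recording, which is the property that makes elapsed-time measurements between two events meaningful.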
As the title stated. Pull Request resolved: #161773 Approved by: https://github.com/albanD ghstack dependencies: #161603, #160099
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.
**Changes:**
- Add stream capabilities for OpenReg
- Add event capabilities for OpenReg
- Add kernel launch entrypoint for OpenReg
- Add test cases for stream and event in OpenReg
- Add example for OpenReg
Pull Request resolved: pytorch#160099
Approved by: https://github.com/albanD
ghstack dependencies: pytorch#161603
As the title stated. Pull Request resolved: pytorch#161773 Approved by: https://github.com/albanD ghstack dependencies: pytorch#161603, pytorch#160099
…ytorch#160100) We integrated the OpenReg backend's `Stream` and `Event` into PyTorch; both behave similarly to those of other accelerators such as `CUDA` and `XPU`. Pull Request resolved: pytorch#160100 Approved by: https://github.com/albanD ghstack dependencies: pytorch#161603, pytorch#160099, pytorch#161773
Stack from ghstack (oldest at bottom):
Referring to the signatures and functions of `Stream` and `Event` in CUDA, we use CPU multithreading and condition variables to implement equivalent capabilities as the underlying foundation of torch_openreg.
Changes: