[dist] expose unsafe_get_ptr for dist.ProcessGroupNCCL.NCCLConfig#161136
[dist] expose unsafe_get_ptr for dist.ProcessGroupNCCL.NCCLConfig#161136youkaichao wants to merge 6 commits intopytorch:mainfrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161136
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 8f1a9e5 with merge base 7f201ba ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
kwen2501
left a comment
There was a problem hiding this comment.
Btw, in Python (in particular CPython), you can always get an address of a python object by the built-in id(obj) method.
torch/csrc/distributed/c10d/init.cpp
Outdated
| .def_readwrite("nvls_ctas", &ncclConfig_t::nvlsCTAs) | ||
| #endif | ||
| .def_property_readonly( | ||
| "ptr", |
There was a problem hiding this comment.
Do you mind naming it as unsafe_get_ptr? The reason is that in line 3382 above the object is created via std::make_unique thus this object could disappear after one gets the inner pointer. (With the rename, this entry would probably look more like a method than a property.)
torch/csrc/distributed/c10d/init.cpp
Outdated
| .def_property_readonly( | ||
| "ptr", | ||
| [](const ncclConfig_t& self) { | ||
| return reinterpret_cast<int64_t>(&self); |
There was a problem hiding this comment.
Use uintptr_t instead of int64_t?
torch/_C/_distributed_c10d.pyi
Outdated
| cga_cluster_size: int | ||
| min_ctas: int | ||
| max_ctas: int | ||
| ptr: int |
There was a problem hiding this comment.
No need if we make it a unsafe_get_ptr method instead of property.
|
@kwen2501 updated according to your comments. |
|
@pytorchmergebot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
…torch#161136) expose the pointer so that we can create the `ncclConfig_t` object from pytorch and use it elsewhere. this is useful to control the nccl communicator parameters for multiple nccl communicators. Pull Request resolved: pytorch#161136 Approved by: https://github.com/kwen2501
expose the pointer so that we can create the
ncclConfig_tobject from pytorch and use it elsewhere. this is useful to control the nccl communicator parameters for multiple nccl communicators.cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta