Skip to content

Change ATEN generator argument type to const std::optional<Generator>&#120076

Open
cyyever wants to merge 4 commits intopytorch:mainfrom
cyyever:const_generator
Open

Change ATEN generator argument type to const std::optional<Generator>&#120076
cyyever wants to merge 4 commits intopytorch:mainfrom
cyyever:const_generator

Conversation

@cyyever
Copy link
Copy Markdown
Collaborator

@cyyever cyyever commented Feb 16, 2024

This PR proposes to use std::optional& for underlying functions to avoid unnecessary copy and move operations. The torchgen code was changed to generate the new type.

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Feb 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120076

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 231c64d with merge base 52e9049 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot Bot added ciflow/mps Run MPS tests (subset of trunk) release notes: mps Release notes category labels Feb 16, 2024
@cyyever cyyever marked this pull request as draft February 16, 2024 12:53
@github-actions github-actions Bot added the module: cpu CPU specific problem (e.g., perf, algorithm) label Feb 16, 2024
@cyyever cyyever force-pushed the const_generator branch 3 times, most recently from 3518e8c to 9282a25 Compare February 16, 2024 13:01
@cyyever cyyever force-pushed the const_generator branch 6 times, most recently from dc335cc to 0cd72e6 Compare February 16, 2024 15:02
@cyyever cyyever changed the title Change interfaces to use const c10::optional<Generator>& Change aten generator arguments to use const c10::optional<Generator>& Feb 16, 2024
@cyyever cyyever changed the title Change aten generator arguments to use const c10::optional<Generator>& Change ATEN generator argument type to const c10::optional<Generator>& Feb 17, 2024
@cyyever cyyever marked this pull request as ready for review February 17, 2024 01:58
@cyyever cyyever requested a review from ezyang February 17, 2024 01:58
@cyyever cyyever marked this pull request as draft February 17, 2024 03:48
@cyyever cyyever marked this pull request as ready for review February 17, 2024 15:05
@soulitzer soulitzer added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Feb 20, 2024
@ezyang
Copy link
Copy Markdown
Contributor

ezyang commented Feb 20, 2024

This is still not an optimal type for Generator, because if you have a Generator and you need to create a const optional<Generator>&, you still need to do a copy assignment into the optional (because the optional is designated as owning the Generator it has internally).

We have an analogous problem with optional tensors, and ultimately we had to do a trick with OptionalTensorRef to get the efficient version. If we're going through all the trouble of changing the API here, we probably should do this here too.

Copy link
Copy Markdown
Contributor

@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but as you change this code anyway, do you mind using std::optional (as c10::optional is a soon to be deprecated alias)

@jeanschmidt
Copy link
Copy Markdown
Contributor

jeanschmidt commented Mar 22, 2024

@pytorchbot revert -m "Reverting in order to check if this will fix XLA trunk jobs" -c ignoredsignal

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Mar 22, 2024
…nerator>& (#120076)"

This reverts commit ecbe82b.

Reverted #120076 on behalf of https://github.com/jeanschmidt due to Reverting in order to check if this will fix XLA trunk jobs ([comment](#120076 (comment)))
@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@cyyever your PR has been successfully reverted.

@cyyever
Copy link
Copy Markdown
Collaborator Author

cyyever commented Mar 24, 2024

@pytorchmergebot merge -i

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

Merge started

Your change will be merged while ignoring the following 2 checks: linux-binary-conda / conda-py3_10-cuda11_8-build / build, trunk / win-vs2019-cpu-py3 / test (default, 1, 3, windows.4xlarge.nonephemeral)

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

pytorchmergebot pushed a commit that referenced this pull request Mar 24, 2024
…p.py (#121415)

This PR is a follow-up of #120076, it moves std::optional<Generator> detection logic into  ```valuetype_type``` of api/cpp.py by adding the mutable parameter, which facilitates future value type changes.

Pull Request resolved: #121415
Approved by: https://github.com/ezyang
@atalman
Copy link
Copy Markdown
Contributor

atalman commented Mar 25, 2024

HI This fails internally with error:

stderr:
terminate called after throwing an instance of 'c10::Error'
  what():  
Mismatch in kernel C++ signatures
  operator: aten::random_.from(Tensor(a!) self, int from, int? to, *, Generator? generator=None) -> Tensor(a!)
    registered at buck-out/v2/gen/fbcode/45fe2882ccb6f7da/caffe2/__gen_aten__/out/RegisterSchema.cpp:6
  kernel 1: at::Tensor& (at::Tensor&, long, std::optional<long>, std::optional<at::Generator> const&)
    dispatch key: VmapMode
    registered at fbcode/caffe2/aten/src/ATen/VmapModeRegistrations.cpp:37
  kernel 2: at::Tensor& (at::Tensor&, long, std::optional<long>, std::optional<at::Generator>)
    dispatch key: CustomRNGKeyId
    registered at fbcode/pytorch/csprng/torchcsprng/csrc/csprng.cpp:344

Exception raised from registerKernel at fbcode/caffe2/aten/src/ATen/core/dispatch/OperatorEntry.cpp:130 (most recent call first):
# 0  c10::get_backtrace[abi:cxx11](unsigned long, unsigned long, bool)
# 1  0x00000000246579dc
# 2  c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
# 3  c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
# 4  c10::impl::OperatorEntry::registerKernel(c10::Dispatcher const&, std::optional<c10::DispatchKey>, c10::KernelFunction, std::optional<c10::impl::CppSignature>, std::unique_ptr<c10::FunctionSchema, std::default_delete<c10::FunctionSchema> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
# 5  c10::Dispatcher::registerImpl(c10::OperatorName, std::optional<c10::DispatchKey>, c10::KernelFunction, std::optional<c10::impl::CppSignature>, std::unique_ptr<c10::FunctionSchema, std::default_delete<c10::FunctionSchema> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
# 6  torch::Library::_impl(char const*, torch::CppFunction&&, torch::_RegisterOrVerify) &
# 7  0x000000001fcae596
# 8  _ZN5torch6detail16TorchLibraryInitC2ENS_7Library4KindEPFvRS2_EPKcSt8optionalIN3c1011DispatchKeyEES8_j_torchcsprng$_C
# 9  0x000000001fcc4738
# 10 __libc_start_main_alias_2
# 11 _start


exit_code:"Process terminated by the signal: 6"

Pointing to this csprng : https://github.com/pytorch/csprng/blob/main/torchcsprng/csrc/csprng.cpp#L344

@atalman
Copy link
Copy Markdown
Contributor

atalman commented Mar 25, 2024

@pytorchbot revert -m "breaking internal builds" -c ghfirst

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Mar 25, 2024
…nerator>& (#120076)"

This reverts commit a52b4e2.

Reverted #120076 on behalf of https://github.com/atalman due to breaking internal builds ([comment](#120076 (comment)))
@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@cyyever your PR has been successfully reverted.

@cyyever
Copy link
Copy Markdown
Collaborator Author

cyyever commented Mar 26, 2024

@alanwaketan @ezyang https://github.com/pytorch/csprng should be updated for this PR, how can we do that because we didn't have a coordination protocol like PyTorch/XLA?

@alanwaketan
Copy link
Copy Markdown
Collaborator

@cyyever I guess I will need to revert your change in PyTorch/XLA for now?

@ezyang
Copy link
Copy Markdown
Contributor

ezyang commented Mar 26, 2024

@cyyever At this point, you might have to just take the long way around and add BC logic that can handle the old overload types. Fortunately, it should be possible to detect them from C++ metaprogramming and then you can transparently shim them to the new type.

If you still want to try landing directly again, we will need to make this an fbcode first diff, as the csprng will need to be landed simultaneously to fbcode. That means someone at Meta has to volunteer to shepherd the rest of the patch.

@github-actions
Copy link
Copy Markdown
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/binaries Trigger all binary build and upload jobs on the PR ciflow/inductor ciflow/mps Run MPS tests (subset of trunk) ciflow/trunk Trigger trunk jobs on your pull request Merged module: cpu CPU specific problem (e.g., perf, algorithm) no-stale open source release notes: mps Release notes category Reverted triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.