Skip to content

[caffe2] Lazily symbolize backtrace in c10::Error#125682

Closed
ot wants to merge 1 commit intopytorch:mainfrom
ot:export-D56586844
Closed

[caffe2] Lazily symbolize backtrace in c10::Error#125682
ot wants to merge 1 commit intopytorch:mainfrom
ot:export-D56586844

Conversation

@ot
Copy link
Contributor

@ot ot commented May 7, 2024

Summary:
The macros that build c10::Error compute the stack trace at the point of throwing, which is then returned as part of the what(). If what() is never called, which is the case for most exceptions (since logging is throttled), the cost of computing the stack trace was wasted.

By far, the most expensive part of computing the stack trace is its symbolization; just unwinding the stack and collecting the instruction addresses is comparatively cheap. We can thus defer the symbolization to first invocation of what().

Test Plan:
Added unit tests exercising the lazy nature of what().

Ran an adfinder canary: https://www.internalfb.com/intern/ads/canary/460118801509424346

We can see that the cost of symbolization is obliterated (meaning that what() is virtually never called, as expected):
{F1496627896}

Reviewed By: ezyang

Differential Revision: D56586844

@pytorch-bot
Copy link

pytorch-bot bot commented May 7, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/125682

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit e6ca7e5 with merge base 848fce3 (image):

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented May 7, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: ot / name: Giuseppe Ottaviano (e6ca7e5)

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D56586844

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D56586844

@ot ot force-pushed the export-D56586844 branch from b992564 to 98fcaef Compare May 7, 2024 14:27
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D56586844

Summary:

The macros that build `c10::Error` compute the stack trace at the point of throwing, which is then returned as part of the `what()`. If `what()` is never called, which is the case for most exceptions (since logging is throttled), the cost of computing the stack trace was wasted.

By far, the most expensive part of computing the stack trace is its symbolization; just unwinding the stack and collecting the instruction addresses is comparatively cheap. We can thus defer the symbolization to first invocation of `what()`.

Test Plan:
Added unit tests exercising the lazy nature of `what()`.

Ran an adfinder canary: https://www.internalfb.com/intern/ads/canary/460118801509424346

We can see that the cost of symbolization is obliterated (meaning that `what()` is virtually never called, as expected):
 {F1496627896}

Reviewed By: ezyang

Differential Revision: D56586844
@ot ot force-pushed the export-D56586844 branch from 98fcaef to e6ca7e5 Compare May 7, 2024 16:22
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D56586844

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 7, 2024
@facebook-github-bot
Copy link
Contributor

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

ot added a commit to ot/pytorch that referenced this pull request May 8, 2024
Summary:

pytorch#125682 (D56586844) added support for lazy symbolization to `Error` and adopted it for internal use cases; this commit adopts it for `get_backtrace()` as well.

Test Plan: Sandcastle and GH CI.

Differential Revision: D56881683
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented May 8, 2024

@pytorchbot revert -m="The test added fails on windows" -c="ignoredsignal"

This Pull Request has been reverted by a revert inside Meta. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).)

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@ot your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request May 8, 2024
This reverts commit 08f6ef0.

Reverted #125682 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](#125682 (comment)))
pytorchmergebot pushed a commit that referenced this pull request May 10, 2024
Summary: #125682 (D56586844) added support for lazy symbolization to `Error` and adopted it for internal use cases; this commit adopts it for `get_backtrace()` as well.

Test Plan: Sandcastle and GH CI.

Differential Revision: D56881683

Pull Request resolved: #125750
Approved by: https://github.com/ezyang
ot added a commit to ot/pytorch that referenced this pull request May 13, 2024
Summary:


pytorch#125682 (D56586844) added support for lazy symbolization to `Error` and adopted it for internal use cases; this commit adopts it for `get_backtrace()` as well.

Test Plan:
Sandcastle and GH CI.

NOTE: This is a resubmit of D56881683, a spurious copypasted line in the Android implementation broke the build, but this was not surfaced by diff tests.

Reproed the breakage with
```
$ fbpython scripts/build_android_app/build_android_app.py --buck-config-files='@//fbandroid/mode/have_libgflags @//fbandroid/mode/static_linking @//xplat/langtech/mobile/android_opt_buck_config_with_et_boltnn' --build-target='fbsource//xplat/langtech/mobile:transcribe_binAndroid-android-arm64'
```
Verified that the fixed diff builds successfully.

Differential Revision: D57275456
pytorchmergebot pushed a commit that referenced this pull request May 13, 2024
…6064)

Summary:

#125682 (D56586844) added support for lazy symbolization to `Error` and adopted it for internal use cases; this commit adopts it for `get_backtrace()` as well.

Test Plan:
Sandcastle and GH CI.

NOTE: This is a resubmit of D56881683, a spurious copypasted line in the Android implementation broke the build, but this was not surfaced by diff tests.

Reproed the breakage with
```
$ fbpython scripts/build_android_app/build_android_app.py --buck-config-files='@//fbandroid/mode/have_libgflags @//fbandroid/mode/static_linking @//xplat/langtech/mobile/android_opt_buck_config_with_et_boltnn' --build-target='fbsource//xplat/langtech/mobile:transcribe_binAndroid-android-arm64'
```
Verified that the fixed diff builds successfully.

Differential Revision: D57275456

Pull Request resolved: #126064
Approved by: https://github.com/ezyang
tinglvv pushed a commit to tinglvv/pytorch that referenced this pull request May 14, 2024
… (pytorch#126064)

Summary:

pytorch#125682 (D56586844) added support for lazy symbolization to `Error` and adopted it for internal use cases; this commit adopts it for `get_backtrace()` as well.

Test Plan:
Sandcastle and GH CI.

NOTE: This is a resubmit of D56881683, a spurious copypasted line in the Android implementation broke the build, but this was not surfaced by diff tests.

Reproed the breakage with
```
$ fbpython scripts/build_android_app/build_android_app.py --buck-config-files='@//fbandroid/mode/have_libgflags @//fbandroid/mode/static_linking @//xplat/langtech/mobile/android_opt_buck_config_with_et_boltnn' --build-target='fbsource//xplat/langtech/mobile:transcribe_binAndroid-android-arm64'
```
Verified that the fixed diff builds successfully.

Differential Revision: D57275456

Pull Request resolved: pytorch#126064
Approved by: https://github.com/ezyang
@ot
Copy link
Contributor Author

ot commented Jun 25, 2024

Resubmitted and merged as #125787, closing this one.

@ot ot closed this Jun 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request fb-exported Merged Reverted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants