[PyTorch] Pass TensorOptions by value#51165
Closed
swolchok wants to merge 5 commits intogh/swolchok/93/basefrom
Closed
[PyTorch] Pass TensorOptions by value#51165swolchok wants to merge 5 commits intogh/swolchok/93/basefrom
swolchok wants to merge 5 commits intogh/swolchok/93/basefrom
Conversation
`TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) [ghstack-poisoned]
Contributor
💊 CI failures summary and remediationsAs of commit 6eccd63 (more details on the Dr. CI page):
This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions to the (internal) Dr. CI Users group. |
This was referenced Jan 27, 2021
swolchok
added a commit
that referenced
this pull request
Jan 27, 2021
`TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) ghstack-source-id: 120446690 Pull Request resolved: #51165
ezyang
approved these changes
Jan 27, 2021
`TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) [ghstack-poisoned]
This was referenced Jan 27, 2021
… for Functions.cpp by over 15% on "[PyTorch] Pass TensorOptions by value" `TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) [ghstack-poisoned]
swolchok
added a commit
that referenced
this pull request
Jan 28, 2021
Pull Request resolved: #51165 `TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. ghstack-source-id: 120552359 Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/)
This was referenced Jan 28, 2021
`TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) [ghstack-poisoned]
This was referenced Jan 29, 2021
`TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. Differential Revision: [D25983863](https://our.internmc.facebook.com/intern/diff/D25983863/) [ghstack-poisoned]
Contributor
|
This pull request has been merged in 4495b49. |
laurentdupin
pushed a commit
to laurentdupin/pytorch
that referenced
this pull request
Apr 24, 2026
Summary: Pull Request resolved: pytorch#51165 `TensorOptions` does not have a non-trivial copy, move, or destroy operation and is small enough to fit in a register, so it seems like we should pass it by value. ghstack-source-id: 120697498 Test Plan: Measured timing for empty framework overhead benchmark before & after this change: Before: ``` I0126 16:02:50.662864 2137574 bench.cpp:139] Mean 0.268645 I0126 16:02:50.662891 2137574 bench.cpp:140] Median 0.267485 I0126 16:02:50.662896 2137574 bench.cpp:141] Min 0.266485 I0126 16:02:50.662901 2137574 bench.cpp:142] stddev 0.00219359 I0126 16:02:50.662915 2137574 bench.cpp:143] stddev / mean 0.00816537 2,968.37 msec task-clock # 0.997 CPUs utilized ( +- 0.03% ) 250 context-switches # 0.084 K/sec ( +- 2.21% ) 1 cpu-migrations # 0.000 K/sec 11,403 page-faults # 0.004 M/sec ( +- 0.28% ) 5,898,481,882 cycles # 1.987 GHz ( +- 0.03% ) (50.05%) 16,169,242,938 instructions # 2.74 insn per cycle ( +- 0.03% ) (50.06%) 3,076,546,626 branches # 1036.443 M/sec ( +- 0.05% ) (50.05%) 2,531,859 branch-misses # 0.08% of all branches ( +- 0.89% ) (50.03%) ``` After: ``` I0126 16:23:20.010062 2244624 bench.cpp:139] Mean 0.266814 I0126 16:23:20.010092 2244624 bench.cpp:140] Median 0.265759 I0126 16:23:20.010099 2244624 bench.cpp:141] Min 0.260291 I0126 16:23:20.010107 2244624 bench.cpp:142] stddev 0.00548279 I0126 16:23:20.010118 2244624 bench.cpp:143] stddev / mean 0.0205491 2,983.75 msec task-clock # 0.995 CPUs utilized ( +- 0.36% ) 243 context-switches # 0.082 K/sec ( +- 1.26% ) 1 cpu-migrations # 0.000 K/sec 11,422 page-faults # 0.004 M/sec ( +- 0.18% ) 5,928,639,486 cycles # 1.987 GHz ( +- 0.36% ) (50.02%) 16,105,928,210 instructions # 2.72 insn per cycle ( +- 0.05% ) (50.02%) 3,150,273,453 branches # 1055.809 M/sec ( +- 0.03% ) (50.05%) 3,713,617 branch-misses # 0.12% of all branches ( +- 0.83% ) (50.07%) ``` It looked close to neutral, so I used `perf stat` to confirm it's about a 1% instruction count win. For deciding whether this stack is worth it, I went back and ran `perf stat` on the baseline diff before I started touching the dispatcher: ``` 2,968.37 msec task-clock # 0.997 CPUs utilized ( +- 0.03% ) 250 context-switches # 0.084 K/sec ( +- 2.21% ) 1 cpu-migrations # 0.000 K/sec 11,403 page-faults # 0.004 M/sec ( +- 0.28% ) 5,898,481,882 cycles # 1.987 GHz ( +- 0.03% ) (50.05%) 16,169,242,938 instructions # 2.74 insn per cycle ( +- 0.03% ) (50.06%) 3,076,546,626 branches # 1036.443 M/sec ( +- 0.05% ) (50.05%) 2,531,859 branch-misses # 0.08% of all branches ( +- 0.89% ) (50.03%) ``` If I've done the arithmetic correctly, we have an 0.39% instruction count win. Reviewed By: ezyang Differential Revision: D25983863 fbshipit-source-id: 87d1451a01ead25738ea6b80db270d344bc583b2
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Stack from ghstack:
TensorOptionsdoes not have a non-trivial copy, move, ordestroy operation and is small enough to fit in a register, so it
seems like we should pass it by value.
Differential Revision: D25983863