-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-16280: [C++] Avoid copying shared_ptr in Expression::type() #12957
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note there's some lint failures (though, the git version means a lot of pipelines are broken) |
oops you're right. should be fixed now |
|
CI should pass if you rebase here |
9c10b3f to
9e3a56d
Compare
9e3a56d to
de6c654
Compare
|
Thanks for catching this! |
|
Benchmark runs are scheduled for baseline = e1e782a and contender = e3b84b0. e3b84b0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
|
['Python', 'R'] benchmarks have high level of regressions. |
|
looks like the "high level of regressions" are a temporary glich |
|
Now a few more days have passed and a longer timeline is visible, it doesn't seem to be a temporary change: https://conbench.ursa.dev/compare/runs/8b8a13ad985c4adfbf68518fb9fca07d...d0967da307ff4694b471d4f131c5e2b8/ |
|
Interesting. It seems like this commit (which seems unrelated to me) made the graph temporarily go back to normal, making it look like a glitch. This is probably what I saw when writing the above comment. For example here: I guess I'll try if I can reproduce any of that |
|
I now run ran one of the most-affected benchmarks - tpch, arrow, parquet, memory_map=False, TPCH-15, scale_factor=1, R - and saw no such difference on my machine. (provided R picked up the changed C++ library correctly, I'm not very familar with it) Going trough the benchmark runner results a bit more I noticed that all the extreme permanent changes are R TCPH benchmarks. For example this and this are the worst affected Python and JS benchmarks. Also some TCPH benchmarks see an extreme improvement. Overall this gives me the impression the performance change may be unrelated to this commit. I wonder if it is possible to run the baseline again on conbench to isolate this? |
…rove performance (#13222) According to conbench there was a slight regression on #12957 . Poking around a bit it seems that a static local variable is implemented using some kind of global lock (__cxa_guard_acquire / __cxa_guard_release). On the other hand, constructing an empty shared_ptr is cheap (two pointers are set to 0). So if we care about performance here we probably don't want `static`. This may be what is causing the conbench issue. Lead-authored-by: Weston Pace <weston.pace@gmail.com> Co-authored-by: Antoine Pitrou <antoine@python.org> Signed-off-by: Yibo Cai <yibo.cai@arm.com>
I see big improvements in the
ExecuteScalarExpressionOverheadbenchmarks, especiallyref_only_expressionwith this, for example:Also the overhead of small batch size/multithreaded benchmarks is reduced, for example in
complex_expression: