
Improve robot state update #3548

Closed
rhaschke wants to merge 2 commits into moveit:master from ubi-agni:improve-robot-state-update

Conversation

@rhaschke
Contributor

Inspired by the results of moveit/moveit2#2628, I changed the matrix multiplication in RobotState updates from affine*matrix to matrix*matrix.

Surprisingly, new benchmarks show that this is faster (it needs more operations, but perhaps vectorizes better?):

benchmark                        | old      | new      | speed up
---------------------------------|----------|----------|----------
RobotStateBenchmark/update/10    | 0.007 ms | 0.006 ms | +16.6 %
RobotStateBenchmark/update/100   | 0.068 ms | 0.066 ms | +3.0 %
RobotStateBenchmark/update/1000  | 0.710 ms | 0.668 ms | +6.2 %
RobotStateBenchmark/update/10000 | 10.4 ms  | 9.28 ms  | +12.0 %

@codecov

codecov bot commented Dec 21, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (286f6b3) 62.17% compared to head (e47547a) 62.14%.

Files                                            | Patch % | Lines
-------------------------------------------------|---------|--------------
moveit_core/robot_state/src/robot_state.cpp      | 80.00%  | 2 Missing ⚠️
moveit_core/utils/src/robot_model_test_utils.cpp | 66.67%  | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3548      +/-   ##
==========================================
- Coverage   62.17%   62.14%   -0.03%     
==========================================
  Files         385      385              
  Lines       34138    34139       +1     
==========================================
- Hits        21222    21211      -11     
- Misses      12916    12928      +12     


New benchmarks show that this is faster surprisingly
(needs more operations, but can be vectorized better?)

benchmark                               | old      | new
----------------------------------------|----------|----------
RobotStateBenchmark/update/10           | 0.007 ms | 0.006 ms
RobotStateBenchmark/update/100          | 0.068 ms | 0.066 ms
RobotStateBenchmark/update/1000         | 0.710 ms | 0.668 ms
RobotStateBenchmark/update/10000        | 10.4 ms  | 9.28 ms
Contributor

@marioprats left a comment


Good to know!
We should apply this to the ros2 branch as well!

@v4hn
Contributor

v4hn commented Jan 19, 2024

Currently I cannot reproduce these improvements consistently on my i5 laptop. If they are there, they drown in noise.

  • I built 838ec18 and 0d11ec9 respectively in release workspaces
  • ran both benchmarks via
$ ./devel/lib/moveit_core/robot_state_benchmark --benchmark_out=/tmp/robot_state_benchmark/matrix_{matrix|affine}.json --benchmark_filter=.*/update/.* --benchmark_repetitions=25

and see these analysis results on two different comparison runs (leaving out tests with pvalue > 0.05):

$ /usr/share/benchmark/compare.py benchmarks matrix_affine.json matrix_matrix.json | grep '_median\|pvalue'

Comparing matrix_affine.json to matrix_matrix.json
Benchmark                                                  Time             CPU      Time Old      Time New       CPU Old       CPU New
---------------------------------------------------------------------------------------------------------------------------------------
RobotStateBenchmark/update/10_pvalue                    0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10_median                   +0.0690         +0.0690             0             0             0             0
RobotStateBenchmark/update/100_pvalue                   0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/100_median                  +0.0707         +0.0702             0             0             0             0
RobotStateBenchmark/update/1000_pvalue                  0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/1000_median                 +0.0491         +0.0491             1             1             1             1
RobotStateBenchmark/update/10000_pvalue                 0.0008          0.0008      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10000_median                -0.0125         -0.0125             9             9             9             9
RobotStateBenchmark/update/10_pvalue                     0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10_median                    +0.0708         +0.0707             0             0             0             0
-
RobotStateBenchmark/update/1000_pvalue                   0.0003          0.0003      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/1000_median                  -0.0091         -0.0092             1             1             1             1
-
RobotStateBenchmark/update/100000_pvalue                 0.0066          0.0070      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/100000_median                -0.0041         -0.0042           124           124           124           124

Any comment on what I might do to get your results @rhaschke ?

I very much like the google benchmark transition, but I don't think littering the code base with .matrix() is reasonable when the gain is so tiny and more strongly affected by other dynamics.

@rhaschke
Contributor Author

I cannot find those commit hashes, so I'm not sure what you compared. Can you please clarify?
I agree that we should adapt the code only if there is a significant performance improvement.

@rhaschke
Contributor Author

littering the code base with .matrix()

Note that I essentially replaced affine() with matrix(). So, no extra littering 😉

@v4hn
Contributor

v4hn commented Jan 24, 2024

I built 838ec18 and 0d11ec9 respectively in release workspaces

I cannot find those commit hashes, so I'm not sure what you compared. Can you please clarify?
I agree that we should adapt the code only if there is a significant performance improvement.

Obviously you couldn't find the hashes. I tried to be complete but forgot I rebased your branch locally... 🤦
I refer to the last two commits here: https://github.com/v4hn/moveit/commits/improve-robot-state-update

@rhaschke
Contributor Author

rhaschke commented Feb 6, 2024

Ok. I did the very same as you did: i) running the benchmarks on both commits and saving the results as .json, ii) comparing the results. On my i7 I get significant improvements throughout:

$ /usr/share/benchmark/compare.py benchmarks matrix_affine.json matrix_matrix.json | grep '_median\|pvalue'
RobotStateBenchmark/update/10_pvalue                    0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10_median                   -0.6024         -0.6021             1             0             1             0
RobotStateBenchmark/update/100_pvalue                   0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/100_median                  -0.6030         -0.6028             9             3             9             3
RobotStateBenchmark/update/1000_pvalue                  0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/1000_median                 -0.6019         -0.6017            87            35            87            35
RobotStateBenchmark/update/10000_pvalue                 0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10000_median                -0.6011         -0.6011           912           364           912           364

Maybe this is an i5 vs. i7 optimization thing? On your i5, the differences are not significant most of the time.
Maybe my i7 has better optimization for 4x4 matrix assignment compared to the 3x4-into-4x4 assignment that .affine() requires.

@rhaschke
Contributor Author

rhaschke commented Feb 6, 2024

I have to correct myself: initially I had built with RelWithDebInfo. Repeating everything with a Release build, I get completely different numbers (a 1-2 % slowdown):

$ /usr/share/benchmark/compare.py benchmarks matrix_affine.json matrix_matrix.json | grep '_median\|pvalue'
RobotStateBenchmark/update/10_pvalue                    0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10_median                   +0.0196         +0.0198             0             0             0             0
RobotStateBenchmark/update/100_pvalue                   0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/100_median                  +0.0142         +0.0143             0             0             0             0
RobotStateBenchmark/update/1000_pvalue                  0.0000          0.0000      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/1000_median                 +0.0181         +0.0183             1             1             1             1
RobotStateBenchmark/update/10000_pvalue                 0.0006          0.0006      U Test, Repetitions: 25 vs 25
RobotStateBenchmark/update/10000_median                -0.0104         -0.0104            10            10            10            10

I have no idea where this strong difference comes from. Obviously, the compiler optimizes very differently in the two build types.

@rhaschke
Contributor Author

rhaschke commented Feb 6, 2024

Given the last result, I will close this PR.

@rhaschke rhaschke closed this Feb 6, 2024
@rhaschke rhaschke deleted the improve-robot-state-update branch February 6, 2024 18:15
@rhaschke rhaschke restored the improve-robot-state-update branch February 6, 2024 18:17