PSM: Fix handling of flag state_update_pending_#3682
Conversation
|
@rr-mark: Could you please test and review as well? |
|
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #3682 +/- ##
===========================================
+ Coverage 0.00% 47.46% +47.46%
===========================================
Files 600 604 +4
Lines 59271 61629 +2358
Branches 7031 7034 +3
===========================================
+ Hits 0 29247 +29247
+ Misses 59271 31964 -27307
- Partials 0 418 +418 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
I looked at this for quite a while now and am still not entirely convinced it works as it should with this patch (nor did I analyze the previous code in detail).
Key insights:
onStateUpdate, all direct ROS callbacks the CSM and PSM use, andstateUpdateTimerCallbackare all served from a single threadedAsyncSpinner. Thus they are effectively serialized and don't need a mutex at all.- the offenders which might be called from other threads are
setStateUpdateFrequency- it is not used anywhere to the best of my knowledge and is a bad concept from my perspective - as one could just require stopping the monitor to change the rate
- but it is a public API
- this requires protecting
dt_state_update_
waitForCurrentState- which users likely regularly, but not excessively, call
- this can call (the equally public API)
updateSceneWithCurrentStatefrom outside the spinner and can thus affectlast_robot_state_update_wall_time_, requiring the mutex.
moveit_ros/planning/planning_scene_monitor/src/planning_scene_monitor.cpp
Outdated
Show resolved
Hide resolved
moveit_ros/planning/planning_scene_monitor/src/planning_scene_monitor.cpp
Outdated
Show resolved
Hide resolved
moveit_ros/planning/planning_scene_monitor/src/planning_scene_monitor.cpp
Outdated
Show resolved
Hide resolved
moveit_ros/planning/planning_scene_monitor/src/planning_scene_monitor.cpp
Outdated
Show resolved
Hide resolved
...anning/planning_scene_monitor/include/moveit/planning_scene_monitor/planning_scene_monitor.h
Outdated
Show resolved
Hide resolved
Can do. I'm busy this morning, but should get to moveit things this afternoon (UK time). Should I wait for this PR to go through before porting my PR to moveit2, or do you think it makes more sense to port them separately? |
@rr-mark Yes, you should wait for final approval / merging of this PR before tackling the ROS2 port. |
There was a problem hiding this comment.
Looks good to me, and seems to work on our system in the brief testing I've done. When it's merged I'll put it through our stress tests, which will hopefully throw up any problems in real usage.
My only suggestion (likely for future work rather than this PR) is it might be helpful to factor out all the checks to when updateSceneWithCurrentState() should be called (checking state_update_pending_ and/or now() - last_update_ >= dt_state_update_) into a single method, so it's obvious why which checks are happening when.
|
I will fix the formatting issue by upgrading to clang-format-12 or later after my current meeting. |
e7d6eab to
d91eb77
Compare
... in onStateUpdate() - where it is easily possible. Reading it in stateUpdateTimerCallback() is unprotected. However, missing an update or performing an extra one, because the variable is corrupted, is not a big deal.
Co-authored-by: Michael Görner <me@v4hn.de>
While there are a few other locks except explicit user locks (getPlanningSceneServiceCallback(), collisionObjectCallback(), attachObjectCallback(), newPlanningSceneCallback(), and scenePublishingThread()), these occur rather seldom (scenePublishingThread() publishes at 2Hz). Hence, we might indeed balance a non-blocking CSM vs. missed PS updates in favour of CSM. Co-authored-by: Michael Görner <me@v4hn.de>
The timer callback and CSM's state update callbacks are served from the same callback queue, which would block CSM again. Co-authored-by: Michael Görner <me@v4hn.de>
As @rhaschke suggested, reading dt_state_update_ and last_robot_state_update_wall_time_ does not lead to logic errors, but at most to a skipped or redundant update on corrupted data. Alternatively we could be on the safe side and turn both variables into std::atomic, but that would effectively mean locks on every read. Instead, only set state_update_pending_ as an atomic, which is lockfree in this case.
d91eb77 to
2aaa516
Compare
|
@v4hn Can you give this a final review, please? |
|
Thanks for these fixes. I'll look into porting this PR and my PR to moveit2 next week. |
It's looking like this will happen next week now. |
* Move update of state_update_pending_ to updateSceneWithCurrentState() * Revert to try_lock While there are a few other locks except explicit user locks (getPlanningSceneServiceCallback(), collisionObjectCallback(), attachObjectCallback(), newPlanningSceneCallback(), and scenePublishingThread()), these occur rather seldom (scenePublishingThread() publishes at 2Hz). Hence, we might indeed balance a non-blocking CSM vs. missed PS updates in favour of CSM. * Don't block for scene update from stateUpdateTimerCallback too The timer callback and CSM's state update callbacks are served from the same callback queue, which would block CSM again. * further locking adaptations reading dt_state_update_ and last_robot_state_update_wall_time_ does not lead to logic errors, but at most to a skipped or redundant update on corrupted data. Alternatively we could be on the safe side and turn both variables into std::atomic, but that would effectively mean locks on every read. Instead, only set state_update_pending_ as an atomic, which is lockfree in this case. Co-authored-by: Michael Görner <me@v4hn.de>
Follow up on #3676, implementing @v4hn's suggestion to update
state_update_pending_andlast_robot_state_update_wall_time_only within updateSceneWithCurrentState() - where the update is actually performed.This nicely simplifies all the code 🎉