Race conditions when updating PlanningScene: fixup #716 #728
rhaschke wants to merge 8 commits into indigo-devel
Conversation
Moved validatePlan() from MoveGroupInterface (client side) to MoveGroup's ExecutionServiceCapability (server side). This checks validity of the plan no matter which client called. Placing it in MoveGroupContext will allow re-use by other capabilities.
The primary objective of syncSceneUpdates() is to receive a recent robot state. If a current state monitor is active, that is all we need to monitor. If not, scene updates are monitored instead, in particular the timestamp of their RobotState member.
- new CurrentStateMonitor::waitForCurrentState(ros::Time)
- simply wait for state update to reach scene
- fixed validatePlan(): use new CSM::waitForCurrentState()
If the robot doesn't move, an update in the CSM is not forwarded to the planning scene. Hence, we always wait until the timeout for a recent last_robot_motion_time_. If we ensure that an update triggered by the CSM is directly passed to the scene, we can return true early. This partially reverts commit 907980a392d352f5e2ebb6a0b6906c85d9c7d72c.
I think the integration tests make this PR way too big - they should be separated out, as discussed on the call.
new syncSceneUpdates() indeed also waits for all pending scene updates to be processed
Review comment on the diff:

```cpp
#define MOVEIT_MOVE_GROUP_CONTEXT_

#include <moveit/macros/class_forward.h>
#include <trajectory_msgs/JointTrajectory.h>
```

please add the new dependency to the package.xml
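For illustration, the requested change would look roughly like this in the package's package.xml. This is a sketch assuming package format 1 (common in the indigo era); with format 2 the tags would be `<depend>` or `<exec_depend>` instead:

```xml
<!-- package.xml: declare the dependency pulled in by the new
     #include <trajectory_msgs/JointTrajectory.h> -->
<build_depend>trajectory_msgs</build_depend>
<run_depend>trajectory_msgs</run_depend>
```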
I'm afraid that I cannot handle the comments before next week. Please be patient.

Hopefully before August 5th...
@davetcoleman, @130s, @mikeferguson / whoever feels responsible: this is a bit critical, so please read and comment. @davetcoleman, this is not going to happen before Friday. He will be really pissed because of that, but at the moment I'm considering a revert of the whole list of commits for #442 up to now. Maybe that's what we should have done directly after the unreviewed merge... Overall, this is no state in which we should let people work on "minor" issues, and the whole "fix the start state" patch is clearly not ready for a release. I therefore propose to revert these change-sets and try to add only the part that makes the ExecutionController check whether the trajectory it is about to execute starts somewhere around the current state. This should be possible without most of this code and without breakage.
Yes, after seeing #736 and the possibly related moveit/moveit#10, I have been worried about the state of the PlanningSceneMonitor. Considering it was never fully reviewed, I do not think we should feel guilty about the revert. +1

+1 for the revert, ideally before the merge. Unless it is shown that #736 is not an issue in general, or is independent from this, we cannot release.
Looks like AsyncSpinner isn't really usable as intended. As I can't work on the issue in the coming days, go for the revert as suggested by @v4hn.

@v4hn can you do the revert today, before tomorrow's merge?
I opened a request to revert the previous merge commits in #742. I'll close this request. We will have to look through the changes again anyway and rebase/reformat them for the merged repo.
@v4hn @dornhege @davetcoleman Obviously any use of an AsyncSpinner is affected. I will try to fix that issue upstream in ros_comm.
On Mon, Aug 15, 2016 at 12:32:20AM -0700, Robert Haschke wrote:

Thanks! This is clearly a bug in ros_comm, and the commit that introduced the global mutex there is at fault.
Welcome back @rhaschke, hope vacation was good! Thanks for continuing to work on this |
As for the other usage, I prepared ros/ros_comm#867 and hope that they will consider it in a timely fashion.
Wow, given the scope of your patch, @rhaschke, I'm pretty sure this will not be merged into indigo; not sure about kinetic. But we should probably fix the safety issue in indigo...
This continues #716 and #724 which were both merged already. I repeat here my comments to #724:
@v4hn Realizing the simplification we agreed on in the phone call turned out to be rather difficult:
- Simplifying stateUpdateTimerCallback() (removing the additional timestamp check) caused deadlocks. Hence, I reverted these changes.
- Moved validatePlan() from MoveGroupInterface (client side) to MoveGroup's ExecutionServiceCapability (server side) - as agreed.
- Simplification of syncSceneUpdates(): we definitely need both while loops and the timeout. When the robot doesn't move at all, we will never receive a scene update. The first loop is required to check for a recent robot state update directly in the CSM, while the second loop (in case there is no CSM) waits - with a timeout - for a direct scene update. There is no way to omit this timeout; the only chance would be to trigger all input channels of the PSM to send updates, which is not possible. However, in move_group this doesn't pose a problem, because there is a CSM instantiated.
- Calling syncSceneUpdates() after trajectory execution makes sense. Doing so initially led to failure of the test suggested in #442 ("Trajectory start doesn't match Current Position, when using plan/execute"); now I fixed it.
- CSM::waitForCurrentState() didn't do what I expected from the name...

Some more thoughts:
- syncSceneUpdates() should be renamed to waitForCurrentRobotState(), to avoid confusion with waitForCurrentState() in CurrentStateMonitor, which in turn should be renamed to waitForCompleteState(), as these methods don't look at recent timestamps.
- The new syncSceneUpdates() incorporates all pending scene updates in the callback queue. I agree with @v4hn that this doesn't guarantee that all scene updates up to a given timestamp are incorporated, because - due to network delays - ROS messages might be received with a delay. The only guarantee for a synchronous scene update is the new method applyPlanningScene().
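The two-loop wait discussed in this thread (consult the CSM directly if one exists, otherwise fall back to a timed wait on scene updates) could be sketched, stripped of ROS, as follows. `WaitContext`, its members, and the polling approach are illustrative stand-ins, not the actual PSM/CSM internals:

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Illustrative sketch only; the real PlanningSceneMonitor uses different types.
using Clock = std::chrono::steady_clock;

struct WaitContext {
  // Yields the stamp of the newest robot state known to the CSM;
  // left empty when no CurrentStateMonitor is running.
  std::function<bool(Clock::time_point&)> csmLastState;
  // Yields the stamp of the last robot state received via scene updates.
  std::function<Clock::time_point()> lastSceneRobotState;
};

// Wait until a robot state at least as new as 'target' is available.
bool waitForCurrentRobotState(const WaitContext& ctx, Clock::time_point target,
                              std::chrono::milliseconds timeout) {
  const Clock::time_point deadline = Clock::now() + timeout;
  if (ctx.csmLastState) {
    // Loop 1: a CurrentStateMonitor is active; poll it directly and
    // return early as soon as it has seen a recent enough state.
    Clock::time_point stamp;
    while (Clock::now() < deadline) {
      if (ctx.csmLastState(stamp) && stamp >= target)
        return true;
      std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return false;
  }
  // Loop 2: no CSM; fall back to scene updates. If the robot doesn't move,
  // no update will ever arrive, so this loop must run into the timeout.
  while (Clock::now() < deadline) {
    if (ctx.lastSceneRobotState() >= target)
      return true;
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  return false;
}
```

This mirrors the argument above: only the CSM path can return early, while the scene-update path has no way to distinguish "robot not moving" from "update not yet delivered" and therefore must honor the full timeout.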