Conversation
Codecov ReportPatch coverage has no change and project coverage change:
Additional details and impacted files@@ Coverage Diff @@
## main #2130 +/- ##
==========================================
+ Coverage 50.48% 50.48% +0.01%
==========================================
Files 387 387
Lines 31785 31790 +5
==========================================
+ Hits 16044 16047 +3
- Misses 15741 15743 +2
☔ View full report in Codecov by Sentry. |
| // auto planner_map_it = planners_.find(new_name); | ||
| // if (planner_map_it != planners_.end()) | ||
| // { | ||
| // return planner_map_it->second; |
There was a problem hiding this comment.
Is ->second a planning scene? I wonder if it just needs to be locked
There's an example of locking here: https://github.com/ros-planning/moveit2/blob/f432978571816eecdef8cc24e55260661ac049cd/moveit_ros/planning/planning_scene_monitor/include/moveit/planning_scene_monitor/planning_scene_monitor.h#L634
There was a problem hiding this comment.
->second is a ompl::base::PlannerPtr that is created on this line by passing it the SpaceInformationPtr that is created here and from my understanding does not include the planning_scene. It does use a robot_model which I think comes from that locked scene so that could be the issue!
We do create a locked planning_scene on each planning request here but it gets unlocked after the goal succeeds.
|
Please see my comment here #2103 (comment) |
|
If this fixes the crash, then this seems like a useful workaround that should be merged. Recreating a planner instance from planner data adds a bit of overhead, but shouldn't be too bad. I'd add a comment along these lines so that we don't forget why this is done: // FIXME: make reusing PlannerPtr not crash, so that we don't have to reconstruct a PlannerPtr instance |
| { | ||
| return planner_map_it->second; | ||
| ob::PlannerData data(si); | ||
| planner_map_it->second->getPlannerData(data); |
There was a problem hiding this comment.
I'm confused now. How can we access the planner data if the planner instance is invalid? Or are there actually any members inside the planner instance that go out of scope? This version looks like it should produce the same segfaults
There was a problem hiding this comment.
If you look at the stack traces I uploaded to the issue it faults using the planning_context here which contains the planner and a bunch of other stuff. Some of its objects go out of scope the but the planner data does not.
planning_context_->getOMPLStateSpace()->copyToRobotState(*robot_state, wrapped_state);
henningkayser
left a comment
There was a problem hiding this comment.
I'm fine with merging this as a temporary solution. The overhead of creating a new planner instance is really not too much of an issue, but we should keep on looking into the underlying cause anyway.
|
Just so everyone knows, I have done some profiling and re-constructing the planner is taking a little under 0.2 seconds which is not ideal for application that are going for fast cycle times! I have also found that re-using the planning graph does not speed up LazyPRM. If you give it the exact same problem to solve over and over again the graph will increase in size with each attempt but the planning time always uses the max allowed set by the user. This could be an error with my configuration so I will update this PR with more details if I can get it working better. |
sjahr
left a comment
There was a problem hiding this comment.
Let's merge this fix and open a follow-up issue 👍
| { | ||
| // If we already have an instance, use that one | ||
| // If we already have an instance, reuse it's planning data | ||
| // FIXME: make reusing PlannerPtr not crash, so that we don't have to reconstruct a PlannerPtr instance |
There was a problem hiding this comment.
Can you open a follow-up issue for this? With a summery of you findings and (if possible) a path forward so somebody else can take this up in the future?
sjahr
left a comment
There was a problem hiding this comment.
Let's merge this fix and open a follow-up issue 👍
|
@sjahr can we back port this to Humble? I don't have permissions to ask mergify. |
* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai> (cherry picked from commit 92a7951)
* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai> (cherry picked from commit 92a7951) Co-authored-by: Marq Rasmussen <marq.razz@gmail.com>
* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai>
Description
This PR is an attempt to understand and hopefully fix a segfault I have been running into when asking MoveIt to solve constrained motion as reported in #2103.
While debugging we have found that the
ompl::base::PlannerPtrgenerated on the first planning request is going out of scope causing the next planning request to fault the program. As you can see in my initial commit I have removed the ability for the planning allocator to reuse the planner between requests which fixes the segfault. This is not ideal if the user is attempting to build a roadmap database because it would only contain the map of a single plan if written to disk, or thrown away on the next request. The planner is created on this line by passing it theSpaceInformationPtrthat is created here. If I'm understanding OMPL correctly thisSpaceInformationPtris being moved to the Planner object here and theConfiguredPlannerAllocatorcontaining it should be kept alive/in context by the known_planners_ object of the Planning Context Manager.I would really appreciate any help in identifying why the pointer, or objects inside of it, is going out of scope of if my understanding of the program is incorrect! As I have been learning this section of the software I have also wondered if the current implementation is worth fixing / there is a better way to preserve the planning scene between motion requests. Because the planner is constructed with the initial planning request goal's path_constraint does this mean that if it is reused it could have invalid constraints specified if they change between requests?