Fix Constraint Planning Segfault by MarqRazz · Pull Request #2130 · moveit/moveit2

MarqRazz · 2023-04-21T20:11:54Z

Description

This PR is an attempt to understand and hopefully fix a segfault I have been running into when asking MoveIt to solve constrained motion as reported in #2103.

While debugging we have found that the ompl::base::PlannerPtr generated on the first planning request is going out of scope causing the next planning request to fault the program. As you can see in my initial commit I have removed the ability for the planning allocator to reuse the planner between requests which fixes the segfault. This is not ideal if the user is attempting to build a roadmap database because it would only contain the map of a single plan if written to disk, or thrown away on the next request. The planner is created on this line by passing it the SpaceInformationPtr that is created here. If I'm understanding OMPL correctly this SpaceInformationPtr is being moved to the Planner object here and the ConfiguredPlannerAllocator containing it should be kept alive/in context by the known_planners_ object of the Planning Context Manager.

I would really appreciate any help in identifying why the pointer, or objects inside of it, is going out of scope of if my understanding of the program is incorrect! As I have been learning this section of the software I have also wondered if the current implementation is worth fixing / there is a better way to preserve the planning scene between motion requests. Because the planner is constructed with the initial planning request goal's path_constraint does this mean that if it is reused it could have invalid constraints specified if they change between requests?

codecov · 2023-04-21T20:31:42Z

Codecov Report

Patch coverage has no change and project coverage change: +0.01 🎉

Comparison is base (237e918) 50.48% compared to head (0f852cf) 50.48%.

❗ Current head 0f852cf differs from pull request most recent head a1a5925. Consider uploading reports for the commit a1a5925 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2130      +/-   ##
==========================================
+ Coverage   50.48%   50.48%   +0.01%     
==========================================
  Files         387      387              
  Lines       31785    31790       +5     
==========================================
+ Hits        16044    16047       +3     
- Misses      15741    15743       +2

Impacted Files	Coverage Δ
...pl/ompl_interface/src/planning_context_manager.cpp	`64.60% <0.00%> (-1.28%)`	⬇️

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

AndyZe · 2023-04-25T14:18:22Z

moveit_planners/ompl/ompl_interface/src/planning_context_manager.cpp

+    // auto planner_map_it = planners_.find(new_name);
+    // if (planner_map_it != planners_.end())
+    // {
+    //   return planner_map_it->second;


Is ->second a planning scene? I wonder if it just needs to be locked

There's an example of locking here: https://github.com/ros-planning/moveit2/blob/f432978571816eecdef8cc24e55260661ac049cd/moveit_ros/planning/planning_scene_monitor/include/moveit/planning_scene_monitor/planning_scene_monitor.h#L634

->second is a ompl::base::PlannerPtr that is created on this line by passing it the SpaceInformationPtr that is created here and from my understanding does not include the planning_scene. It does use a robot_model which I think comes from that locked scene so that could be the issue!

We do create a locked planning_scene on each planning request here but it gets unlocked after the goal succeeds.

henningkayser · 2023-04-26T16:35:20Z

Please see my comment here #2103 (comment)

mamoll · 2023-05-04T21:43:38Z

If this fixes the crash, then this seems like a useful workaround that should be merged. Recreating a planner instance from planner data adds a bit of overhead, but shouldn't be too bad. I'd add a comment along these lines so that we don't forget why this is done:

// FIXME: make reusing PlannerPtr not crash, so that we don't have to reconstruct a PlannerPtr instance

henningkayser · 2023-05-08T13:15:14Z

moveit_planners/ompl/ompl_interface/src/planning_context_manager.cpp

    {
-      return planner_map_it->second;
+      ob::PlannerData data(si);
+      planner_map_it->second->getPlannerData(data);


I'm confused now. How can we access the planner data if the planner instance is invalid? Or are there actually any members inside the planner instance that go out of scope? This version looks like it should produce the same segfaults

If you look at the stack traces I uploaded to the issue it faults using the planning_context here which contains the planner and a bunch of other stuff. Some of its objects go out of scope the but the planner data does not.

planning_context_->getOMPLStateSpace()->copyToRobotState(*robot_state, wrapped_state);

henningkayser

I'm fine with merging this as a temporary solution. The overhead of creating a new planner instance is really not too much of an issue, but we should keep on looking into the underlying cause anyway.

MarqRazz · 2023-05-08T16:06:34Z

Just so everyone knows, I have done some profiling and re-constructing the planner is taking a little under 0.2 seconds which is not ideal for application that are going for fast cycle times!

ompl::geometric::LazyPRM::LazyPRM(const base::PlannerData &data, bool starStrategy)
*** Profiling statistics. Total counted time : 0.181746 seconds
Blocks of time:
lazyPRMConstructor: 0.181721s (99.9865%), [0.181721s --> 0.181721 s], 1 parts, 0.181721 s on average
Unaccounted time : 2.4583e-05 (0.013526 %)

I have also found that re-using the planning graph does not speed up LazyPRM. If you give it the exact same problem to solve over and over again the graph will increase in size with each attempt but the planning time always uses the max allowed set by the user. This could be an error with my configuration so I will update this PR with more details if I can get it working better.

sjahr

Let's merge this fix and open a follow-up issue 👍

sjahr · 2023-05-10T11:25:21Z

moveit_planners/ompl/ompl_interface/src/planning_context_manager.cpp

  {
-    // If we already have an instance, use that one
+    // If we already have an instance, reuse it's planning data
+    // FIXME: make reusing PlannerPtr not crash, so that we don't have to reconstruct a PlannerPtr instance


Can you open a follow-up issue for this? With a summery of you findings and (if possible) a path forward so somebody else can take this up in the future?

sjahr

Let's merge this fix and open a follow-up issue 👍

MarqRazz · 2023-05-11T15:00:07Z

@sjahr can we back port this to Humble? I don't have permissions to ask mergify.

* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai> (cherry picked from commit 92a7951)

* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai> (cherry picked from commit 92a7951) Co-authored-by: Marq Rasmussen <marq.razz@gmail.com>

* Fix Constraint Planning Segfault * Reuse planner data * apply clang formatting * apply clang formatting round 2 * add FIXME note and verbose output of planning graph size --------- Co-authored-by: Sebastian Jahr <sebastian.jahr@picknik.ai>

Fix Constraint Planning Segfault

b19e25a

sjahr self-assigned this Apr 24, 2023

henningkayser self-requested a review April 25, 2023 14:15

AndyZe reviewed Apr 25, 2023

View reviewed changes

MarqRazz mentioned this pull request Apr 26, 2023

Constraint Planning Segfault #2103

Open

MarqRazz added 3 commits May 4, 2023 12:14

Reuse planner data

98f6a59

apply clang formatting

fdea4b3

apply clang formatting round 2

6392379

add FIXME note and verbose output of planning graph size

30c25ce

henningkayser reviewed May 8, 2023

View reviewed changes

henningkayser approved these changes May 8, 2023

View reviewed changes

sjahr approved these changes May 10, 2023

View reviewed changes

Merge branch 'main' into pr-fix_constraint_segfault

0f852cf

MarqRazz mentioned this pull request May 10, 2023

Enable re-using planner to speed up loading when multi_query_planning_enabled is true #2169

Closed

Merge branch 'main' into pr-fix_constraint_segfault

a1a5925

sjahr enabled auto-merge (squash) May 11, 2023 07:38

sjahr merged commit 92a7951 into moveit:main May 11, 2023

MarqRazz deleted the pr-fix_constraint_segfault branch May 11, 2023 14:38

sjahr added the backport-humble Mergify label that triggers a PR backport to Humble label May 12, 2023

mergify bot mentioned this pull request May 12, 2023

Fix Constraint Planning Segfault (backport #2130) #2173

Merged

Conversation

MarqRazz commented Apr 21, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Uh oh!

codecov bot commented Apr 21, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

AndyZe Apr 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MarqRazz Apr 25, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

henningkayser commented Apr 26, 2023

Uh oh!

mamoll commented May 4, 2023

Uh oh!

henningkayser May 8, 2023

Choose a reason for hiding this comment

Uh oh!

MarqRazz May 8, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

henningkayser left a comment

Choose a reason for hiding this comment

Uh oh!

MarqRazz commented May 8, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sjahr left a comment

Choose a reason for hiding this comment

Uh oh!

sjahr May 10, 2023

Choose a reason for hiding this comment

Uh oh!

sjahr left a comment

Choose a reason for hiding this comment

Uh oh!

MarqRazz commented May 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

MarqRazz commented Apr 21, 2023 •

edited

Loading

codecov bot commented Apr 21, 2023 •

edited

Loading

AndyZe Apr 25, 2023 •

edited

Loading

MarqRazz Apr 25, 2023 •

edited

Loading

MarqRazz May 8, 2023 •

edited

Loading

MarqRazz commented May 8, 2023 •

edited

Loading

MarqRazz commented May 11, 2023 •

edited

Loading