Skip to content

FIx wrong path segment selection in RPP and DWB#2790

Closed
sathak93 wants to merge 11 commits intoros-navigation:mainfrom
sathak93:path_segment_pr
Closed

FIx wrong path segment selection in RPP and DWB#2790
sathak93 wants to merge 11 commits intoros-navigation:mainfrom
sathak93:path_segment_pr

Conversation

@sathak93
Copy link
Copy Markdown
Contributor


Basic Info

Info Please fill out this column
Ticket(s) this addresses #2622
Primary OS tested on (Ubuntu)
Robotic platform tested on gazebo simulation turtlebot)

Description of contribution in a few bullet points

Description of documentation updates required from your changes


Future work that may be required in bullet points

  • isGoalreached() in goal checker should be improvised.
  • fix this bug in TEB.

For Maintainers:

  • Check that any new parameters added are updated in navigation.ros.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Jan 27, 2022

@sathak93, please properly fill in PR template in the future. @SteveMacenski, use this instead.

  • Check that any new parameters added are updated in navigation.ros.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@sathak93 sathak93 changed the title Path segment pr FIx wrong path segment selection in RPP and DWB Jan 27, 2022
@SteveMacenski
Copy link
Copy Markdown
Member

Should I take a look at this or wait until you work on the DWB part?

nav2_util::geometry_utils::first_element_beyond(
transformation_begin, global_plan_.poses.end(), transform_end_threshold);

for (auto it = transformation_begin + 1; it != transformation_end - 1; ++it) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite understanding why this was added. It looks like a dot product finding a directional change but not really a comment about why or what problem its solving

return euclidean_distance(robot_pose, ps);
});

// Find points definitely outside of the costmap so we won't transform them.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was the right implementation, since we want to find the first point that's outside of the statically set costmap size. This isn't an integral distance but a real-constraint distance dependent on the costmap size. This could potentially not prune quite to the end of the costmap (dropping necessary data for longer look-ahead points) for wavy paths. There's no real benefit to culling a few points more off the end of the rolling costmap window.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the transformed_plan is very large and has poses ahead and behind the robot as in overlapping case, the getlookahead point may select the point behind the robot making the robot leave some path and turns back immediately.
for this it is trimmed to a particular distance from transform begin using integrated distance.

if we need this to be remain unchanged, we can modify and get the lookahead point using integrated distance.

Copy link
Copy Markdown
Member

@SteveMacenski SteveMacenski Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK that makes sense, but then this value shouldn't be derivative of the costmap size as it currently is e.g.

  const double max_transform_dist = max_costmap_dim * costmap->getResolution() / 2.0;

This should probably now be a parameterized distance. max_path_dist seems like a good name.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that finding the closest point needs to have an upper bound, so that it doesn't select overlapping poses at a later part of the path. Also noting that we are implicitly relying on the lower bound given by pruning the global plan after receiving it, so we don't process the same segments twice.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @SteveMacenski that we shouldn't prune by lookahead distance: this part of the controller happens before the lookahead distance is computed, right?

I think that we should simply prune by the minimum of the cusp distance and the costmap size. That guarantees that you won't be able to get portions of the path that have looped around.

Copy link
Copy Markdown
Member

@SteveMacenski SteveMacenski Feb 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this happens before any of the lookahead logic is applied. This is just pruning the path forward to only consider things in a local window forward.

I don't think the lookahead_dist_ should be used here and could indeed create a problem. I think this should be a separate parameter or derived from another source.

@Aposhian I think that works. In L604-607 where we do the first_element_beyond I think the argument should change from lookahead_dist_ to the max rolling costmap size or cusp dist (and variable renamed appropriately) or a parameterizable distance. It would still have the same effect, just change the potential window to look for the nearest pose to the robot from point 0 -> costmap edge rather than point 0 -> lookahead distance which it may have passed from the last iteration if moving very fast.

double lookahead_dist = getLookAheadDistance(speed);

// Check for reverse driving
if (allow_reversing_) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove the allow_reversing function? That is necessary so that we only do this if its appropriate to be rotating in place due to a sudden change in pose, that shouldn't normally be turned on unless letting the robot move backwards is allowable by that parameterization

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SteveMacenski that is turned on always because this fn calculate dot product to find cusp and
https://github.com/ros-planning/navigation2/blob/5ca6c511120d4dd612d14fd944fca7eba5e19b83/nav2_regulated_pure_pursuit_controller/src/regulated_pure_pursuit_controller.cpp#L655
returns distance to cusp
if the lookahead dist is larger than cusp dist the robot will not reach the cusp.
like the goal pose, look ahead distance need to be progressively decreased when nearing the cusp.
still it won't affect the reversing behavior of the robot as this block https://github.com/ros-planning/navigation2/blob/5ca6c511120d4dd612d14fd944fca7eba5e19b83/nav2_regulated_pure_pursuit_controller/src/regulated_pure_pursuit_controller.cpp#L300
prevents the velocity to be negative in order to move backwards.

Copy link
Copy Markdown
Member

@SteveMacenski SteveMacenski Feb 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, makes sense thanks for the clarification. Lets add a comment there saying that this is used to obtain the cusp point for either path following or directional changes - mostly as a note to myself later when I re-read this

@sathak93
Copy link
Copy Markdown
Contributor Author

sathak93 commented Feb 2, 2022

This bug #2622 is due to either of

  1. min_by function used to pick transformation_begin pose - when the path overlaps, sometimes it picks the later poses than immediate ones which leads to deleting of poses. in this pr i changed the range from global plan begin to look_ahead point instead of global plan begin to global plan end so that the immediate ones will be always selected.

  2. find_if algorithm in getlookaheadpoint picks wrong pose if the transformed plan has some poses in front of the robot and some back to the robot. in this pr, look_ahead point in progressively reduced when nearing the cusp using the already defined function for reversing behavior. And the transformed_plan ending is selected using the integrated path distance instead of absolute distance

@sathak93
Copy link
Copy Markdown
Contributor Author

sathak93 commented Feb 2, 2022

find_if and min_by algorithm was the main cause of the this bug. this was not suitable to overlapping path case of navthrough poses behavior.
this is from my understanding of it. i am open to any improvement or suggestion of new algorithm

@SteveMacenski
Copy link
Copy Markdown
Member

@Aposhian please let us know your thoughts 😄

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

Here is the difference between the stock galactic controller, and building this PR for rolling. The green panels are the global plan (which is being updated by a custom planner), and the base_link TF can be seen cruising along, with the lookahead point being updated in red.
current-controller
n
When I try this PR, the robot approaches the initial lookahead point, and then stops.
new-controller

@SteveMacenski
Copy link
Copy Markdown
Member

I think this may relate to the fact you're giving it a hilariously small path, this is handling the looping for you so you shouldn't need to give it something so small. Though this is clearly an issue, are you sure that the controller is getting those path updates?

@sathak93 why would this be?

@sathak93
Copy link
Copy Markdown
Contributor Author

sathak93 commented Feb 8, 2022

@SteveMacenski i will check this pr for very small path. also i think it is best to validate RPP of this using main branch rolling. RPP in galactic branch and in rolling branch has many differences. using rolling we can focus problems related to this pr.

@SteveMacenski
Copy link
Copy Markdown
Member

I think @Aposhian did use rolling's version of RPP to see that issue. I think he was just highlighting the difference with a baseline (no?)

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

Yes, the short path that is being given has been our workaround to the bug that we are trying to fix here 😄, so that it something that we should hopefully be able to remove, but I don't think that this fix should cause a regression for short paths. I will test on nav2 main to compare as well, and will try removing our intermediary path segment selection to see how this fix addresses our problem.

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

Here is the performance against the main branch: it doesn't exhibit the stalling behavior that I am seeing with this PR.
nav2-main-controller

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

I have made some modifications on my own fork. But it has occured to me: if we implement an upper bound to prevent jumping to future points, then we are implicitly increasing our reliance on the lower bound provided by the global_plan pruning. What this means that if we make the change this way, it may break backward compatibility with some users' expectations by removing the controllers capability to "snap" to any point on the path, provided by the current behavior of just grabbing the closest pose to the robot.

"snapping" is fine and easy for paths that don't overlap, but with overlapping paths, you need to make sure that you always prioritize earlier portions of the path over later portions, and it may be hard to do that in a deterministic but also intuitive way.

There are a few ways I think we could solve this problem:

  • add a parameter to give users control on whether they want "snapping" or being able to do overlapping paths (nonideal)
    • this would essentially work by either starting the local plan at the first pose in the global plan (enforced lower bound) or using the current behavior of snapping to the absolute closest waypoint
  • instead of finding the absolute closest waypoint, maybe find the minimum (by index) waypoint within a predefined "search radius". This would support snapping (within reason)
    • this poses the following tradeoff: a greater search radius provides a larger snapping distance, but may not snap to the closest waypoint
  • use a heuristic of minimum waypoint index and distance between robot and pose to determine which waypoint to snap to.

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

Take the following global plan for example:
blue are poses in the global plan, and red is the robot pose.
snapping
e

Now, assuming that nav2 has been incrementally pruning the start of the global plan, in this situation, the controller should snap to pose 2. But how do you describe that as an algorithm? Closest pose won't work, since that would snap to pose 10. If you do a simple search radius paired with minimum waypoint id, then if search_radius = 4, then it will snap to 2. But if search_radius = 5, then by the same algorithm, it would snap to pose 1, which is not ideal.

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

Another option would be be to get the closest waypoint between the start of global_plan and the 180 degree turnaround. That should snap to pose 2, and that should support many, but not all cases where people want to snap to paths that don't overlap.

@Aposhian
Copy link
Copy Markdown
Contributor

Aposhian commented Feb 8, 2022

@SteveMacenski I have made some modifications on my fork based off of @sathak93 's work. Should I reflect those as suggested changes in GitHub here? Or should I create another PR?

@SteveMacenski
Copy link
Copy Markdown
Member

SteveMacenski commented Feb 11, 2022

I'm not entirely sure I'm following you.

if we implement an upper bound to prevent jumping to future points, then we are implicitly increasing our reliance on the lower bound provided by the global_plan pruning

Correct, but that should be pretty good, no? I haven't run into any issues w.r.t. this. We're recursively updating quickly, so there should be little change between iterations since its being run at 10-1000 hz. So as long as we can detect the robot starting pose effectively (what the role of this PR is to improve), the pruning should be spot on. I think that's safe to rely on, and in fact we already do so there's really not much change w.r.t. that.

What this means that if we make the change this way, it may break backward compatibility with some users' expectations by removing the controllers capability to "snap" to any point on the path, provided by the current behavior of just grabbing the closest pose to the robot.

This is where maybe your wording is confusing me. It should always select the path point that it was tracking in time = t-1 plus 1-N waypoint markers depending on the speed and density of the path. It should be going sequentially.

If I'm understanding what you mean by "snapping", you mean that it jumps to another part on the path that is technically closer, but then skips some parts of the path. That is a bug, not desired behavior (and what we're trying to fix here). I don't think anyone expects that behavior for us to want to retain it for backward compatibility. Certainly that is not a feature that we advertise or I would ever think to be desirable. Especially for a pure pursuit who's goal is to purely follow.

The logic as written's goal is to find the closest point on the path that the robot is, for which it assumes perfect path tracking capability. If the path tracking is bad so that its actually closer to another part of the zig-zag, I don't think you want it to select a point that is significantly further along the route than you were in just 1 iteration prior based on that error.

always prioritize earlier portions of the path over later portions

maybe find the minimum (by index) waypoint within a predefined "search radius".

Isn't what this PR is doing? But instead of a radius search, its limiting the search for what path point is most close to the robot by changing the search condition from point 0 -> point len(path) to now point 0 -> point N where N is bounded by an integrated distance from point 0, which we're discussing how to find N in the comments above (probably based on the length of a side of the costmap or similar). So we're limiting the search by the integrated path distance from the point 0 which is recursively updated as essentially right behind the robot pose, essentially.

On your diagram: I think if you look at the PR, that's the point of L604-L611's changes. We're restricting the search via the local window because we know that we're running this at 1, 5, 10, 100 hz, so its frequently. Therefore we know we shouldn't be "very far" from the last iteration so we're recursively able to update knowing we're not "far" from the last update. There's no need for the radius based search because we can just prune to only look in a local region ahead which can be set by a parameter or costmap size.

The costmap size would be most useful, but that only works if you maneuvers aren't going to be squiggling multiple times within the 2x2 or 4x4 meter window. If you are making those small finite turns, we may want to instead make it based on a parameter so that it can be set lower. But for any normal system, you'd never expect to have something like what you're showing in your figure all bounded within the local costmap size. It would be a larger trend several times larger for rows / etc. But your diagram has no scale so I can't tell if you're talking about that squiggle being a 2x2m window or a 20x20m window.

For configurability, maybe its a good idea to make it a parameter, but inform users in the configuration guide that a good bet for it is the costmap side length, unless making very small maneuvers that might overlap like what you're showing.

Or should I create another PR?

Another PR would be good and point out what you changed so its easy for me to tell on review.

@Aposhian
Copy link
Copy Markdown
Contributor

Sorry if my comments were confusing, since I was bringing up a few different alternatives. I do agree that parameterized closest pose is better than using a search radius, which is not as robust. I think the bound should be up to the local costmap OR the nearest turnaround point.

This is where maybe your wording is confusing me. It should always select the path point that it was tracking in time = t-1 plus 1-N waypoint markers depending on the speed and density of the path. It should be going sequentially.

I suppose most users would not experience this "snapping" behavior since they use planners that always start the global plan from the current robot pose. Our planner doesn't necessarily do this, and this is why we have observed the "snapping" behavior since it just grabs the closest pose.

My modifications are here: #2812

@SteveMacenski
Copy link
Copy Markdown
Member

@sathak93 thank you kindly for jump starting this discussion and developing the core ideas to solve this problem. It was very clever and didn't occur to me and a few others so we really appreciate you jumping in and I hope to have your help again in the future!

I'm going to close this ticket since #2812 supersedes with a slight variation on your technique to work better with another user's experiences to make sure this can work for everyone's needs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RPP/DWA/TEB] When NavThroughPose has an intersecting loop, it can occasionally pick the wrong path segment

3 participants