Design the World Model #565

@orduno

Description

Background

What is the world model?

The robot's image or mental model of the world.

Jay Wright Forrester defined a mental model as:

The image of the world around us, which we carry in our head, is just a model. Nobody in his head imagines all the world, government or country. He has only selected concepts, and relationships between them, and uses those to represent the real system.

Navigation decisions and actions are made based on this model.

As shown below, the world model is populated with information coming from sensing, perception, and mapping; and supplies information to the navigation sub-modules.

[figure: overview]

To guide our design decisions on the World Model, let's take a closer look at the various kinds of inputs and outputs.

Inputs to World Model

Let's consider the modules that provide information to the world model.

Perception

The perception module provides input to the world model mainly to account for changes in the environment, from both moving objects and stationary objects with dynamic attributes, e.g. a traffic light.

Currently, moving objects are mostly accounted for by the obstacle layer of costmap_2d, which processes the raw output of a laser scanner.
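As a rough sketch of what that obstacle-layer update does, the following marks laser-scan endpoints into a 2D grid. The `Grid` class, the cost value, and `markScan` are illustrative stand-ins, not the actual costmap_2d API:

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Minimal grid: world coordinates are mapped to cells via a resolution
// (meters per cell). All values here are illustrative.
struct Grid {
  int width, height;
  double resolution;
  std::vector<unsigned char> cells;
  Grid(int w, int h, double res)
      : width(w), height(h), resolution(res), cells(w * h, 0) {}
  void mark(double wx, double wy, unsigned char cost) {
    int mx = static_cast<int>(wx / resolution);
    int my = static_cast<int>(wy / resolution);
    if (mx >= 0 && mx < width && my >= 0 && my < height)
      cells[my * width + mx] = cost;
  }
  unsigned char at(int mx, int my) const { return cells[my * width + mx]; }
};

// Convert each laser return (range, bearing) taken at robot pose
// (ox, oy, otheta) into a world point and mark it as an obstacle.
void markScan(Grid &grid, double ox, double oy, double otheta,
              const std::vector<std::pair<double, double>> &returns) {
  const unsigned char LETHAL = 254;  // illustrative "lethal obstacle" cost
  for (const auto &r : returns) {
    double wx = ox + r.first * std::cos(otheta + r.second);
    double wy = oy + r.first * std::sin(otheta + r.second);
    grid.mark(wx, wy, LETHAL);
  }
}
```

This is roughly the kind of sensor-specific processing the design improvements below propose to move out of the world model and into the perception module.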

Design Improvements

  • Ideally, we would encapsulate all sensor processing in the perception module.
  • World model would in some cases only receive higher-level descriptions of objects.

Maps & Map Server

The map server provides a priori information about the environment, mostly of stationary objects, in the form of a map. Maps can also contain dynamic information about some of these objects, e.g. traffic, road closures, etc.

Currently, the map server is only capable of processing and providing grid/cell-based (metric) types of map representations.

Design Improvements

  • Support additional map representations:
    • Topological
      • Multiple graphs for a world
      • Robot can travel between nodes (assumption)
    • (Virtual) Road network (for lane-based nav)
    • Vector-based
    • K-d Tree (Quad/Oct trees)
    • Hybrid
  • Enable update of dynamic information
  • Enable servicing from multiple maps (stitching multiple maps)
  • Enable servicing sub-sections of maps

Related issues: #18
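One of the improvements listed above, servicing sub-sections of maps, could look something like the following sketch. `GridMap` and `subMap` are hypothetical names, not an existing map_server interface:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative row-major occupancy grid.
struct GridMap {
  int width, height;
  std::vector<int8_t> data;
};

// Return the rectangular window [x0, x0+w) x [y0, y0+h) of `map`.
// Assumes the requested window lies fully inside the map bounds.
GridMap subMap(const GridMap &map, int x0, int y0, int w, int h) {
  GridMap out{w, h, std::vector<int8_t>(w * h)};
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x)
      out.data[y * w + x] = map.data[(y0 + y) * map.width + (x0 + x)];
  return out;
}
```

A client that only needs the robot's local neighborhood could request such a window instead of the full map, which also hints at how stitching multiple maps might be exposed behind the same query.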

Open Questions

  • Map representation is to some extent planner-dependent, therefore compatible map representations and planners might need to be selected at launch time.
  • For incompatible types, do we support transformation between types?
  • Some generalization might be possible if we consider maps as graphs and use planners that search on graphs.

Outputs from World Model

Let's consider the consumers of the information contained in the world model, aka the clients.

Clients operate on different length-scales and use different layers or aspects of the world model.

(Global) Path Planning

Can operate on a road network, topology map, or global map (sub-sampled occupancy grid or k-d tree).

These are coupled with the map representation being used.

Currently, only planners that operate on a costmap are supported.

Design Improvements

  • Extend interface to support a wider range of planning paradigms. Here are some general families of planners:
    • Potential Field
    • Graph Search (Families of A*, D* and E*, Wavefront)
      • At the lowest level, these use the world representation to obtain the nodes reachable from the current one, for collision checking, and for getting the cost of a cell (more on this below).
    • Combinatorial / Roadmaps generation
      • Visibility graph
      • Voronoi diagrams
      • Exact & Approximate Cell Decomposition
    • Sampling-based
      • Deterministic / Lattice-based (Hybrid-A*, ARA*, LARA*)
      • Random (R*, PRM, RRT, EST, and variants)
    • Optimal Control (LQR/G, iLQR, MPC)
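To make the graph-search case concrete, here is a sketch of the narrow interface such a planner needs from the world model: reachable neighbors, collision checking, and per-cell cost. The class and function names are illustrative, not an existing nav2 API; the search shown is plain Dijkstra for brevity:

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

struct Cell { int x, y; };

// Hypothetical read-only view of the world model.
class WorldModelView {
 public:
  WorldModelView(int w, int h, std::vector<int> cost)
      : w_(w), h_(h), cost_(std::move(cost)) {}
  bool inCollision(const Cell &c) const {
    return c.x < 0 || c.x >= w_ || c.y < 0 || c.y >= h_ ||
           cost_[c.y * w_ + c.x] >= 100;  // illustrative lethal threshold
  }
  int cellCost(const Cell &c) const { return cost_[c.y * w_ + c.x]; }
  std::vector<Cell> neighbors(const Cell &c) const {  // 4-connected
    std::vector<Cell> out;
    for (const auto &d : std::vector<Cell>{{1, 0}, {-1, 0}, {0, 1}, {0, -1}}) {
      Cell n{c.x + d.x, c.y + d.y};
      if (!inCollision(n)) out.push_back(n);
    }
    return out;
  }
  int w_, h_;
  std::vector<int> cost_;
};

// Dijkstra over the interface above; returns the cost of the cheapest
// path from start to goal, or -1 if the goal is unreachable.
int planCost(const WorldModelView &wm, Cell start, Cell goal) {
  std::vector<int> dist(wm.w_ * wm.h_, -1);
  using QE = std::pair<int, int>;  // (cost so far, cell index)
  std::priority_queue<QE, std::vector<QE>, std::greater<QE>> q;
  dist[start.y * wm.w_ + start.x] = 0;
  q.push({0, start.y * wm.w_ + start.x});
  while (!q.empty()) {
    auto [d, i] = q.top();
    q.pop();
    if (d != dist[i]) continue;  // stale queue entry
    Cell c{i % wm.w_, i / wm.w_};
    if (c.x == goal.x && c.y == goal.y) return d;
    for (const auto &n : wm.neighbors(c)) {
      int nd = d + wm.cellCost(n);
      int ni = n.y * wm.w_ + n.x;
      if (dist[ni] < 0 || nd < dist[ni]) {
        dist[ni] = nd;
        q.push({nd, ni});
      }
    }
  }
  return -1;
}
```

The point of the sketch is that the planner touches the world only through `neighbors`, `inCollision`, and `cellCost`; any core representation able to answer those three queries could back this planner.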

Open Questions

  • For 2D navigation of a mobile platform, we could consider Combinatorial methods as graph construction techniques whose output is then used by a graph search algorithm. We might want to consider this as a capability of the World Model, i.e. producing different types of graphs.
  • Adding a cost to a cell in the world model encodes some type of information, for example, to avoid possible collisions given the robot footprint or to define a region preference. We might, however, want to handle this differently.
  • See Map Server section for some other questions related to the planner.

(Local Path Planning) Obstacle Avoidance and Control

Operates on a higher resolution local map representation, for example, an occupancy grid.

Attempts to follow the global path while correcting for obstacles in a dynamic environment. Provides the control inputs to the robot.

These are planner-dependent.

Currently, nav2 provides a DWA-based controller, nav2_dwb_controller. This has its own internal representation of the world (nav2_costmap_2d) with direct access to raw sensor data.

Design Improvements

  • Consolidate the world model. Create an interface in the World Model to support DWA.
    • DWA uses the world model to check if a proposed trajectory is free of collisions.
  • Extend interface to support other local planners (Add support for additional local planner algorithms #202).
    • 5D Planning -- runs a graph search planner (A*) under the hood.
    • Techniques used in Control Theory can be used to generate and follow local trajectories, i.e. Optimal Control (LQR/G, MPC).
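The trajectory-validity query that DWA needs from a consolidated world model could be as narrow as the sketch below. `Pose`, `trajectoryIsValid`, and the collision predicate are illustrative, not nav2's actual interface:

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct Pose { double x, y; };

// A proposed trajectory is valid iff none of its sampled poses
// collides. The collision test itself is delegated to the world
// model, keeping the controller free of any map representation.
bool trajectoryIsValid(const std::vector<Pose> &traj,
                       const std::function<bool(const Pose &)> &inCollision) {
  for (const auto &p : traj)
    if (inCollision(p)) return false;
  return true;
}
```

Because the controller only supplies sampled poses and receives a yes/no answer, the same query would serve other local planners (e.g. lattice or optimal-control based ones) without exposing the underlying costmap.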

Open Questions

  • TBD

Motion Primitives & Recovery

Currently, motion primitives do not interact with the world model. A pull request (#516) is open that would add collision checking.

In ROS1, both the global and local costmap-based representations were passed to the recovery behaviors.

Design Improvements

  • Define the types of services needed by motion primitives, only for collision checking?
  • Define the types of services needed to support recovery; currently the only one identified is clearing the structure (Add clear costmaps recovery #406).

Design

Goal

Design a world / environmental model for 2D navigation.

Objectives:

Summarizing the design improvements discussed above:

  • Support different levels of navigation, i.e. unstructured (no specific rules or reference path / lane), structured (predefined rules), etc.
  • Add support for different types of planners and controllers.
  • Consolidate the world model. Define a clean interface. Avoid code replication.
  • Improve the integration of perception pipelines.
  • Create a design that can be extended to a multi-robot scenario with distributed parts of the world-model.

Proposal

Given the extent of the change, we'll have to implement the design in multiple phases.

In the first phase, we can separate the world model from the clients and make them separate nodes.

[figure: phase0]

In the second phase, we can define the new modules and port the current costmap-based world model. Below is a high-level diagram; the components are explained in the sections that follow. The main point of this phase is to remove the dependency between the core representation and the type of client. We do this by defining plugins that translate the information of the core into something useful to the client; similarly, we also define plugins for the inputs.

[figure: proposal_overview]

In the following phases, we can extend this by introducing other map formats (beyond grid-based maps) and perception pipelines. We can also support multiple internal representations. Eventually, we might have something like this:

[figure: goal]

Core Representation

The core representation is a module expressive enough to represent the world, at least for the purposes of navigation. Ideally, this could be an internal simulator that we can ask anything about the world. By querying this internal simulator, we can build the structures needed by the navigation sub-modules.

We might want to experiment with different types of core representations with different levels of expressiveness. We can initially use costmaps but eventually move to scene-graphs that support a semantic interface.

Additionally, multiple representations might be appropriate, e.g. robot-centric and world-centric.
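The "ask anything about the world" idea can be illustrated with a tiny object store that answers semantic queries. Everything here, `WorldObject`, `Core`, and the query itself, is a hypothetical sketch of what a scene-graph-like core interface might offer, not an existing component:

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Illustrative semantic object: a typed entity with a 2D position.
struct WorldObject {
  std::string type;
  double x, y;
};

// Hypothetical core that clients query instead of reading a raw grid.
class Core {
 public:
  void add(WorldObject o) { objects_.push_back(std::move(o)); }

  // "What objects of this type lie within radius r of (x, y)?"
  std::vector<WorldObject> query(const std::string &type, double x, double y,
                                 double r) const {
    std::vector<WorldObject> hits;
    for (const auto &o : objects_)
      if (o.type == type && std::hypot(o.x - x, o.y - y) <= r)
        hits.push_back(o);
    return hits;
  }

 private:
  std::vector<WorldObject> objects_;
};
```

A costmap-backed core would answer geometric queries instead (occupancy, cost), but the shape of the interface, query in, structure out, is the same, which is what lets us swap representations behind it.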

Open Questions

  • Single vs Multiple Representations.
    • Robot-centric vs world-centric; map/client-dependent: costmap, topology, object-based, etc.
  • Support object-based representations? Such as ED or BRICS.
  • What other core representations are we interested in eventually supporting? Scene graph, costmap, etc.
  • Future support for probabilistic modeling?

Planner Plugin

The planner plugin extracts information from the core representation to create the structure needed by the planner.

Open Questions

  • We need to address the map-planner dependency issue. If the map server provides the map representation in a given format (e.g. an elevation map) but the planner uses a costmap, what component does the conversion?
  • Also, we might have different graph generation methods and representations; perhaps a graph-search planner could be generalized to work independently of the graph. However, there is a dependency on the robot dynamics, e.g. when generating a lattice:
    • 2D Grid
    • Lattice (from motion discretization i.e. primitives)
    • Cell Decomposition
    • Topology (connectivity of places)
  • Topological maps. What are the use cases? As a replacement for the metric map or in addition to it? Annotated with information on how to navigate from one place to another? Is the navigation goal defined as a node on this map?

Control / Collision Avoidance Plugin

The control plugin extracts information from the core to create a useful structure for a controller/local planner.

Open Questions

  • What is the interface between the world model client and the controller?

Map to Core Plugin

Gets the map from the server and populates the core.

Open Questions

  • Should the map server update the dynamic aspects of the map at runtime, e.g. traffic given some external traffic-monitoring system?
  • Could perception also update dynamic aspects?
  • (other questions?)

Sensing to Core Plugin

Gets low-level input (sensor data streams) or high-level input (objects with metadata) and populates/updates the core.

Open Questions

  • (questions?)

Performance

Concerns

  • (add here)

Next Steps

Phase 0:

  • Extract the world model from the clients and encapsulate it into separate nodes. World models will be client-dependent. No new additional functionality.

Phase 1: Grid-based core using costmap_2d.

  • Stub out the plugins, core, and client:
    • Map-to-Core
    • Sensing-to-Core
    • (Global) Planner plugin
    • (Local Planner) Controller plugin
    • Motion primitives plugin
    • Core and layers
    • Client
  • Define the core using a costmap_2d.
  • Sensing-to-Core plugin to use laser scan and update obstacle layer in costmap_2d.
  • Map-to-Core to use map to fill static layer in costmap_2d.
  • Global planner plugin to get master layer of costmap_2d used by navfn_planner.
    • Client to support collision checking and providing cell cost.
    • Update navfn_planner to use new interface.
  • Controller plugin to get master layer.
    • Client to support checking if trajectory is navigable.
    • Update dwb to use new interface.
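One way the Phase 1 client could be stubbed is an abstract class exposing just the queries identified above: collision checking, cell cost, and trajectory navigability. Both class names below are hypothetical; the concrete subclass is only a stand-in to show the shape of an implementation:

```cpp
#include <cassert>
#include <vector>

struct Pose2D { double x, y, theta; };

// Hypothetical client-facing interface of the world model.
class WorldModelClient {
 public:
  virtual ~WorldModelClient() = default;
  virtual bool inCollision(const Pose2D &pose) const = 0;
  virtual unsigned char cellCost(int mx, int my) const = 0;
  // Default: a trajectory is navigable iff every sampled pose is free.
  virtual bool trajectoryIsNavigable(const std::vector<Pose2D> &traj) const {
    for (const auto &p : traj)
      if (inCollision(p)) return false;
    return true;
  }
};

// Trivial stand-in implementation for illustration: a single unit
// circle obstacle at the origin; all cells report zero cost.
class CircleObstacleModel : public WorldModelClient {
 public:
  bool inCollision(const Pose2D &p) const override {
    return p.x * p.x + p.y * p.y < 1.0;
  }
  unsigned char cellCost(int, int) const override { return 0; }
};
```

With this split, navfn_planner would use `inCollision`/`cellCost` while dwb would use `trajectoryIsNavigable`, and the costmap_2d-backed core stays hidden behind the interface.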
