Description
Background
Casini et al., 2019, show that callbacks can be executed in a different order than the one in which their messages arrived. See this. We also have an example for proof. This is usually not what is expected, and it happens by accident rather than by design.
Goal
We want to execute messages in importance order, which -- in the absence of other priorities -- is usually message arrival order.
Problem
To execute in message-arrival order, the executor needs ordering information. The rmw API currently cannot provide it, because the rmw_wait_set only carries binary information: either data is available on a topic or it is not. We don't know how old that data is. Moreover, a topic may have more than one message waiting, and some or all of them may be older than a message waiting on a different topic. These additional messages are currently skipped until the next invocation of rmw_wait.
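To make the problem concrete, here is a minimal Python sketch of an executor that only gets ready/not-ready flags. It is a simplified model, not rclcpp or rmw code: `run_binary_wait_executor`, the topic names, and the integer timestamps are all made up for illustration, and it assumes every message has already arrived before the first wait.

```python
from collections import deque


def run_binary_wait_executor(arrivals, handle_order):
    """Simulate an executor that only sees ready/not-ready flags.

    arrivals: list of (arrival_time, topic) pairs.
    handle_order: order in which the executor checks its handles.
    Returns the (arrival_time, topic) pairs in execution order.
    """
    queues = {topic: deque() for topic in handle_order}
    for stamp, topic in sorted(arrivals):
        queues[topic].append(stamp)

    executed = []
    while any(queues.values()):
        # "wait": the only thing we learn is which topics have data.
        ready = [topic for topic in handle_order if queues[topic]]
        # Take one message per ready topic, in handle order. Anything
        # else queued on a topic is left for the next wait.
        for topic in ready:
            executed.append((queues[topic].popleft(), topic))
    return executed


# Two messages on "b" arrive before the single message on "a".
arrivals = [(1, "b"), (2, "b"), (3, "a")]
order = run_binary_wait_executor(arrivals, handle_order=["a", "b"])
print(order)
```

In this model the message on "a" runs first although it arrived last, and the second message on "b" has to wait for another round.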
Options
I see two general approaches to address this:
- We ask the middleware for timestamps on all waiting messages and perform ordering on the executor level.
- We ask the middleware "which object(s), of the ones in the wait_set, should we handle next?" where "next" is typically decided by "has the oldest data".
Q: Can anybody think of different options?
Discussion
Option 1) keeps the current rmw design, but adds more data. This appears more straightforward at first, but since there may be multiple messages waiting for each object, the data size is unpredictable. Also, it is not trivial to obtain this information from the middleware. The naive implementation has to deserialize all messages to get the SampleInfo. Alternatively, we could keep a listener attached at all times, and use it to determine arrival time. Or, we could modify the middleware to maintain this information for us without having to deserialize.
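A rough sketch of what the executor-side ordering in option 1 could look like, assuming the middleware somehow hands us the arrival timestamps of all waiting messages per topic, oldest first. The `pending` dictionary and `order_by_arrival` are hypothetical and not part of any rmw API.

```python
import heapq


def order_by_arrival(pending):
    """Merge per-topic timestamp lists into one arrival-ordered plan.

    pending: dict mapping topic -> list of arrival timestamps, oldest
    first (hypothetical data the middleware would have to provide).
    Returns a list of (timestamp, topic) pairs, oldest first.
    """
    streams = (
        [(stamp, topic) for stamp in stamps]
        for topic, stamps in pending.items()
    )
    # Each per-topic list is already sorted, so a k-way merge suffices.
    return list(heapq.merge(*streams))


pending = {"a": [3], "b": [1, 2]}
plan = order_by_arrival(pending)
print(plan)
```

Note that the plan has one entry per waiting message, which is exactly the unpredictable data size mentioned above.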
Option 2) either changes rmw_wait or adds a new function with the new semantics. This will probably require more modifications in the rmw implementations, but it would also give those implementations more room to optimize how they obtain this information. It would bound the data size as well, and could even make use of QoS information on the middleware layer.
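For comparison, a sketch of the option 2 semantics, where the executor asks "which object should I handle next?" and gets back a single answer. `FakeMiddleware`, `next_ready`, `take`, and `spin_some` are stand-ins invented for this example. They do not correspond to existing rmw functions, and "oldest data first" is just the default policy assumed here.

```python
from collections import deque


class FakeMiddleware:
    """Hypothetical stand-in for an rmw implementation (not a real API)."""

    def __init__(self):
        self._queues = {}

    def publish(self, topic, stamp, payload):
        self._queues.setdefault(topic, deque()).append((stamp, payload))

    def next_ready(self, wait_set):
        """Return the topic in wait_set holding the oldest sample, or None."""
        candidates = [
            (self._queues[topic][0][0], topic)
            for topic in wait_set
            if self._queues.get(topic)
        ]
        if not candidates:
            return None
        return min(candidates)[1]

    def take(self, topic):
        return self._queues[topic].popleft()


def spin_some(middleware, wait_set):
    """Handle everything that is waiting, asking for one object at a time."""
    executed = []
    while (topic := middleware.next_ready(wait_set)) is not None:
        stamp, payload = middleware.take(topic)
        executed.append((stamp, topic, payload))
    return executed


mw = FakeMiddleware()
mw.publish("b", 1, "b1")
mw.publish("b", 2, "b2")
mw.publish("a", 3, "a1")
result = spin_some(mw, wait_set=["a", "b"])
print(result)
```

The executor only ever sees one answer per call, so the exchanged data stays small, and the ordering policy lives entirely inside the middleware where it could take QoS into account.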