Currently, the blocking behavior of rcl_publish is not well defined, see:
This was also brought up in the large message publishing pull request: ros2/rmw_connext#183
It would be good to define how publish should behave in different situations. The available options are something like:
- undefined, it may or may not block, and which of the two happens is unspecified and implementation-specific
- blocking (synchronous), it blocks until some predicate is met
- non-blocking (asynchronous), it stores the message to be published and returns immediately
  - actual delivery of the message is done in a different thread
- blocking, but with a timeout
The blocking predicate could be one of "until the data is sent", "until the data is received / acknowledged", or something else.
We can choose to either define the behavior or not. Not defining the behavior would basically be saying "check the implementation", while defining it would potentially make it more difficult for some implementations to adhere to the interface. I'm personally in favor of specifying the behavior and forcing the rmw implementation to ensure the publish function behaves as expected.
Separately we can decide whether or not to allow control of the behavior. As I see it we could do one of these:
- do not allow choosing between sync/async publishing
  - this is basically always using an "auto" mode for selecting the publishing mode
  - it would give the implementation a lot of flexibility
- allow the user to select sync, async, or auto
  - this would force the rmw implementation to support multiple modes
  - this would be burdensome, for example, for implementations that only have sync publishing
- always use async publishing
  - also hard for implementations that only have sync publishing (they would need to implement queueing and have a publishing thread)
- always use sync publishing
  - this is the easiest, assuming the async implementations have a "wait for publish to finish" method (Connext does, not sure about Fast RTPS)
My proposal would be to expose a publish blocking behavior option that is either "sync", "async", or "auto". Also, when using "auto", have a function to detect whether it will be "sync" or "async" and maybe a compile time check (static_assert) for the default auto behavior. Depending on how it interacts with message size the compile time check may not be possible, i.e. perhaps it only works with bounded message types.
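To make the proposal more concrete, here is a minimal sketch of what such an option and its "auto" detection could look like. Every name in it (`PublishMode`, `MessageTraits`, `kAsyncThresholdBytes`, `resolve_auto`, `effective_mode`, `auto_mode_for_type`) is hypothetical and does not exist in rcl or rmw, and the size threshold is only an assumed stand-in for whatever heuristic an implementation would really use. It also illustrates why the compile-time check only works for bounded message types.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical option type; not part of rcl or rmw.
enum class PublishMode { Sync, Async, Auto };

// Hypothetical trait: bounded message types know their maximum
// serialized size at compile time, unbounded ones do not.
template <typename MsgT>
struct MessageTraits {
  static constexpr bool is_bounded = false;
  static constexpr std::size_t max_serialized_size = 0;
};

// Example bounded message type, used only for illustration.
struct SmallBoundedMsg { int x; float y; };

template <>
struct MessageTraits<SmallBoundedMsg> {
  static constexpr bool is_bounded = true;
  static constexpr std::size_t max_serialized_size = 8;
};

// Assumed heuristic: messages above this size are published asynchronously.
constexpr std::size_t kAsyncThresholdBytes = 64 * 1024;

// What "auto" resolves to for a message of the given serialized size.
constexpr PublishMode resolve_auto(std::size_t serialized_size) {
  return serialized_size > kAsyncThresholdBytes ? PublishMode::Async
                                                : PublishMode::Sync;
}

// Runtime query: which concrete mode a publish call would end up using.
inline PublishMode effective_mode(PublishMode requested,
                                  std::size_t serialized_size) {
  PublishMode mode = requested == PublishMode::Auto
                       ? resolve_auto(serialized_size)
                       : requested;
  assert(mode != PublishMode::Auto);  // the result is always concrete
  return mode;
}

// Compile-time query: only meaningful for bounded message types.
template <typename MsgT>
constexpr PublishMode auto_mode_for_type() {
  static_assert(MessageTraits<MsgT>::is_bounded,
                "compile-time resolution needs a bounded message type");
  return resolve_auto(MessageTraits<MsgT>::max_serialized_size);
}

// A node that relies on blocking semantics could guard itself like this.
static_assert(auto_mode_for_type<SmallBoundedMsg>() == PublishMode::Sync,
              "auto is expected to resolve to sync for this message type");
```

For an unbounded type, `auto_mode_for_type` fails to compile, so user code would have to fall back to the runtime query `effective_mode`.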
Implementations that only have sync publishing will need to implement a message queue and a thread that performs the synchronous publish calls in the background. This is burdensome, but I don't see a better way. We could have a rmw_supports_async_publishing() -> bool function, and if the implementation returns false we could provide a generic implementation in the client libraries, maybe even in rcl, in order to reduce the burden. But since our current targets both support async publishing this isn't such a problem.
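As a rough illustration of the queue-and-thread approach, the sketch below wraps a blocking publish call in a generic asynchronous front end. `AsyncPublishQueue` and `SyncPublishFn` are hypothetical names, messages are modeled as plain strings, and nothing here reflects an existing rcl or rmw interface. A real version would also need a bounded queue depth tied to the QoS settings.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical generic wrapper that adds async publishing on top of a
// sync-only publish function.
class AsyncPublishQueue {
public:
  // Stand-in for an implementation's blocking publish call.
  using SyncPublishFn = std::function<void(const std::string &)>;

  explicit AsyncPublishQueue(SyncPublishFn sync_publish)
  : sync_publish_(std::move(sync_publish)), worker_([this] { run(); }) {
    assert(sync_publish_);
  }

  ~AsyncPublishQueue() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_one();
    worker_.join();  // the worker drains any queued messages before exiting
  }

  // Non-blocking publish: store the message and return immediately.
  void publish(std::string msg) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(msg));
    }
    cv_.notify_one();
  }

private:
  // Publishing thread: pops messages in order and sends them synchronously.
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
      if (queue_.empty()) {
        return;  // only reachable once stopping_ is set
      }
      std::string msg = std::move(queue_.front());
      queue_.pop_front();
      lock.unlock();  // do not hold the lock during the blocking call
      sync_publish_(msg);
      lock.lock();
    }
  }

  SyncPublishFn sync_publish_;
  std::mutex mutex_;
  std::condition_variable cv_;
  std::deque<std::string> queue_;
  bool stopping_ = false;
  std::thread worker_;  // declared last so the other members exist first
};
```

The destructor drains the queue before joining, which is one possible policy. Dropping unsent messages on shutdown would be an equally valid choice and is exactly the kind of behavior the interface would need to specify.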
Implementations that only support async publishing (or require it for large messages) would need to emulate synchronous publishing. I don't think that would be too hard, especially if, like Connext, they have a function to wait for publishing to complete. Otherwise some custom synchronization may be necessary.
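The emulation could be sketched as follows, assuming the async publish call hands back something that can be waited on. Here that is modeled with a `std::future<void>` that becomes ready once the chosen unblocking predicate (sent or acknowledged) holds. `AsyncPublishFn`, `PublishResult`, and `publish_sync` are hypothetical names, and the timeout parameter also covers the "blocking, but with a timeout" option listed earlier.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <string>

enum class PublishResult { Ok, Timeout };

// Hypothetical stand-in for an async-only middleware: it starts sending the
// message and returns a promise-backed future that becomes ready when the
// unblocking predicate is met.
using AsyncPublishFn = std::function<std::future<void>(const std::string &)>;

// Emulated synchronous publish: start the async publish, then block until it
// completes or the timeout expires.
inline PublishResult publish_sync(const AsyncPublishFn & async_publish,
                                  const std::string & msg,
                                  std::chrono::milliseconds timeout) {
  assert(async_publish);
  std::future<void> done = async_publish(msg);
  if (done.wait_for(timeout) == std::future_status::ready) {
    done.get();  // rethrows if the middleware reported a failure
    return PublishResult::Ok;
  }
  return PublishResult::Timeout;
}
```

On timeout the message may still be delivered later, so callers cannot treat `Timeout` as "not sent". Whether that is acceptable depends on which predicate is chosen for unblocking.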
For ideas on how the heuristic that controls whether "auto" selects "sync" or "async" could operate see: ros2/rmw_connext#190
I'd recommend also that "auto" be the default for this proposed option. That allows the implementation to select whichever one is most efficient based on the design of the implementation, the QoS settings, and the message type being published. However, it might also be prudent to select either sync or async as a default for consistency in our interface across implementations.
Some points that should be decided before implementing this:
- should there be a heuristic based "auto" option, or just "sync" / "async"?
- should the default be "auto", "sync", or "async"?
- should the option be allowed to change after creating the publisher?
  - either seamlessly or by destroying and recreating the publisher automatically
- what should the predicate for unblocking a sync publish be?
  - either data is sent or data is acknowledged or an option to select one or the other?