Skip to content

Waitsets, listeners and masks#900

Merged
eboasson merged 13 commits intoeclipse-cyclonedds:masterfrom
eboasson:waitset-listener-interaction
Aug 19, 2021
Merged

Waitsets, listeners and masks#900
eboasson merged 13 commits intoeclipse-cyclonedds:masterfrom
eboasson:waitset-listener-interaction

Conversation

@eboasson
Copy link
Copy Markdown
Contributor

@eboasson eboasson commented Aug 4, 2021

This PR changes — "fixes" would be a nicer way of putting it, but there is always the possibility of mistakes ... — the way waitsets, listeners and status masks interact. Some of this met the spec but caused trouble nonetheless, some did not meet the spec but caused suprisingly little trouble.

  • The spec says that prior to listener invocation, the corresponding status is reset (preventing waitsets from triggering). Cyclone implemented this, but this caused complications for Add EventsExecutor ros2/rmw_cyclonedds#256. Sometimes it makes sense to reset the status, but there is definitely also value in supporting listeners without side-effects. This makes it configurable with the new dds_lset_..._arg functions.
  • The status mask did too much: it did indeed allow one to configure which events should trigger waitsets, but it did this by suppressing too much: neither status updates nor listener invocations took place for statuses disabled by the mask. This was clearly a bug, e.g., Status mask is not effective if it is previously set to none #891. (If you don't want a listener, don't install one; if you don't care about a status, don't look at it.)

Both referenced issues should be addressed by this PR.

In the process it also changes the implementation of the DATA_ON_READERS event because maintaining it in the subscriber in combination with the first point turned out to dramatically lower performance on an M1-based MBP (ddsperf -TOU in a throughput test dropping below 3MS/s is clearly unacceptable!). Now, for subscribers not attached to any waitset, it computes the DATA_ON_READERS status on demand from the subscriber's readers.

DATA_ON_READERS is supposed to be raised on along with DATA_AVAILABLE on some reader (of the subscriber), and supposed to be reset by the first operation on some reader (of the subscriber) that resets DATA_AVAILABLE. When it is attached to a waitset, this is what it does; when not attached to a waitset, it actually it returns DATA_ON_READERS if there is some reader that has DATA_AVAILABLE set.

Either behaviour seems usable for applications, and the change in behaviour seems acceptable considering that D_O_R only becomes really valuable for "group ordered" and "group coherent" data, which aren't supported yet anyway.

@dpotman
Copy link
Copy Markdown
Contributor

dpotman commented Aug 5, 2021

I agree with the changes in how the status mask works, the existing approach has always confused me somewhat. I've checked the code changes and couldn't find any issues (although I didn't look into the data-on-readers part in detail, as I don't have much experience in using this).

The name of the dds_lset_..._arg functions confused me a bit, because of the extra option 'reset', which I didn't expect based on the name (but of course it is not really a problem, because the function signature makes clear what it does).

PS: while testing the changes of this PR, I found another issue (#905) which I think is completely unrelated

@eboasson
Copy link
Copy Markdown
Contributor Author

@mauropasse I just realised that you may have missed this PR, which I think should solve the problems with the EventsExecutor on Cyclone. I would appreciate it if you could give it a try.

The default behaviour is unchanged, so to get the desired behaviour you need to change the way you configure the listeners. It is but a small change, e.g., instead of dds_lset_data_available(listener, dds_listener_callback), you'll need dds_lset_data_available_arg(listener, dds_listener_callback, &sub->user_callback_data, false): the false tells it not to reset the event on invoking the callback and as the argument pointer is now configurable per listener, you need to pass the correct listener argument in as well.

@mauropasse
Copy link
Copy Markdown

Hi @eboasson, thanks for your work!
I've tested your branch, now I have succesful test results both with EventsExecutor as well as the waitset based SingleThreadedExecutor.
I created an internal PR to use the new APIs: irobot-ros/rmw_cyclonedds#26
Still didn't have time to get through the inners of this PR, but great that it works!

Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Used only in two places, with behaviour that is confusingly different in
its side-effects from dds_entity_status_set.

Signed-off-by: Erik Boasson <eb@ilities.com>
The documentation comment suggests it is allowed, though it said "NULL"
and didn't say what exactly that means, but in any case 0 was a rather
useless value as it would turn the operation into a no-op.

The wisdom of returning an error if the mask contains status flags not
relevant to the particular entity type is also doubtful, but if 0 is
allowed to simply mean "everything", that is not really much of an
issue.

And with that change, dds_get_status_changes becomes an alias for
dds_read_status with mask = 0.  Deprecating should be considered, as the
"get" is rather ambiguous and can just as easily be interpreted as an
innocent "getter" or as a function that also reset the status.

Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
Signed-off-by: Erik Boasson <eb@ilities.com>
The vast majority of the DDS applications ignores DATA_ON_READERS, as it
really is only useful for group ordered/coherent subscribers, and those
weren't available in any DDS implementation for many years.  Cyclone
doesn't have them either.

At the same time, DATA_ON_READERS is relatively costly, because the spec
says it should always be raised when DATA_AVAILABLE is on any of the
subscriber's readers, and reset when read/take is called on any of the
same.  That roughly means every single time a reader history cache is
updated, the subscriber status has to be manipulated as well.

It turns out that this has a significant performance impact once one
allows the statuses to remain set even when invoking the listener, so
that one can have listeners that monitor without any side-effects.
(Significant here is 3.02MS/s -> 2.95MS/s in ddsperf -TOU pub/sub test
on my laptop, and the significane is perhaps more in sinking below 3MS/s
than in any absolute terms!)

This commit addresses that be only materializing the DATA_ON_READERS
state when the subscriber is attached to a waitset, and returning an
approximation of it when querying the subscriber's status.  The
workaround for getting a materialized DATA_ON_READERS is to create an
otherwise unused waitset with just the subscriber attached to it.

Signed-off-by: Erik Boasson <eb@ilities.com>
These extend on the existing setters/getters as follows:

* Provide a return value for returning BAD_PARAMETER if the "listener"
  argument is a null pointer.

* Allow overriding the listener argument pointer specified in the
  dds_create_listener call.  This is quite useful but the workaround for
  the absence of this argument was a rather convoluted procedure
  combining getting, setting and merging listeners.

* A new boolean argument whether (true) or not (false) the status
  corresponding to the listener should be reset prior to invoking the
  listener (or more precisely: never get set) as described in the DDS
  specification.  Setting it to false allows installing listeners
  without side-effects.

The old functions have been re-written in terms of the extended ones.

Signed-off-by: Erik Boasson <eb@ilities.com>
This provides the implementation of leaving the status flags set in the
entity when invoking a listener if that listener was set with the
"invoke_on_reset" flag clear (via the new, extended interface for
setting listeners), and resetting it as required by the spec otherwise.

Signed-off-by: Erik Boasson <eb@ilities.com>
This brings the behaviour of the status mask in line with the
specification, previously masking an event (i.e., clearing its bit in
the mask) would have two additional, impractical consequences:

* It would prevent a listener from being invoked.  If there is no need
  for a listener, don't set one.

* It would force the corresponding status to 0 (not signalled), making
  it impossible to poll for events outside the status mask.  E.g., with
  a waitset and a mask of DATA_AVAILABLE to wait for the arrival of
  data, it was impossible to observe any liveliness changed events on
  the same reader.

Signed-off-by: Erik Boasson <eb@ilities.com>
This adds a test to verify that invoking a DATA_AVAILABLE listener
configured not to auto-reset the status does indeed allow a waitset to
trigger and that the waitset (generally, there is a short window in time
during which the status can be observed) wakes up after the listener was
invoked.

It tests this both in the case of a listener that doesn't reset
DATA_AVAILABLE internally, and in the case of a listener that does reset
it by calling dds_take().

Signed-off-by: Erik Boasson <eb@ilities.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants