Skip to content

topic statistics's message age include callback duration. #1491

@hsgwa

Description

@hsgwa

Bug report

  • Operating System:
    • ubuntu 20.04
  • Installation type:
    • from source
  • Version or commit hash:
    • foxy branch
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

  1. add 100ms sleep to subscribe callback of topic_statistics_demo.
    Here is source code diff. This code calculates message age in callback as well. see message_age_ms.
  2. build and run demo
$ cd ~/ros2_ws/src
$ git clone -b topic_statistics_issue_reproduce git@github.com:hsgwa/demos.git
$ cd ~/ros2_ws
$ colcon build --symlink-install --packages-select topic_statistics_demo
$ ./build/topic_statistics_demo/display_topic_statistics imu

Expected behavior

[INFO] [1607690935.034977261] [imu_listener]: Listener heard: 1607690935 34339997. message_age_ms: 0.630387
[INFO] [1607690936.035295801] [imu_listener]: Listener heard: 1607690936 34443324. message_age_ms: 0.841733
[INFO] [1607690937.035266102] [imu_listener]: Listener heard: 1607690937 34452405. message_age_ms: 0.805826
[INFO] [1607690938.035036301] [imu_listener]: Listener heard: 1607690938 34468921. message_age_ms: 0.558657
[INFO] [1607690939.035153578] [imu_listener]: Listener heard: 1607690939 34527148. message_age_ms: 0.617793
[INFO] [1607690939.136186369] [statistics_listener]: Statistics heard:
Metric name: message_age source: imu_listener unit: ms
Window start: 135275560 end: 135798117
avg: 0.0  # message age without callback duration.
max: 0.0  # Their values should be almost message_age_ms which is calculated in sub callback.
min: 0.0   # current implementation remove floating point because of chrono::duration_cast
sample_count: 30.000000
std_dev: 0.471405
[INFO] [1607690939.136469027] [statistics_listener]: Statistics heard:
Metric name: message_period source: imu_listener unit: ms
Window start: 135275560 end: 135798117
avg: 999.655172
max: 1000.000000
min: 999.000000
sample_count: 29.000000
std_dev: 0.475312

Actual behavior

[INFO] [1607690935.034977261] [imu_listener]: Listener heard: 1607690935 34339997. message_age_ms: 0.630387
[INFO] [1607690936.035295801] [imu_listener]: Listener heard: 1607690936 34443324. message_age_ms: 0.841733
[INFO] [1607690937.035266102] [imu_listener]: Listener heard: 1607690937 34452405. message_age_ms: 0.805826
[INFO] [1607690938.035036301] [imu_listener]: Listener heard: 1607690938 34468921. message_age_ms: 0.558657
[INFO] [1607690939.035153578] [imu_listener]: Listener heard: 1607690939 34527148. message_age_ms: 0.617793
[INFO] [1607690939.136186369] [statistics_listener]: Statistics heard:
Metric name: message_age source: imu_listener unit: ms
Window start: 135275560 end: 135798117
avg: 100.666667 # message age include callback duration which is 100ms sleep in this sample case.
max: 101.000000
min: 100.000000
sample_count: 30.000000
std_dev: 0.471405
[INFO] [1607690939.136469027] [statistics_listener]: Statistics heard:
Metric name: message_period source: imu_listener unit: ms
Window start: 135275560 end: 135798117
avg: 999.655172
max: 1000.000000
min: 999.000000
sample_count: 29.000000
std_dev: 0.475312

Additional information

Actual behavior seems to be able to achieve by changing order of callback dispatch and now().
https://github.com/ros2/rclcpp/blob/master/rclcpp/include/rclcpp/subscription.hpp#L280

Why do we want to exclude the callback execution time from the message age?
Because if the performance is not as expected, we need to separate the bottleneck between communication and callback execution time.

However, one point to note is that subscribe callbacks delay due to executor's scheduling priority.
If nodes have both subscribe callbacks and timer callbacks, message age possibly include all timer callback duration.

Metadata

Metadata

Assignees

Labels

bugSomething isn't working

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions