Skip to content

Disabled result caching: Result gets erased before result callback is registered #2101

@benthie

Description

@benthie

Bug report

Required Info:

  • Operating System:
    • Ubuntu 20.04
    • Ubuntu 22.04
  • Installation type:
    • binaries
  • Version or commit hash:
    • galactic, foxy, humble
  • DDS implementation:
    • CycloneDDS
  • Client library (if applicable):
    • rclcpp_action

Steps to reproduce issue

I used the Fibonacci ActionServer (C++) and ActionClient (Python) and extended them as follows to reproduce the bug.

In the server I passed in options that disable the result caching and reduced the execute() function to return quicker.

#include <functional>
#include <memory>
#include <thread>

#include "action_tutorials_interfaces/action/fibonacci.hpp"
#include "rclcpp/rclcpp.hpp"
#include "rclcpp_action/rclcpp_action.hpp"
#include "rclcpp_components/register_node_macro.hpp"

#include "action_tutorials_cpp/visibility_control.h"

namespace action_tutorials_cpp
{
class FibonacciActionServer : public rclcpp::Node
{
public:
  using Fibonacci = action_tutorials_interfaces::action::Fibonacci;
  using GoalHandleFibonacci = rclcpp_action::ServerGoalHandle<Fibonacci>;

  ACTION_TUTORIALS_CPP_PUBLIC
  explicit FibonacciActionServer(const rclcpp::NodeOptions & options = rclcpp::NodeOptions())
  : Node("fibonacci_action_server", options)
  {
    using namespace std::placeholders;

    auto serverOptions = rcl_action_server_get_default_options();
    serverOptions.result_timeout.nanoseconds = 0;

    this->action_server_ = rclcpp_action::create_server<Fibonacci>(
      this,
      "fibonacci",
      std::bind(&FibonacciActionServer::handle_goal, this, _1, _2),
      std::bind(&FibonacciActionServer::handle_cancel, this, _1),
      std::bind(&FibonacciActionServer::handle_accepted, this, _1),
      serverOptions);
  }

private:
  rclcpp_action::Server<Fibonacci>::SharedPtr action_server_;

  rclcpp_action::GoalResponse handle_goal(
    const rclcpp_action::GoalUUID & uuid,
    std::shared_ptr<const Fibonacci::Goal> goal)
  {
    RCLCPP_INFO(this->get_logger(), "Received goal request with order %d", goal->order);
    (void)uuid;
    return rclcpp_action::GoalResponse::ACCEPT_AND_EXECUTE;
  }

  rclcpp_action::CancelResponse handle_cancel(
    const std::shared_ptr<GoalHandleFibonacci> goal_handle)
  {
    RCLCPP_INFO(this->get_logger(), "Received request to cancel goal");
    (void)goal_handle;
    return rclcpp_action::CancelResponse::ACCEPT;
  }

  void handle_accepted(const std::shared_ptr<GoalHandleFibonacci> goal_handle)
  {
    using namespace std::placeholders;
    // this needs to return quickly to avoid blocking the executor, so spin up a new thread
    std::thread{std::bind(&FibonacciActionServer::execute, this, _1), goal_handle}.detach();
  }

  void execute(const std::shared_ptr<GoalHandleFibonacci> goal_handle)
  {
    RCLCPP_INFO(this->get_logger(), "Executing goal");
    const auto goal = goal_handle->get_goal();
    auto result = std::make_shared<Fibonacci::Result>();
    result->sequence.push_back(42);
    goal_handle->succeed(result);
    RCLCPP_INFO(this->get_logger(), "Executing done");
  }
};  // class FibonacciActionServer

}  // namespace action_tutorials_cpp

RCLCPP_COMPONENTS_REGISTER_NODE(action_tutorials_cpp::FibonacciActionServer)

In the client I added some more logging to both goal_response_callback and get_result_callback.

import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node

from action_tutorials_interfaces.action import Fibonacci


class FibonacciActionClient(Node):

    def __init__(self):
        super().__init__('fibonacci_action_client')
        self._action_client = ActionClient(self, Fibonacci, 'fibonacci')

    def send_goal(self, order):
        goal_msg = Fibonacci.Goal()
        goal_msg.order = order

        self._action_client.wait_for_server()

        self._send_goal_future = self._action_client.send_goal_async(goal_msg)
        self._send_goal_future.add_done_callback(self.goal_response_callback)

    def goal_response_callback(self, future):
        goal_handle = future.result()
        if not goal_handle.accepted:
            self.get_logger().info('Goal rejected :(')
            return

        self.get_logger().info('Goal accepted :)')

        self.get_logger().info("goal_response_callback: get_result_async")
        self._get_result_future = goal_handle.get_result_async()
        self.get_logger().info("goal_response_callback: add_done_callback(result_callback)")
        self._get_result_future.add_done_callback(self.get_result_callback)
        self.get_logger().info("goal_response_callback: done")

    def get_result_callback(self, future):
        result = future.result()
        self.get_logger().info('Result: status {}'.format(result.status))
        self.get_logger().info('Result: {0}'.format(result.result.sequence))
        rclpy.shutdown()


def main(args=None):
    rclpy.init(args=args)

    action_client = FibonacciActionClient()

    action_client.send_goal(10)

    rclpy.spin(action_client)


if __name__ == '__main__':
    main()

Expected behavior

According to the ROS2 ActionServer design article I'd expect the server to discard the results "after responding to any pending result requests".
The client should log "result_callback: status 4", which is STATUS_SUCCEEDED.

Actual behavior

The server discards the result as soon as it detects it as expired (see server.cpp L500). In some cases the discarding happens before the client has registered the result callback and thus the result is lost and at the time the result_callback is executed the client logs "result_callback: status 0", which is STATUS_UNKNOWN.

Additional information

I also put the rclcpp_action code in my overlay and added and changed some debug messages to log_level WARN so that I could see when the result gets discarded and when STATUS_UNKNOWN is returned by the server. Here is the log in which one can see the failure.

------------------------------------------------------------------------------------------------------------------------
Client
------------------------------------------------------------------------------------------------------------------------
[INFO] [1676025199.686814735] [fibonacci_action_client]: Goal accepted :)
[INFO] [1676025199.687044809] [fibonacci_action_client]: goal_response_callback: get_result_async
[INFO] [1676025199.687553497] [fibonacci_action_client]: goal_response_callback: add_done_callback(result_callback)
[INFO] [1676025199.687765941] [fibonacci_action_client]: goal_response_callback: done
[INFO] [1676025199.688158300] [fibonacci_action_client]: Result: status 0
[INFO] [1676025199.693995159] [fibonacci_action_client]: Result: array('i')

------------------------------------------------------------------------------------------------------------------------
Server
------------------------------------------------------------------------------------------------------------------------
[INFO] [1676025199.677291824] [fibonacci_action_server]: Received goal request with order 10
[DEBUG] [1676025199.677356341] [rcl]: Sending service response
[WARN] [1676025199.677414186] [fibonacci_action_server.rclcpp_action]: Accepted goal 912881d7a81d4a17b8ac8d6e89d14b0
[DEBUG] [1676025199.677736839] [rcl]: Waiting without timeout
[DEBUG] [1676025199.677742997] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1676025199.677747903] [rcl]: Guard condition in wait set is ready
[DEBUG] [1676025199.677753235] [rcl]: Waiting without timeout
[DEBUG] [1676025199.677755049] [rcl]: Timeout calculated based on next scheduled timer: false
[INFO] [1676025199.677770234] [fibonacci_action_server]: Executing goal
[DEBUG] [1676025199.677945233] [rcl]: Timer successfully reset
[DEBUG] [1676025199.677950962] [rcl]: Updated timer period from '0ns' to '0ns'
[DEBUG] [1676025199.677958279] [rcl]: Waiting with timeout: 0s + 0ns
[DEBUG] [1676025199.677961209] [rcl]: Timeout calculated based on next scheduled timer: true
[INFO] [1676025199.678002260] [fibonacci_action_server]: Executing done
[DEBUG] [1676025199.679048075] [rcl]: Timer in wait set is ready
[DEBUG] [1676025199.679071412] [rcl]: Timer canceled
[WARN] [1676025199.679082202] [fibonacci_action_server.rclcpp_action]: Expired goal 912881d7a81d4a17b8ac8d6e89d14b0
[DEBUG] [1676025199.679193269] [rcl]: Timer canceled
[DEBUG] [1676025199.679202156] [rcl]: Waiting without timeout
[DEBUG] [1676025199.679204135] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1676025199.679207484] [rcl]: Guard condition in wait set is ready
[DEBUG] [1676025199.679211155] [rcl]: Waiting without timeout
[DEBUG] [1676025199.679212952] [rcl]: Timeout calculated based on next scheduled timer: false
[DEBUG] [1676025199.687154346] [rcl]: Service in wait set is ready
[DEBUG] [1676025199.687177283] [rcl]: Service server taking service request
[DEBUG] [1676025199.687183095] [rcl]: Service take request succeeded: true
[WARN] [1676025199.687225064] [fibonacci_action_server.rclcpp_action]: Goal does not exist 912881d7a81d4a17b8ac8d6e89d14b0
[WARN] [1676025199.687228270] [fibonacci_action_server.rclcpp_action]: Creating result response with STATUS_UNKNOWN

I think this problem only applies to the Python API, because in C++ the result callback is part of the SendGoalOptions? I understand that the server cannot predict that some Python ActionClient is going to request the result soon. And I am not sure whether this an actual bug or rather a design decision and known issue the client has to deal with. As far as I understand, the Python API does not offer the ActionClient a way to register interest in the result at any earlier point in time. If there should be a way to do so, I am happy to try it out.

We only saw the issue in our Python based tests were the server was executing a rather short execution callback.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions