Hi all,
I've already described my issue here in detail: https://answers.ros.org/question/355644/possible-risk-of-a-deadlock-in-rostimer-impl/
Since I didn't get any reply there, I am posting it here as well because it may be a bug.
Please refer to my linked post above on ROS Answers for the details.
TL;DR: I am using async spinners with multiple threads and a ROS timer. Every callback has to lock a specific mutex before doing anything else. What happened was that my timerCallback() was invoked while that mutex was already locked by another thread, which is perfectly fine so far. However, that other thread was calling timer.stop() while holding the mutex. Since my timerCallback() had not returned (and could not, because it was still blocked waiting for the mutex to be released), timer.stop() waited forever -> deadlock.
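To make the sequence concrete, here is a minimal sketch of the lock pattern in plain C++ without ROS. FakeTimer is a hypothetical stand-in, not ros::Timer; it only assumes that stop() waits for an in-flight callback to return, which is the behavior I observed. Its stop() takes a timeout (also my own addition) so that the sketch reports the stall instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a timer whose stop() waits for a running callback.
// This is NOT the roscpp implementation, only a model of the observed behavior.
class FakeTimer {
public:
  // Marks a callback as in flight, runs it, then signals that it has finished.
  template <typename F>
  void invoke(F&& callback) {
    { std::lock_guard<std::mutex> lk(state_mutex_); running_ = true; }
    cv_.notify_all();
    callback();
    { std::lock_guard<std::mutex> lk(state_mutex_); running_ = false; }
    cv_.notify_all();
  }

  // Blocks until a callback is in flight.
  void waitUntilRunning() {
    std::unique_lock<std::mutex> lk(state_mutex_);
    cv_.wait(lk, [this] { return running_; });
  }

  // Returns true if the in-flight callback finished within the timeout.
  // A stop() without a timeout would wait here forever in the scenario below.
  bool stop(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lk(state_mutex_);
    return cv_.wait_for(lk, timeout, [this] { return !running_; });
  }

private:
  std::mutex state_mutex_;
  std::condition_variable cv_;
  bool running_ = false;
};

// Returns true if stop() stalled, i.e. the real code would have deadlocked.
bool reproduceDeadlock() {
  FakeTimer timer;
  std::mutex app_mutex;  // the mutex every callback locks first

  // The "other thread" (here: the caller) already holds the application mutex.
  std::unique_lock<std::mutex> held(app_mutex);

  // Spinner thread: the timer callback fires and blocks on app_mutex.
  std::thread spinner([&] {
    timer.invoke([&] { std::lock_guard<std::mutex> lk(app_mutex); });
  });

  timer.waitUntilRunning();

  // stop() is called while app_mutex is still held: the callback cannot return.
  const bool stopped = timer.stop(std::chrono::milliseconds(100));

  held.unlock();   // break the cycle so the sketch can terminate
  spinner.join();
  return !stopped;
}
```

In this model, reproduceDeadlock() should return true: stop() cannot complete as long as the caller holds app_mutex, and the callback cannot return until that mutex is released.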
In my opinion, this should not happen. When I call timer.stop() but timerCallback() was already invoked slightly before, the callback should be allowed to continue without timer.stop() waiting for it. Once called, timer.stop() should prevent any new callbacks from being invoked, but it shouldn't care about a callback / event that has already been triggered.
Please take a look and let me know what you think!
This happened on Ubuntu 20.04 LTS with ROS Noetic. Unfortunately, because it was a race condition that occurs only very rarely, I may not be able to reproduce it easily.
Thanks for taking a look!