rosbag2_transport test_record can segfault #138

@pbaughman

I observed this failure in CI, and again on my machine. I was able to get a stack trace from GDB by running the following:

terminal 1> colcon build --packages-up-to rosbag2_transport
terminal 1> source install/setup.bash
terminal 1> while build/rosbag2_transport/test_record; do :; done  # Loop until the test fails, or in this case hangs

terminal 2> sudo gdb -p 1885  # The PID of the running test
(gdb) where

The stack trace is as follows:

#0  0x00007f9c6b11998d in pthread_join (threadid=140308771751680, thread_return=0x0) at pthread_join.c:90
#1  0x00007f9c6a0fcb97 in std::thread::join() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00000000004e7d23 in std::_Mem_fn_base<void (std::thread::*)(), true>::operator()<, void>(std::thread&) const (this=0x7ffeb637a6e8, __object=...) at /usr/include/c++/5/functional:583
#3  0x00000000004d4286 in std::_Mem_fn_base<void (std::thread::*)(), true>::operator()<std::thread, , void>(std::reference_wrapper<std::thread>) const (this=0x7ffeb637a6e8, __ref=...) at /usr/include/c++/5/functional:619
#4  0x00000000004bb211 in std::_Bind_simple<std::_Mem_fn<void (std::thread::*)()> (std::reference_wrapper<std::thread>)>::_M_invoke<0ul>(std::_Index_tuple<0ul>) (this=0x7ffeb637a6e0) at /usr/include/c++/5/functional:1531
#5  0x000000000049dc56 in std::_Bind_simple<std::_Mem_fn<void (std::thread::*)()> (std::reference_wrapper<std::thread>)>::operator()() (this=0x7ffeb637a6e0) at /usr/include/c++/5/functional:1520
#6  0x000000000047d5aa in std::__once_call_impl<std::_Bind_simple<std::_Mem_fn<void (std::thread::*)()> (std::reference_wrapper<std::thread>)> >() () at /usr/include/c++/5/mutex:706
#7  0x00007f9c6b11fa99 in __pthread_once_slow (once_control=0x41f8578, init_routine=0x44ed70 <__once_proxy@plt>) at pthread_once.c:116
#8  0x00000000004501d0 in __gthread_once (__once=0x41f8578, __func=0x44ed70 <__once_proxy@plt>) at /usr/include/x86_64-linux-gnu/c++/5/bits/gthr-default.h:699
#9  0x0000000000467da2 in std::call_once<void (std::thread::*)(), std::reference_wrapper<std::thread> >(std::once_flag&, void (std::thread::*&&)(), std::reference_wrapper<std::thread>&&) (__once=..., __f=<unknown type in /home/pete.baughman/ws/build/rosbag2_transport/test_record, CU 0x0, DIE 0xd4d17>)
    at /usr/include/c++/5/mutex:738
#10 0x0000000000460b90 in std::__future_base::_Async_state_commonV2::_M_join (this=0x41f8550) at /usr/include/c++/5/future:1638
#11 0x0000000000460af4 in std::__future_base::_Async_state_commonV2::_M_complete_async (this=0x41f8550) at /usr/include/c++/5/future:1636
#12 0x000000000045fd11 in std::__future_base::_State_baseV2::wait (this=0x41f8550) at /usr/include/c++/5/future:319
#13 0x0000000000466ecf in std::__basic_future<void>::_M_get_result (this=0x386f240) at /usr/include/c++/5/future:681
#14 0x000000000046096f in std::future<void>::get (this=0x386f240) at /usr/include/c++/5/future:846
#15 0x0000000000461158 in rosbag2_test_common::PublisherManager::run_publishers(std::function<unsigned long (std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)>) (this=0x2415a08, count_function=...)
    at /home/pete.baughman/ws/install/include/rosbag2_test_common/publisher_manager.hpp:77
#16 0x0000000000463531 in RecordIntegrationTestFixture::run_publishers (this=0x2415910) at path/rosbag2_transport/test/rosbag2_transport/record_integration_fixture.hpp:68
#17 0x0000000000459308 in RecordIntegrationTestFixture_published_messages_from_multiple_topics_are_recorded_Test::TestBody (this=0x2415910) at path/rosbag2_transport/test/rosbag2_transport/test_record.cpp:46
#18 0x00000000005b4e3a in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void> (object=0x2415910, method=&virtual testing::Test::TestBody(), location=0x600c0b "the test body") at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2447
#19 0x00000000005a8a3d in testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void> (object=0x2415910, method=&virtual testing::Test::TestBody(), location=0x600c0b "the test body") at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2483
#20 0x000000000055e6ba in testing::Test::Run (this=0x2415910) at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2523
#21 0x000000000055fc4a in testing::TestInfo::Run (this=0x2414220) at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2703
#22 0x0000000000560978 in testing::TestCase::Run (this=0x2415330) at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2825
#23 0x0000000000579e88 in testing::internal::UnitTestImpl::RunAllTests (this=0x2414f70) at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:5226
#24 0x00000000005b6a8c in testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x2414f70, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x57995a <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x601610 "auxiliary test code (environments or event listeners)") at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2447
#25 0x00000000005aaa18 in testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool> (object=0x2414f70, method=(bool (testing::internal::UnitTestImpl::*)(testing::internal::UnitTestImpl * const)) 0x57995a <testing::internal::UnitTestImpl::RunAllTests()>, 
    location=0x601610 "auxiliary test code (environments or event listeners)") at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:2483
#26 0x00000000005770f1 in testing::UnitTest::Run (this=0x984680 <testing::UnitTest::GetInstance()::instance>) at /home/pete.baughman/ws/install/src/gtest_vendor/src/gtest.cc:4834
#27 0x000000000054cf86 in RUN_ALL_TESTS () at /home/pete.baughman/ws/install/src/gtest_vendor/include/gtest/gtest.h:2372
#28 0x000000000054ce55 in main (argc=1, argv=0x7ffeb637b2f8) at /home/pete.baughman/ws/install/src/gmock_vendor/src/gmock_main.cc:53
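
For anyone reading the trace: frames #0 to #14 show the main thread inside std::future<void>::get(), joining the thread behind an asynchronous task that never finishes. The snippet below is a minimal, standalone sketch of that pattern and of one way to bound the wait with wait_for. It is not the rosbag2 code. spin_until_stopped, finished_within and demo_detects_stuck_task are hypothetical names made up for illustration, and the 100 ms timeout is arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <thread>

// Hypothetical stand-in for a publisher task: it spins until `stop` is set,
// mimicking a worker that waits on a condition that may never become true.
void spin_until_stopped(const std::atomic<bool> & stop)
{
  while (!stop.load()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
  }
}

// Returns true if the task finished within `timeout`, false if it is still
// running. Unlike future::get(), this never blocks longer than `timeout`.
bool finished_within(std::future<void> & task, std::chrono::milliseconds timeout)
{
  assert(task.valid());
  return task.wait_for(timeout) == std::future_status::ready;
}

// Detects the stuck task, then releases it so that get() can join the
// worker thread instead of blocking forever.
bool demo_detects_stuck_task()
{
  std::atomic<bool> stop{false};
  std::future<void> task =
    std::async(std::launch::async, spin_until_stopped, std::cref(stop));

  const bool stuck = !finished_within(task, std::chrono::milliseconds(100));

  stop.store(true);  // release the worker; without this, get() would hang
  task.get();        // joins the worker thread, as in frames #0 to #14
  return stuck;
}
```

A bounded wait like this would not fix whatever keeps the worker from finishing, but it might turn an indefinite hang into a test failure that is easier to spot in CI.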

I've got the following threads:

(gdb) info threads                                                                                                                                         
  Id   Target Id         Frame                                                                                                                             
* 1    Thread 0x7f9c6c6caf40 (LWP 1885) "test_record" 0x00007f9c6b11998d in pthread_join (threadid=140308771751680, thread_return=0x0) at pthread_join.c:90
  2    Thread 0x7f9c64076700 (LWP 1886) "test_record" 0x00007f9c6b120827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0,            
    futex_word=0x7f9c6bf09780 <rclcpp::SignalHandler::signal_handler_sem_>) at ../sysdeps/unix/sysv/linux/futex-internal.h:205                             
  3    Thread 0x7f9c63875700 (LWP 1887) "test_record" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185       
  4    Thread 0x7f9c63074700 (LWP 1888) "test_record" 0x00007f9c69b6ba13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84                         
  5    Thread 0x7f9c62873700 (LWP 1889) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  6    Thread 0x7f9c62072700 (LWP 1890) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  7    Thread 0x7f9c61871700 (LWP 1891) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  8    Thread 0x7f9c61070700 (LWP 1892) "test_record" pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185       
  9    Thread 0x7f9c53fff700 (LWP 1893) "test_record" 0x00007f9c69b6ba13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84                         
  10   Thread 0x7f9c537fe700 (LWP 1894) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  11   Thread 0x7f9c52ffd700 (LWP 1895) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  12   Thread 0x7f9c527fc700 (LWP 1896) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  13   Thread 0x7f9c515f2700 (LWP 1898) "test_record" 0x00007f9c69b6ba13 in epoll_wait () at ../sysdeps/unix/syscall-template.S:84                         
  14   Thread 0x7f9c50df1700 (LWP 1899) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  15   Thread 0x7f9c2ffff700 (LWP 1900) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  16   Thread 0x7f9c2f7fe700 (LWP 1901) "test_record" 0x00007f9c6b12194d in recvmsg () at ../sysdeps/unix/syscall-template.S:84                            
  17   Thread 0x7f9c2e7fc700 (LWP 1903) "test_record" 0x00007f9c6b121c1d in nanosleep () at ../sysdeps/unix/syscall-template.S:84  

I'm currently on commit d630f8e, because we're not fully up to date with Dashing. I didn't see anything in the changelog or open issues that indicates this has been noticed before. I'm going to try to reproduce this on the nightly Docker image next.
