Thread unsafe signal handling in main_common_test #6083

@htuch

It seems that the libevent signal handler might be invoked on an arbitrary thread. This then causes issues with main_common_test when we enable TSAN on libevent in #6061, e.g. in AdminRequestGetStatsAndKill:

WARNING: ThreadSanitizer: data race (pid=21713)
  Read of size 8 at 0x00000455e1c8 by main thread:
    #0 evsig_handler /build/tmp/_bazel_bazel/b570b5ccd0454dc9af9f65ab1833764d/execroot/envoy/external/com_github_libevent_libevent/signal.c:385:6 (main_common_test+0x386f1c2)
    #1 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) <null> (main_common_test+0x116abb0)
    #2 Envoy::AdminRequestTest_AdminRequestGetStatsAndKill_Test::TestBody() /proc/self/cwd/test/exe/main_common_test.cc:249:3 (main_common_test+0x11edde6)
    #3 void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2424:10 (main_common_test+0x3a1eaa6)
    #4 void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2460:14 (main_common_test+0x3a03f01)
    #5 testing::Test::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2499:5 (main_common_test+0x39e406b)
    #6 testing::TestInfo::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2675:11 (main_common_test+0x39e51a9)
    #7 testing::TestSuite::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2803:28 (main_common_test+0x39e5c14)
    #8 testing::internal::UnitTestImpl::RunAllTests() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:5241:44 (main_common_test+0x39f8b14)
    #9 bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2424:10 (main_common_test+0x3a24976)
    #10 bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:2460:14 (main_common_test+0x3a07f17)
    #11 testing::UnitTest::Run() /proc/self/cwd/external/com_google_googletest/googletest/src/gtest.cc:4843:10 (main_common_test+0x39f857e)
    #12 RUN_ALL_TESTS() /proc/self/cwd/external/com_google_googletest/googletest/include/gtest/gtest.h:2499:46 (main_common_test+0x296f087)
    #13 Envoy::TestRunner::RunTests(int, char**) /proc/self/cwd/./test/test_runner.h:57:12 (main_common_test+0x296eca4)
    #14 main /proc/self/cwd/test/main.cc:39:10 (main_common_test+0x296df4b)

  Previous write of size 8 at 0x00000455e1c8 by thread T5 (mutexes: write M636550213812770000, write M2072):
    #0 evsig_set_base_ /build/tmp/_bazel_bazel/b570b5ccd0454dc9af9f65ab1833764d/execroot/envoy/external/com_github_libevent_libevent/signal.c:123:13 (main_common_test+0x386e4d0)
    #1 event_base_loop /build/tmp/_bazel_bazel/b570b5ccd0454dc9af9f65ab1833764d/execroot/envoy/external/com_github_libevent_libevent/event.c:1901:3 (main_common_test+0x385e084)
    #2 Envoy::Event::DispatcherImpl::run(Envoy::Event::Dispatcher::RunType) /proc/self/cwd/source/common/event/dispatcher_impl.cc:171:3 (main_common_test+0x2a6a74e)
    #3 Envoy::Server::InstanceImpl::run() /proc/self/cwd/source/server/server.cc:475:16 (main_common_test+0x1fc0d52)
    #4 Envoy::MainCommonBase::run() /proc/self/cwd/source/exe/main_common.cc:114:14 (main_common_test+0x126363f)
    #5 Envoy::MainCommon::run() /proc/self/cwd/bazel-out/k8-dbg/bin/source/exe/_virtual_includes/envoy_main_common_lib/exe/main_common.h:86:29 (main_common_test+0x11f4c8f)
    #6 Envoy::AdminRequestTest::startEnvoy()::'lambda'()::operator()() const /proc/self/cwd/test/exe/main_common_test.cc:195:35 (main_common_test+0x124a350)
    #7 std::_Function_handler<void (), Envoy::AdminRequestTest::startEnvoy()::'lambda'()>::_M_invoke(std::_Any_data const&) /usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:316:2 (main_common_test+0x1249faa)
    #8 std::function<void ()>::operator()() const /usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:706:14 (main_common_test+0x13f990e)
    #9 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void ()>)::$_0::operator()(void*) const /proc/self/cwd/source/common/common/posix/thread_impl.cc:38:35 (main_common_test+0x38add98)
    #10 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void ()>)::$_0::__invoke(void*) /proc/self/cwd/source/common/common/posix/thread_impl.cc:37:33 (main_common_test+0x38add28)

  Location is global 'evsig_base' of size 8 at 0x00000455e1c8 (main_common_test+0x00000455e1c8)

evsig_base seems like it's correctly modeled as atomic, so I think we need to wait until the main server thread is running before performing the kill.
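
A minimal sketch of that ordering, assuming a hypothetical `FakeServer` stand-in rather than Envoy's real `MainCommon`: the server thread publishes a "started" notification through a `std::promise`, and the test thread waits on the matching future before it performs the kill. The names `FakeServer`, `waitUntilStarted`, `requestStop`, and `startThenKill` are invented for illustration, and `requestStop` only models the kill; it does not raise a real signal. In the actual test, the notification would presumably have to be posted from a point after the event loop has finished its signal setup.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the server under test; this is not Envoy code.
class FakeServer {
public:
  // Runs on the server thread: publish readiness, then block until stopped.
  void run() {
    // Assumption: in the real test this would fire only once the event loop
    // is running. Here it is simply the first thing run() does.
    started_.set_value();
    std::unique_lock<std::mutex> lock(mutex_);
    stopped_cv_.wait(lock, [this] { return stop_requested_; });
  }

  // Blocks the caller until run() has published readiness.
  void waitUntilStarted() { started_future_.wait(); }

  // Models the kill: asks run() to return.
  void requestStop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_requested_ = true;
    }
    stopped_cv_.notify_one();
  }

private:
  std::promise<void> started_;
  // Declared after started_, so the promise exists when this initializer runs.
  std::future<void> started_future_{started_.get_future()};
  std::mutex mutex_;
  std::condition_variable stopped_cv_;
  bool stop_requested_{false};
};

// Starts the server on its own thread and performs the kill only after the
// server thread has reported that it is running.
bool startThenKill() {
  FakeServer server;
  std::thread server_thread([&server] { server.run(); });
  server.waitUntilStarted(); // wait for the server thread first
  server.requestStop();      // only now perform the "kill"
  server_thread.join();
  return true;
}
```

Calling `startThenKill()` from a test body exercises the sequence. The promise/future pair gives a happens-before edge between the server thread's setup and the test thread's kill, which is the kind of ordering TSAN looks for.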
