Skip to content

Make ProcessImpl::run() more robust #317

@oschaaf

Description

@oschaaf

Observed a CI run associated to #316

[ RUN      ] ClientTest.AutoConcurrencyRun
[14:11:50.442223][31224][C] [source/client/process_impl.cc:124] assert failure: shutdown_. Details: shutdown not called before destruction.
[14:11:50.448097][31224][C] [bazel-out/k8-dbg/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:83] Caught Aborted, suspect faulting address 0x79f8
[14:11:50.448206][31224][C] [bazel-out/k8-dbg/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:70] Backtrace (use tools/stack_decode.py to get line numbers):
[14:11:50.449640][31224][C] [bazel-out/k8-dbg/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:71] Envoy version: 0/1.14.0-dev/redacted/DEBUG/BoringSSL
[14:11:50.670944][31224][C] [bazel-out/k8-dbg/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:75] #0: Envoy::SignalAction::sigHandler() [0x683abe7]
[14:11:50.866130][31224][C] [bazel-out/k8-dbg/bin/external/envoy/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:75] #1: __tsan::CallUserSignalHandler() [0x2789350]
external/bazel_tools/tools/test/test-setup.sh: line 310: 31224 Aborted                 "${TEST_PATH}" "$@" 2>&1

Full log: https://circleci.com/api/v1.1/project/github/envoyproxy/nighthawk/11758/output/102/0?file=true&allocation-id=5e679b96b55f4d48eecdfc16-0-build%2F5ECCDF5B

The assert may or may not be related to the Envoy update in #316, it has been first observed there.

The assert that gets triggered resides here:

RELEASE_ASSERT(shutdown_, "shutdown not called before destruction.");

However, after scrutinizing the code over at

shutdown_ = false;
I think it the most likely thing that happened is that there was an exception in ProcessImpl::run() in somewhere after shutdown_ gets set to false. This code should be revisited to properly unwind in the (unlikely) case of partial initialization. At the very least it should log a fatal message when an exception gets thrown halfway, so that the message associated tot he exception doesn't get squelched by the assert above.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions