Bug Template
Title: Envoy crashes when gRPC healthcheck service returns SERVICE_UNKNOWN
Description:
Any service that uses gRPC health checking with Envoy can crash Envoy by returning SERVICE_UNKNOWN from the Check method.
According to the spec, SERVICE_UNKNOWN should only be used by the Watch method. However, it is too dangerous for Envoy to crash because a single endpoint did not implement the Check method correctly.
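To make the failure mode concrete, here is a small model of it outside Envoy. This is only a sketch and not Envoy's code: it assumes the health checker maps each status onto a fixed set of expected values and asserts on anything else, which is what the "panic: not reached" message in the logs suggests. The enum values follow the grpc.health.v1 proto; the function names `describe_strict` and `is_healthy` are hypothetical.

```python
from enum import IntEnum


class ServingStatus(IntEnum):
    """Values of HealthCheckResponse.ServingStatus in grpc.health.v1."""
    UNKNOWN = 0
    SERVING = 1
    NOT_SERVING = 2
    SERVICE_UNKNOWN = 3  # per the spec, used only by the Watch method


def describe_strict(status: ServingStatus) -> str:
    """Hypothetical strict handler: asserts on any status it does not expect."""
    names = {
        ServingStatus.UNKNOWN: "unknown",
        ServingStatus.SERVING: "serving",
        ServingStatus.NOT_SERVING: "not_serving",
    }
    if status not in names:
        # Models the crash: an unexpected value from the peer aborts the process.
        raise AssertionError("panic: not reached")
    return names[status]


def is_healthy(raw_status: int) -> bool:
    """Defensive handler: only SERVING is healthy, and no input raises."""
    return raw_status == ServingStatus.SERVING
```

With the defensive variant, an endpoint that answers SERVICE_UNKNOWN (or any value the checker does not know) is simply marked unhealthy instead of taking the proxy down.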
Repro steps:
- Add a cluster with gRPC health check configuration (see config below)
- Make the healthcheck endpoint return HealthCheckResponse_SERVICE_UNKNOWN from the Check method
Admin and Stats Output:
service::default_priority::max_connections::1024
service::default_priority::max_pending_requests::1024
service::default_priority::max_requests::1024
service::default_priority::max_retries::3
service::high_priority::max_connections::1024
service::high_priority::max_pending_requests::1024
service::high_priority::max_requests::1024
service::high_priority::max_retries::3
service::added_via_api::false
service::172.27.0.3:3000::cx_active::0
service::172.27.0.3:3000::cx_connect_fail::0
service::172.27.0.3:3000::cx_total::0
service::172.27.0.3:3000::rq_active::0
service::172.27.0.3:3000::rq_error::0
service::172.27.0.3:3000::rq_success::0
service::172.27.0.3:3000::rq_timeout::0
service::172.27.0.3:3000::rq_total::0
service::172.27.0.3:3000::hostname::service1
service::172.27.0.3:3000::health_flags::/failed_active_hc
service::172.27.0.3:3000::weight::1
service::172.27.0.3:3000::region::
service::172.27.0.3:3000::zone::
service::172.27.0.3:3000::sub_zone::
service::172.27.0.3:3000::canary::false
service::172.27.0.3:3000::priority::0
service::172.27.0.3:3000::success_rate::-1.0
service::172.27.0.3:3000::local_origin_success_rate::-1.0
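For anyone checking the same output on their own setup, the following sketch pulls the hosts with non-healthy flags out of text in the format shown above. It assumes that fields are separated by `::`, that multiple flags are separated by `/`, and that a healthy host reports the literal value `healthy`; the function name `unhealthy_hosts` is hypothetical.

```python
from typing import Dict, List


def unhealthy_hosts(clusters_text: str) -> Dict[str, List[str]]:
    """Map "cluster/host" to its list of non-healthy flags."""
    result: Dict[str, List[str]] = {}
    for line in clusters_text.splitlines():
        # Per-host lines look like: cluster::host:port::key::value
        parts = line.strip().split("::")
        if len(parts) != 4 or parts[2] != "health_flags":
            continue
        cluster, host, _, value = parts
        # Assumption: flags are "/"-separated and "healthy" means no failure.
        flags = [f for f in value.split("/") if f and f != "healthy"]
        if flags:
            result[f"{cluster}/{host}"] = flags
    return result
```

Applied to the output above, this reports the single host 172.27.0.3:3000 in cluster `service` with the flag `failed_active_hc`.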
Config:
admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8000
node:
  cluster: playground
  id: localhost
static_resources:
  clusters:
  - name: service
    http2_protocol_options: {}
    type: STRICT_DNS
    connect_timeout: 30s
    health_checks:
    - healthy_threshold: 1
      grpc_health_check:
        service_name: service
      interval: 1s
      no_traffic_interval: 10s
      timeout: 10s
      unhealthy_threshold: 1
    dns_refresh_rate: "7200s"
    common_lb_config:
      healthy_panic_threshold:
        value: 33
    load_assignment:
      cluster_name: service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service1
                port_value: 3000
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    filter_chains:
    - filters:
      - config:
          access_log:
            config:
              path: /dev/stdout
            name: envoy.file_access_log
          codec_type: AUTO
          http_filters:
          - config: {}
            name: envoy.router
          common_http_protocol_options:
            idle_timeout: 30s
          route_config:
            name: playground
            virtual_hosts:
              domains:
              - '*'
              name: playground
              routes:
              - match:
                  prefix: /
                route:
                  cluster: service
          server_name: playground
          stat_prefix: playground
        name: envoy.http_connection_manager
    name: playground
Logs and Call Stack:
envoy_1 | [2020-04-17 07:35:57.809][1][debug][main] [source/server/server.cc:177] flushing stats
envoy_1 | [2020-04-17 07:35:57.810][1][debug][client] [source/common/http/codec_client.cc:34] [C1] connecting
envoy_1 | [2020-04-17 07:35:57.810][1][debug][connection] [source/common/network/connection_impl.cc:727] [C1] connecting to 172.27.0.2:3000
envoy_1 | [2020-04-17 07:35:57.810][1][debug][connection] [source/common/network/connection_impl.cc:736] [C1] connection in progress
envoy_1 | [2020-04-17 07:35:57.811][1][debug][http2] [source/common/http/http2/codec_impl.cc:970] [C1] updating connection-level initial window size to 268435456
envoy_1 | [2020-04-17 07:35:57.812][1][debug][connection] [source/common/network/connection_impl.cc:592] [C1] connected
envoy_1 | [2020-04-17 07:35:57.812][1][debug][client] [source/common/http/codec_client.cc:72] [C1] connected
envoy_1 | [2020-04-17 07:35:57.817][1][debug][client] [source/common/http/codec_client.cc:104] [C1] response complete
envoy_1 | [2020-04-17 07:35:57.817][1][critical][assert] [source/common/upstream/health_checker_impl.cc:798] panic: not reached
envoy_1 | [2020-04-17 07:35:57.817][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Aborted, suspect faulting address 0x1
envoy_1 | [2020-04-17 07:35:57.817][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
envoy_1 | [2020-04-17 07:35:57.817][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 3504d40f752eb5c20bc2883053547717bcb92fd8/1.14.1/Clean/RELEASE/BoringSSL
envoy_1 | [2020-04-17 07:35:57.818][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #0: [0x7fc0b991a3d0]
envoy_1 | [2020-04-17 07:35:57.830][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Upstream::GrpcHealthCheckerImpl::GrpcActiveHealthCheckSession::onRpcComplete() [0x5585b28683d8]
envoy_1 | [2020-04-17 07:35:57.841][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Upstream::GrpcHealthCheckerImpl::GrpcActiveHealthCheckSession::decodeTrailers() [0x5585b2868b85]
envoy_1 | [2020-04-17 07:35:57.850][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: Envoy::Http::ResponseDecoderWrapper::decodeTrailers() [0x5585b2801500]
envoy_1 | [2020-04-17 07:35:57.859][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: Envoy::Http::Http2::ConnectionImpl::onFrameReceived() [0x5585b29158a8]
envoy_1 | [2020-04-17 07:35:57.867][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: nghttp2_session_mem_recv [0x5585b2581de0]
envoy_1 | [2020-04-17 07:35:57.876][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: Envoy::Http::Http2::ConnectionImpl::dispatch() [0x5585b2914eca]
envoy_1 | [2020-04-17 07:35:57.884][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: Envoy::Http::CodecClient::onData() [0x5585b2871fd8]
envoy_1 | [2020-04-17 07:35:57.893][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: Envoy::Http::CodecClient::CodecReadFilter::onData() [0x5585b2872d2d]
envoy_1 | [2020-04-17 07:35:57.902][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: Envoy::Network::FilterManagerImpl::onContinueReading() [0x5585b269b603]
envoy_1 | [2020-04-17 07:35:57.910][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::Network::ConnectionImpl::onReadReady() [0x5585b2697745]
envoy_1 | [2020-04-17 07:35:57.919][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: Envoy::Network::ConnectionImpl::onFileEvent() [0x5585b2696a9d]
envoy_1 | [2020-04-17 07:35:57.927][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #12: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x5585b2691aa6]
envoy_1 | [2020-04-17 07:35:57.936][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #13: event_process_active_single_queue [0x5585b2ad0ecb]
envoy_1 | [2020-04-17 07:35:57.945][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #14: event_base_loop [0x5585b2acf75e]
envoy_1 | [2020-04-17 07:35:57.953][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #15: Envoy::Server::InstanceImpl::run() [0x5585b2621d89]
envoy_1 | [2020-04-17 07:35:57.962][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #16: Envoy::MainCommonBase::run() [0x5585b1b474f8]
envoy_1 | [2020-04-17 07:35:57.970][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #17: main [0x5585b1b46132]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #18: __libc_start_main [0x7fc0b9767c8d]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:92] Envoy version: 3504d40f752eb5c20bc2883053547717bcb92fd8/1.14.1/Clean/RELEASE/BoringSSL
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:98] #0: [0x7fc0b991a3d0]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #1: Envoy::Upstream::GrpcHealthCheckerImpl::GrpcActiveHealthCheckSession::onRpcComplete() [0x5585b28683d8]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #2: Envoy::Upstream::GrpcHealthCheckerImpl::GrpcActiveHealthCheckSession::decodeTrailers() [0x5585b2868b85]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #3: Envoy::Http::ResponseDecoderWrapper::decodeTrailers() [0x5585b2801500]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #4: Envoy::Http::Http2::ConnectionImpl::onFrameReceived() [0x5585b29158a8]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #5: nghttp2_session_mem_recv [0x5585b2581de0]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #6: Envoy::Http::Http2::ConnectionImpl::dispatch() [0x5585b2914eca]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #7: Envoy::Http::CodecClient::onData() [0x5585b2871fd8]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #8: Envoy::Http::CodecClient::CodecReadFilter::onData() [0x5585b2872d2d]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #9: Envoy::Network::FilterManagerImpl::onContinueReading() [0x5585b269b603]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #10: Envoy::Network::ConnectionImpl::onReadReady() [0x5585b2697745]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #11: Envoy::Network::ConnectionImpl::onFileEvent() [0x5585b2696a9d]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #12: Envoy::Event::FileEventImpl::assignEvents()::$_0::__invoke() [0x5585b2691aa6]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #13: event_process_active_single_queue [0x5585b2ad0ecb]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #14: event_base_loop [0x5585b2acf75e]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #15: Envoy::Server::InstanceImpl::run() [0x5585b2621d89]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #16: Envoy::MainCommonBase::run() [0x5585b1b474f8]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #17: main [0x5585b1b46132]
envoy_1 | [2020-04-17 07:35:57.971][1][critical][backtrace] [bazel-out/k8-opt/bin/source/server/_virtual_includes/backtrace_lib/server/backtrace.h:96] #18: __libc_start_main [0x7fc0b9767c8d]
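The two backtraces above are long and nearly identical, so a small helper can make them easier to compare. The sketch below extracts the frame number, symbol, and address from log lines in the format shown above; the regular expression is inferred from these lines only, and the name `parse_frames` is hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches "#<n>: <symbol> [0x<addr>]" at the end of a line; the symbol is
# optional because frame #0 above carries only an address.
FRAME_RE = re.compile(r"#(\d+): (?:(.+) )?\[(0x[0-9a-f]+)\]\s*$")


def parse_frames(log_text: str) -> List[Tuple[int, Optional[str], str]]:
    """Return (frame_number, symbol_or_None, address) for each frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            number, symbol, address = match.groups()
            frames.append((int(number), symbol, address))
    return frames
```

Splitting the result wherever the frame number returns to 0 gives one list per backtrace, which can then be compared directly.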