Description:
We are investigating some odd panic routing metrics that Envoy is emitting. As far as we can tell, nothing is actually wrong with our request pipelines, but we want to track this down because it is causing spurious alerts.
We can clearly correlate cluster membership changes with panic routing.
Our stats show that every attempted health check passed; there are no failures.
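For anyone trying to reproduce the comparison, here is a minimal sketch of how the relevant counters could be pulled out of the text returned by the admin `/stats` endpoint. The stat names (`health_check.attempt`, `health_check.success`, `health_check.failure`, `lb_healthy_panic`, `membership_change`, `membership_healthy`, `membership_total`) are standard Envoy cluster stats; the sample values and the helper names `parse_stats` and `cluster_summary` are made up for illustration.

```python
def parse_stats(text):
    """Parse 'name: value' lines, keeping only integer-valued stats."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(": ")
        if not sep:
            continue  # not a 'name: value' line
        value = value.strip()
        if value.isdigit():  # skips histogram lines and other non-integer values
            stats[name.strip()] = int(value)
    return stats


def cluster_summary(stats, cluster):
    """Collect the counters relevant to panic routing for one cluster."""
    prefix = f"cluster.{cluster}."
    keys = {
        "hc_attempt": "health_check.attempt",
        "hc_success": "health_check.success",
        "hc_failure": "health_check.failure",
        "lb_healthy_panic": "lb_healthy_panic",
        "membership_change": "membership_change",
        "membership_healthy": "membership_healthy",
        "membership_total": "membership_total",
    }
    return {short: stats.get(prefix + full, 0) for short, full in keys.items()}


# Hypothetical sample; real values come from the admin /stats endpoint.
SAMPLE = """\
cluster.internal_cluster.health_check.attempt: 120
cluster.internal_cluster.health_check.success: 120
cluster.internal_cluster.health_check.failure: 0
cluster.internal_cluster.lb_healthy_panic: 7
cluster.internal_cluster.membership_change: 3
cluster.internal_cluster.membership_healthy: 4
cluster.internal_cluster.membership_total: 4
"""

summary = cluster_summary(parse_stats(SAMPLE), "internal_cluster")
print(summary)
```

Sampling this around a membership change would show whether `lb_healthy_panic` increments while `health_check.failure` stays at zero.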
We added health check event logging and can see that we get the following event whenever the cluster membership changes:
add_healthy_event: {
  first_check: true
}
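To line these events up against membership changes, something like the following sketch could count first-check events per host. It assumes the event log contains one JSON object per line with `cluster_name`, `host.socket_address`, and `add_healthy_event.first_check` fields; that layout, the sample lines, and the helper name `first_check_events` are assumptions, not taken from our actual log.

```python
import json
from collections import Counter


def first_check_events(lines):
    """Count add_healthy_event entries with first_check=true, keyed by host."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        healthy = event.get("add_healthy_event")
        if isinstance(healthy, dict) and healthy.get("first_check"):
            addr = event.get("host", {}).get("socket_address", {})
            key = (event.get("cluster_name"), addr.get("address"), addr.get("port_value"))
            counts[key] += 1
    return counts


# Hypothetical log lines; addresses and timestamps are invented.
SAMPLE_LINES = [
    '{"health_checker_type":"HTTP","host":{"socket_address":{"address":"10.0.0.1","port_value":10652}},'
    '"cluster_name":"internal_cluster","add_healthy_event":{"first_check":true},"timestamp":"2019-04-18T21:13:34.000Z"}',
    '{"health_checker_type":"HTTP","host":{"socket_address":{"address":"10.0.0.2","port_value":10652}},'
    '"cluster_name":"internal_cluster","add_healthy_event":{"first_check":true},"timestamp":"2019-04-18T21:13:34.100Z"}',
    '{"health_checker_type":"HTTP","host":{"socket_address":{"address":"10.0.0.1","port_value":10652}},'
    '"cluster_name":"internal_cluster","eject_unhealthy_event":{"failure_type":"ACTIVE"},"timestamp":"2019-04-18T21:20:00.000Z"}',
]

counts = first_check_events(SAMPLE_LINES)
for host, n in sorted(counts.items()):
    print(host, n)

# Against a real log file, the same function takes an open file handle:
# with open("/var/log/envoy_health_event.log") as f:
#     counts = first_check_events(f)
```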
I am opening this issue to ask whether anyone has insight into what could be causing this, or suggestions for how to debug it.
Our relevant cluster configuration looks like:
"dynamic_active_clusters": [
  {
    "version_info": "d63a8ee91ca7f647e623c3c5113a61d62be6fc23e09dbd2b73a7dc85a2e50e37",
    "cluster": {
      "name": "internal_cluster",
      "type": "STRICT_DNS",
      "connect_timeout": "2s",
      "health_checks": [
        {
          "timeout": "3s",
          "interval": "4s",
          "unhealthy_threshold": 2,
          "healthy_threshold": 2,
          "http_health_check": {
            "path": "/healthcheck"
          },
          "no_traffic_interval": "4s",
          "event_log_path": "/var/log/envoy_health_event.log"
        }
      ],
      "http2_protocol_options": {},
      "upstream_connection_options": {
        "tcp_keepalive": {
          "keepalive_time": 120
        }
      },
      "load_assignment": {
        "cluster_name": "apiori",
        "endpoints": [
          {
            "lb_endpoints": [
              {
                "endpoint": {
                  "address": {
                    "socket_address": {
                      "address": "def.dns.entry",
                      "port_value": 10652
                    }
                  }
                },
                "load_balancing_weight": 50
              },
              {
                "endpoint": {
                  "address": {
                    "socket_address": {
                      "address": "abc.dns.entry",
                      "port_value": 10652
                    }
                  }
                },
                "load_balancing_weight": 50
              }
            ]
          },
          {
            "lb_endpoints": [
              {
                "endpoint": {
                  "address": {
                    "socket_address": {
                      "address": "xyz.dns.entry",
                      "port_value": 10652
                    }
                  }
                },
                "load_balancing_weight": 100
              }
            ],
            "priority": 1
          },
          {
            "lb_endpoints": [
              {
                "endpoint": {
                  "address": {
                    "socket_address": {
                      "address": "xyz.dns.entry",
                      "port_value": 10652
                    }
                  }
                },
                "load_balancing_weight": 100
              }
            ],
            "priority": 2
          }
        ],
        "policy": {
          "overprovisioning_factor": 198
        }
      }
    },
    "last_updated": "2019-04-18T21:13:33.619Z"
  },
When we change clusters via a connected CDS, we update the cluster's LB endpoints to point to different DNS entries; everything else stays the same.
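As a rough way to reason about what such an endpoint swap might look like to the load balancer, here is a simplified model of a single priority level. It assumes Envoy's default panic threshold of 50% (our config does not set one), uses the `overprovisioning_factor` of 198 from the configuration, counts hosts rather than weights, and treats hosts that have not yet passed a first health check as unhealthy. That last point is a hypothesis to test, not a confirmed diagnosis, and the function names are invented for this sketch rather than taken from Envoy.

```python
def healthy_percent(healthy, total):
    """Percentage of hosts in a priority level that are currently healthy."""
    return 100.0 * healthy / total if total else 0.0


def effective_health(healthy, total, overprovisioning_factor=198):
    """Health of a priority level after applying the overprovisioning factor, capped at 100."""
    return min(100.0, healthy_percent(healthy, total) * overprovisioning_factor / 100.0)


def in_panic(healthy, total, panic_threshold=50.0):
    """Simplified check: panic when the healthy percentage drops below the threshold."""
    return healthy_percent(healthy, total) < panic_threshold


# Hypothetical timeline for priority 0, which has two hosts in our config.
scenarios = [
    ("steady state", 2, 2),
    ("endpoints swapped, no first check yet", 0, 2),
    ("one new host passed its first check", 1, 2),
    ("both new hosts passed", 2, 2),
]

for label, healthy, total in scenarios:
    print(
        f"{label}: healthy={healthy_percent(healthy, total):.0f}% "
        f"effective={effective_health(healthy, total):.0f}% "
        f"panic={in_panic(healthy, total)}"
    )
```

If this model is anywhere near right, a short window between the endpoint swap and the first successful check would be enough to increment the panic counter without any health check ever failing.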
We are on envoy version: envoy 0/1.9.0-dev//RELEASE live 1394162 3549119 1
Thanks!


