-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Closed
Labels
Description
We've noticed a segfault in our envoys when updating a cluster via CDS from using plain TCP to TLS with validation context provided via SDS.
Attached are the initial version, the update that caused the segfault and the printed stack, the only thing added is the transport socket configuration pointing to SDS.
previous_version.txt
updated_version.txt
stack.txt
Some comments:
- Crashing envoys are registered to updates of this cluster but not all are actively handling traffic for this cluster
- Not all our envoys subscribed to this cluster crash, haven't found a relation yet.
- We've had this happen in envoys running 1.14.1 and 1.15 from master, but we've seen similar issues since envoy 1.12 I recall.
- We're using DeltaXds with ADS and everything in v3.
- The SDS resource named "validation_context" is used by multiple clusters and is actually loaded statically by envoy during bootstrap
- The logs are from an envoy running 1.15 built from master, our logging system looses order, so the logs could be somehow unordered.
- Envoy was not built with debugging symbols so we cannot translate to lines, but we'll try to reproduce this.
Reactions are currently unavailable