Description
From the documentation on retries, the x-envoy-retry-on header can be configured to retry on HTTP status codes such as 5XX and 4XX. This works great for HTTP services. With gRPC, however, a properly running server returns HTTP 200 OK for every response, and the actual error code is carried inside the gRPC response.
Would it be feasible to add a retry policy, driven by an x-envoy-retry-grpc-on header, that respects a list of gRPC error codes? A few codes could be deemed retriable (CANCELLED, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED), but I am hesitant to group them together, since that would make assumptions about implementation details within a service. That is why I think an explicit list may work best. I'm open to ideas here.
Sample Header
x-envoy-retry-grpc-on: CANCELLED, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED
The existing x-envoy-retry-on header would then be configured as a fallback for cases where a service is unreachable and the HTTP status code is the more meaningful signal.
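To make the idea concrete, here is a rough Python sketch of the matching logic I have in mind. It is not Envoy code: the function names `parse_retry_grpc_on` and `should_retry` are hypothetical, and I am assuming the proxy can read the numeric gRPC status from the response (for example from the `grpc-status` trailer). Only the numeric values in the table come from the gRPC status code definitions.

```python
from typing import Optional, Set

# Standard gRPC status names and their numeric codes.
GRPC_STATUS = {
    "OK": 0, "CANCELLED": 1, "UNKNOWN": 2, "INVALID_ARGUMENT": 3,
    "DEADLINE_EXCEEDED": 4, "NOT_FOUND": 5, "ALREADY_EXISTS": 6,
    "PERMISSION_DENIED": 7, "RESOURCE_EXHAUSTED": 8,
    "FAILED_PRECONDITION": 9, "ABORTED": 10, "OUT_OF_RANGE": 11,
    "UNIMPLEMENTED": 12, "INTERNAL": 13, "UNAVAILABLE": 14,
    "DATA_LOSS": 15, "UNAUTHENTICATED": 16,
}


def parse_retry_grpc_on(header_value: str) -> Set[int]:
    """Parse the proposed x-envoy-retry-grpc-on value into numeric codes.

    Hypothetical helper: unknown names are ignored rather than rejected.
    """
    codes = set()
    for name in header_value.split(","):
        name = name.strip().upper()
        if name in GRPC_STATUS:
            codes.add(GRPC_STATUS[name])
    return codes


def should_retry(grpc_status: Optional[str], retry_codes: Set[int]) -> bool:
    """Return True if the response's gRPC status is in the retriable set.

    `grpc_status` is the raw status string from the response (assumed to
    come from the grpc-status trailer); None means it was not present.
    """
    if grpc_status is None:
        return False
    try:
        code = int(grpc_status)
    except ValueError:
        return False
    return code in retry_codes


retry_codes = parse_retry_grpc_on(
    "CANCELLED, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED"
)
```

With the sample header above, a response carrying `grpc-status: 4` (DEADLINE_EXCEEDED) would be retried, while `grpc-status: 5` (NOT_FOUND) or a response with no gRPC status would not; the latter is where the existing HTTP-based retry header would take over.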