Improve or Clarify Backpressure on OTel-Grpc-Sources #4119
Description
Is your feature request related to a problem? Please describe.
Ingestion of events via OpenTelemetry over gRPC can fail because of a full buffer or an open circuit breaker.
In that case, an error response is sent. When testing with the OpenTelemetry Collector, this causes the collector to drop data.
The reason appears to be that the OTLP/gRPC throttling specification is not applied correctly: the DataPrepper gRPC response needs to contain a proper RetryInfo.
Describe the solution you'd like
DataPrepper should implement proper OTLP/gRPC throttling when the ingress buffers are full or the circuit breaker is open.
Describe alternatives you've considered (Optional)
Alternatively, DataPrepper could document that requests are rejected as non-retryable when buffers are full or a circuit breaker is open. This could apply to both cases or to just one of them, e.g. an open circuit breaker.
Additional context
Currently, when the circuit breaker is open, the error message from the CircuitBreakingBuffer is included in the DataPrepper response.
This is handled by the GrpcRequestExceptionHandler, which generates a RESOURCE_EXHAUSTED response status. According to https://opentelemetry.io/docs/specs/otlp/#failures, this status is only retryable if a RetryInfo is present. That is currently not the case, so the OpenTelemetry Collector drops the events and does not retry.
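To illustrate the rule from the linked specification, here is a minimal, simplified model of the retry decision an OTLP/gRPC client makes. It is not the OpenTelemetry Collector's or DataPrepper's actual code: the class `OtlpRetryDecision`, the reduced `Code` enum, and the use of `Optional<Duration>` to stand in for a RetryInfo detail are all assumptions made for this sketch.

```java
import java.time.Duration;
import java.util.EnumSet;
import java.util.Optional;
import java.util.Set;

/** Hypothetical, simplified model of an OTLP/gRPC client's retry decision. */
final class OtlpRetryDecision {

    /** Small subset of gRPC status codes, enough for this illustration. */
    enum Code { UNAVAILABLE, RESOURCE_EXHAUSTED, INVALID_ARGUMENT, INTERNAL }

    // UNAVAILABLE is one of the codes the OTLP specification lists as retryable.
    private static final Set<Code> ALWAYS_RETRYABLE = EnumSet.of(Code.UNAVAILABLE);

    /**
     * @param code       status code returned by the server
     * @param retryDelay stand-in for a RetryInfo detail; empty if the server sent none
     * @return true if the client should retry the request
     */
    static boolean isRetryable(Code code, Optional<Duration> retryDelay) {
        if (code == Code.RESOURCE_EXHAUSTED) {
            // Retryable only when the server signals that recovery is possible.
            return retryDelay.isPresent();
        }
        return ALWAYS_RETRYABLE.contains(code);
    }

    public static void main(String[] args) {
        // Behavior described in this issue: RESOURCE_EXHAUSTED without RetryInfo.
        System.out.println(isRetryable(Code.RESOURCE_EXHAUSTED, Optional.empty()));
        // Requested behavior: RESOURCE_EXHAUSTED with a retry delay attached.
        System.out.println(
                isRetryable(Code.RESOURCE_EXHAUSTED, Optional.of(Duration.ofSeconds(5))));
    }
}
```

In this model, the first call corresponds to the situation described above, where the client treats the rejection as permanent and drops the data. The second call corresponds to the requested behavior, where the attached delay tells the client to back off and retry.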