
Improve or Clarify Backpressure on OTel-Grpc-Sources #4119

@KarstenSchnitter


Is your feature request related to a problem? Please describe.
Ingestion of events via OpenTelemetry over gRPC can fail due to a full buffer or an open circuit breaker.
In that case, an error response is sent. When testing with the OpenTelemetry Collector, this causes the collector to drop data.
The apparent reason is that the OTLP/gRPC Throttling specification is not applied properly: the DataPrepper gRPC response needs to contain a proper RetryInfo.

Describe the solution you'd like
DataPrepper should implement proper OTLP/gRPC throttling when the ingress buffers are full or the circuit breaker is open.
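A minimal sketch of the requested behavior, written in Python for brevity and not taken from the DataPrepper code base. The exception classes, the `ErrorResponse` type, and the default delay of one second are all hypothetical; the only point is that overload conditions carry a retry delay (the stand-in for a RetryInfo) while other errors do not.

```python
from dataclasses import dataclass
from typing import Optional


class BufferFullError(Exception):
    """Hypothetical stand-in for a full ingress buffer."""


class CircuitBreakerOpenError(Exception):
    """Hypothetical stand-in for an open circuit breaker."""


@dataclass(frozen=True)
class ErrorResponse:
    code: str                             # gRPC status code name
    message: str
    retry_delay_seconds: Optional[float]  # None models "no RetryInfo attached"


def to_error_response(error: Exception,
                      retry_delay_seconds: float = 1.0) -> ErrorResponse:
    """Map an ingestion error to a response; the delay value is an assumption."""
    if isinstance(error, (BufferFullError, CircuitBreakerOpenError)):
        # Overload is temporary: tell the client when it may try again.
        return ErrorResponse("RESOURCE_EXHAUSTED", str(error), retry_delay_seconds)
    # Anything else is reported without a retry hint.
    return ErrorResponse("INTERNAL", str(error), None)
```

In a real gRPC server the delay would be carried as a `google.rpc.RetryInfo` detail on the status rather than as a plain field.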

Describe alternatives you've considered (Optional)
DataPrepper can document that requests are rejected as non-retryable when buffers are full or the circuit breaker is open. This could apply to both cases or to just one of them, e.g. the open circuit breaker.

Additional context
Currently, when the circuit breaker is open, the error message from the CircuitBreakingBuffer is included in the DataPrepper response.
This is handled by the GrpcRequestExceptionHandler, which generates a RESOURCE_EXHAUSTED response status. According to https://opentelemetry.io/docs/specs/otlp/#failures, this status is only retryable if a RetryInfo is present. This is currently not the case, so the OpenTelemetry Collector drops the events and does not retry.
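To illustrate why the collector drops the data, here is a rough model of the client-side decision as I read the linked OTLP failures section. It is a Python sketch, not the collector's actual code; `GrpcFailure` and `should_retry` are hypothetical names, and the RetryInfo is modeled as an optional delay.

```python
from dataclasses import dataclass
from typing import Optional

# Status codes the OTLP specification lists as retryable without conditions.
ALWAYS_RETRYABLE = frozenset({
    "CANCELLED", "DEADLINE_EXCEEDED", "ABORTED",
    "OUT_OF_RANGE", "UNAVAILABLE", "DATA_LOSS",
})


@dataclass(frozen=True)
class GrpcFailure:
    code: str                                    # gRPC status code name
    retry_delay_seconds: Optional[float] = None  # None models "no RetryInfo"


def should_retry(failure: GrpcFailure) -> bool:
    """Return True if an OTLP client should retry the failed export."""
    if failure.code in ALWAYS_RETRYABLE:
        return True
    if failure.code == "RESOURCE_EXHAUSTED":
        # Retryable only when the server signals recovery via RetryInfo.
        return failure.retry_delay_seconds is not None
    return False


# A RESOURCE_EXHAUSTED status without RetryInfo is not retried.
print(should_retry(GrpcFailure("RESOURCE_EXHAUSTED")))       # False
print(should_retry(GrpcFailure("RESOURCE_EXHAUSTED", 5.0)))  # True
```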

Metadata

Assignees: No one assigned
Labels: bug (Something isn't working), question (Further information is requested)
Status: Done