When event batches targeting the shipper exceed the RPC limit (currently 4MB), the shipper output drops all the events with an error similar to the following:
```
{"log.level":"error","@timestamp":"2023-02-28T15:52:14.521Z","message":"failed to publish events: failed to publish the batch to the shipper, none of the 2048 events were accepted: rpc error: code = ResourceExhausted desc = grpc: received message larger than max (4692625 vs. 4194304)","component":{"binary":"filebeat","dataset":"elastic_agent.filebeat","id":"filestream-monitoring","type":"filestream"},"log":{"source":"filestream-monitoring"},"log.origin":{"file.line":176,"file.name":"pipeline/client_worker.go"},"service.name":"filebeat","ecs.version":"1.6.0","log.logger":"publisher_pipeline_output","ecs.version":"1.6.0"}
```
There are short-term mitigations (increasing the RPC size limit; decreasing the shipper output's batch size), but both approaches can still drop data unpredictably when individual events are large. As long as each individual event fits within the size limit, we should handle this error by splitting the batch and retrying, rather than permanently dropping its contents.
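A minimal sketch of the proposed behavior, not the actual shipper implementation: on a size-limit failure, halve the batch and retry each half recursively, so only events that individually exceed the limit are ever dropped. The `publish` stub, the event representation (encoded sizes as `int`s), and the `maxRPCSize` constant are all assumptions for illustration; the real output would inspect the gRPC status code (`codes.ResourceExhausted`) from the shipper RPC.

```go
package main

import "fmt"

// maxRPCSize mirrors the default 4MB gRPC message limit (hypothetical
// constant here; the real limit is configured on the shipper's gRPC server).
const maxRPCSize = 4 * 1024 * 1024

// publish is a stand-in for the real shipper RPC call: it fails when the
// batch's total encoded size exceeds the limit, mimicking the
// ResourceExhausted error in the log above.
func publish(eventSizes []int) error {
	total := 0
	for _, size := range eventSizes {
		total += size
	}
	if total > maxRPCSize {
		return fmt.Errorf("grpc: received message larger than max (%d vs. %d)", total, maxRPCSize)
	}
	return nil
}

// publishSplitting retries by recursively halving the batch when the RPC
// limit is exceeded, instead of dropping every event in the batch. Only a
// single event that is itself over the limit cannot be split and is dropped.
func publishSplitting(eventSizes []int) (published, dropped int) {
	if len(eventSizes) == 0 {
		return 0, 0
	}
	if err := publish(eventSizes); err == nil {
		return len(eventSizes), 0
	}
	if len(eventSizes) == 1 {
		// A lone oversized event cannot be split further; drop it.
		return 0, 1
	}
	mid := len(eventSizes) / 2
	p1, d1 := publishSplitting(eventSizes[:mid])
	p2, d2 := publishSplitting(eventSizes[mid:])
	return p1 + p2, d1 + d2
}
```

With this approach, a 2048-event batch like the one in the log would be retried in progressively smaller pieces until every piece fits, instead of all 2048 events being rejected together.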