Beats pipeline doesn't respect configured batch sizes on startup under agent #34703

@faec

When an output worker is created, it specifies the maximum size of the event batches it should receive from the pipeline. This value is ultimately propagated back to eventConsumer, the routine that assembles batches for the output workers, which uses it for its queue requests. Most outputs accept this batch size as a configuration parameter, e.g. bulk_max_size.

Under Elastic Agent, Beats startup is more complicated because Agent sends the beat configuration in multiple stages, and there will generally not be an output on the first initialization. Currently, this leads to eventConsumer receiving four separate calls to update the batch size (in each beat) -- three setting it to zero, and one setting it to the actual value requested by the output.

While the final value is correct, the inputs may have already started up by that point. Since a value of 0 indicates to the queue that it should send as many events as are available, this can cause the pipeline to be primed with batches containing thousands of events before the output is initialized, even if the output itself requests a relatively small value (e.g. the shipper output defaults to a batch size of 50).
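To make the failure mode concrete, here is a minimal sketch of a queue whose read call treats 0 as "everything available". The type toyQueue and its methods are hypothetical stand-ins for illustration only; they are not the real Beats queue API.

```go
package main

// toyQueue is a hypothetical stand-in for the pipeline queue.
// It is NOT the real Beats queue interface.
type toyQueue struct {
	events []int
}

// newToyQueue returns a queue pre-filled with n events, as if inputs
// had been running for a while before the output was ready.
func newToyQueue(n int) *toyQueue {
	q := &toyQueue{}
	for i := 0; i < n; i++ {
		q.events = append(q.events, i)
	}
	return q
}

// get returns up to n events. As described above, n <= 0 is treated
// as "send as many events as are available".
func (q *toyQueue) get(n int) []int {
	if n <= 0 || n > len(q.events) {
		n = len(q.events)
	}
	batch := q.events[:n]
	q.events = q.events[n:]
	return batch
}
```

With 5000 queued events, get(0) hands back a single batch of 5000, while get(50) returns the 50-event batch an output with that limit would expect.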

This is notably a problem for the Elasticsearch and Shipper outputs (and possibly others), which can have upstream caps on batch size, causing them either to drop the entire batch or to enter a retry loop that stalls the ingestion pipeline (#29778, #34695).

We need to correct the initialization process so eventConsumer doesn't begin creating batches until a valid output is configured; this will still allow incoming data to accumulate in the queue, but no explicit batches should be created until we know what the output workers can accept.
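One possible shape for this fix, sketched under simplifying assumptions: the consumer keeps accepting events but refuses to assemble a batch until an output has registered. The type consumerState and its methods are hypothetical names used for illustration; the real eventConsumer is concurrent and structured differently.

```go
package main

// consumerState is a hypothetical, simplified model of what the batch
// assembler tracks. It is not the real eventConsumer type.
type consumerState struct {
	hasOutput bool     // true once a valid output has been configured
	batchSize int      // max batch size requested by the output
	pending   []string // events waiting in the (toy) queue
}

// enqueue always accepts events, so data can accumulate in the queue
// before any output exists.
func (c *consumerState) enqueue(ev string) {
	c.pending = append(c.pending, ev)
}

// clearOutput models a startup update that arrives with no output.
func (c *consumerState) clearOutput() {
	c.hasOutput = false
	c.batchSize = 0
}

// setOutput records that a valid output exists and what it accepts.
func (c *consumerState) setOutput(batchSize int) {
	c.hasOutput = true
	c.batchSize = batchSize
}

// nextBatch returns nil until an output is configured. After that it
// honors batchSize, treating 0 as "no limit" only for a real output.
func (c *consumerState) nextBatch() []string {
	if !c.hasOutput || len(c.pending) == 0 {
		return nil
	}
	n := c.batchSize
	if n <= 0 || n > len(c.pending) {
		n = len(c.pending)
	}
	batch := c.pending[:n]
	c.pending = c.pending[n:]
	return batch
}
```

The key design choice in this sketch is that "no output yet" is tracked separately from the batch size, so a size of 0 received during startup can no longer be mistaken for a request for unlimited batches.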

(This issue currently causes repeatable pipeline deadlocks for me when targeting the shipper.)
