This repository was archived by the owner on Sep 21, 2023. It is now read-only.

Implement more efficient output tuning parameters to manage throughput #28

@nimarezainia

Description

Beats have many knobs that let the user modify output-related parameters to increase throughput. These parameters are convoluted and sometimes contradict one another. With the new shipper design we have the opportunity to simplify them and offer users more meaningful parameters.

Performance Tuning Proposal

  1. Rename bulk_max_size to maximum_batch_size, which is more meaningful. maximum_batch_size is the total batch size in bytes.
  2. Allow the user to modify maximum_batch_size in the UI, specified in bytes rather than events.
    a. Bytes are easier to reason about
    b. Bytes are also easier to map to the data seen on the wire
    c. On the Elasticsearch ingest side, the maximum document size is configured in bytes
  3. Introduce a new variable, output_queue_flush_timeout.
    a. When the timeout expires, the output queue is flushed and its data is written to the output
    b. Users can lower this timeout to reduce the delay before collected data is sent

In summary, tuning the output will now involve two variables: maximum_batch_size and output_queue_flush_timeout.
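
To make the interaction between the two proposed variables concrete, here is a minimal sketch of a queue that flushes either when the byte budget is reached or when the timeout expires. It is not the shipper's implementation. The class name OutputQueue, its methods, the use of seconds for the timeout, and the choice to start the timer at the arrival of the oldest queued event are all assumptions made for illustration; only the two parameter names come from the proposal.

```python
from typing import List, Optional


class OutputQueue:
    """Toy output queue (hypothetical) driven by the two proposed settings."""

    def __init__(self, maximum_batch_size: int, output_queue_flush_timeout: float):
        self.maximum_batch_size = maximum_batch_size                  # bytes
        self.output_queue_flush_timeout = output_queue_flush_timeout  # seconds (assumed unit)
        self._events: List[bytes] = []
        self._size = 0                        # bytes currently queued
        self._oldest: Optional[float] = None  # arrival time of the oldest queued event
        self.flushed: List[List[bytes]] = []  # batches "written" to the output

    def add(self, event: bytes, now: float) -> None:
        # Flush first if this event would push the batch over the byte budget.
        if self._events and self._size + len(event) > self.maximum_batch_size:
            self.flush()
        if not self._events:
            self._oldest = now
        self._events.append(event)
        self._size += len(event)
        # A full batch (or a single oversized event) is sent immediately.
        if self._size >= self.maximum_batch_size:
            self.flush()

    def tick(self, now: float) -> None:
        # Flush a partial batch once its oldest event has waited for the timeout.
        if self._events and now - self._oldest >= self.output_queue_flush_timeout:
            self.flush()

    def flush(self) -> None:
        if self._events:
            self.flushed.append(self._events)
        self._events, self._size, self._oldest = [], 0, None


queue = OutputQueue(maximum_batch_size=10, output_queue_flush_timeout=5.0)
queue.add(b"aaaa", now=0.0)
queue.add(b"bbbb", now=1.0)
queue.add(b"cccc", now=2.0)  # would exceed 10 bytes, so the first two events are flushed
queue.tick(now=7.0)          # 5 seconds after b"cccc" arrived, the partial batch is flushed
print(queue.flushed)
```

In this sketch a larger maximum_batch_size favors throughput (fewer, bigger writes), while a smaller output_queue_flush_timeout bounds how long an event can sit in a partially filled batch.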
