Skip to content

Update broker/pipeline setup#4650

Merged
ruflin merged 1 commit intoelastic:masterfrom
urso:pipeline/config
Jul 12, 2017
Merged

Update broker/pipeline setup#4650
ruflin merged 1 commit intoelastic:masterfrom
urso:pipeline/config

Conversation

@urso
Copy link
Copy Markdown

@urso urso commented Jul 11, 2017

  • remove settings: filebeat.spooler, filebeat.publish_async, filebeat.idle_timeout, queue_size, bulk_queue_size
  • introduce new setting: queue.<queue_type>
  • expose memory queue (type = 'mem') settings: events, flush.min_events, flush.timeout

Note: docs will be updated in a separate PR.

@urso urso added the in progress Pull request is currently in progress. label Jul 11, 2017
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit strange that we call it here queue and in the code broker. This will need some docs and changelog entry. But can happen in a follow up PR. Does this have to be in this PR?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the user-visible settings. I decided to rename the broker package to queue, but first I want to get the changes to the settings in and documented, so we won't introduce BC-breaking changes in case package updates PR gets merged late.

@urso urso force-pushed the pipeline/config branch 2 times, most recently from fe27bb3 to cc174ea Compare July 11, 2017 12:09
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/send/sent/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to be sent" is correct

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when this is reached? Flush happens? Thinking if there should be flush.min, flush.max and flush.timeout or similar?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics is a little different from old spooler here. The queue/broker does not flush its buffers. By default it basically stream the events right through to the outputs (if output takes somewhat longer it will be presented with all events available - up to bulk_max_size).

By configuring min_events, the queue will block the output until:

  • at least min events is reached
  • timeout is trigger

The min events is more of a hint, as the output might still receive a smaller buffer.

Copy link
Copy Markdown
Author

@urso urso Jul 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The events setting is the maximum number of events the queue can hold. It's independent of min_events and flush timer. Only min_events must be < events.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is a little weird here... as it's more like 'block-processing' settings...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@urso what happens when events is reached? I assume it will block from adding more events? In case of metricbeat assuming sending events is slower then generating events but still all events are acked. What will happen?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once events is reached, the queue will block (no more space). Upon ACK of N events, another N events can be stored in the queue (N producers will be unblocked).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As quickly discussed, later we should probably rename broker package to queue to make the relation to the config more obvious.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the 3 params could be a config struct for borkers.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll open a PR to delete this block of code from Winlogbeat so that queue isn't rejected.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hehe, thanks.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR is adjusting winlogbeat.

@urso urso force-pushed the pipeline/config branch 3 times, most recently from 57756c6 to b3f1b82 Compare July 12, 2017 02:19
@urso urso changed the title [WIP] update broker/pipeline setup Update broker/pipeline setup Jul 12, 2017
@urso urso force-pushed the pipeline/config branch from b3f1b82 to 61f5b7e Compare July 12, 2017 03:00
@urso urso mentioned this pull request Jul 12, 2017
22 tasks
@urso urso added needs_docs review and removed in progress Pull request is currently in progress. labels Jul 12, 2017
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a message to the 5.6 release that these values will change?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, we should probably have a message in 5.6 or even better, not let the beat start in 6.0 if these values are set?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we changed the default to 5? Should probably be mentioned in the changelog?

Copy link
Copy Markdown
Author

@urso urso Jul 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • pipeline can operate complete async -> take advantage of this fact
  • with default settings we have a high chance to publish very small batches (no flush timeout -> reduced latency) if producers don't have a high enough rate. In Logstash case we can easily compensate for this by pushing more batches on the link

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated changelog.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see this one removed ;-)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of surprised to see a minimum here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, we need to buffer some data for pushing to the outputs. Without queue the pipeline won't be able to operate -> enforce minimum so users don't use 1 or 0.

- remove setting queue_size
- remove unused setting bulk_queue_size
- introduce settings namespace `queue` with default `mem.events: 4096`
- add settings queue.mem.flush.min_events and queue.mem.flush.timeout
- change default output.logstash.pipelining to 5
- remove spooler settings from filebeat config object
- Update winlogbeat config validate check
@urso urso force-pushed the pipeline/config branch from 61f5b7e to cd45c0a Compare July 12, 2017 13:29
@urso
Copy link
Copy Markdown
Author

urso commented Jul 12, 2017

Translating old configurations:

  • rename queue_size -> queue.mem.events * 2
  • remove bulk_queue_size
  • rename or remove filebeat.spool_size -> queue.mem.flush.min_events
  • rename or remove filebeat.idle_timeout -> queue.mem.flush.timeout

@ruflin ruflin merged commit 7a6a5dd into elastic:master Jul 12, 2017
@urso urso deleted the pipeline/config branch July 19, 2017 13:18
@tsg tsg mentioned this pull request Jul 24, 2017
28 tasks
@dedemorton dedemorton mentioned this pull request Jul 25, 2017
42 tasks
@dedemorton
Copy link
Copy Markdown
Contributor

I'm removing the needs_docs label here because I believe we've covered the doc requirements with the topic https://www.elastic.co/guide/en/beats/filebeat/current/configuring-internal-queue.html and the removed settings that are documented in the breaking changes for the release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants