I asked about the difference between "filebeat setup" and "filebeat --setup", which led to the realization that once you add logstash to the equation, the setup and maintenance of filebeat ingest pipelines becomes significantly harder. I was asked to open an issue against both repos to improve the documentation.
Background
Filebeat's ingest pipelines in ES work great when filebeat sends data directly to an Elasticsearch node. Once logstash is introduced to the environment, the ingest pipelines are no longer used. The documentation hints that either ingest pipelines or logstash can be used to process data, but doesn't explain how to use both, or the consequences of implicitly not using ingest pipelines when logstash is used.
Logstash without filebeat ingest pipelines
The usual setup is to not use the ingest pipelines created by filebeat. Here are the steps needed to achieve the same processing in logstash.
- Run `filebeat setup` to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. This doesn't have to be done on the same server as logstash.
- Use the Ingest Converter to convert the ingest pipeline to a logstash pipeline. The ingest pipeline definition can be found in several locations.
  - Elasticsearch - Enable the filebeat module, run `filebeat --setup` with filebeat connected to ES and not logstash, wait for filebeat to create the ingest pipeline by sending events to ES, then run `GET /_ingest/pipeline/` in Kibana Console to see and download the pipeline.
  - Filesystem - The ingest pipeline JSON ships with the filebeat modules, under `/usr/share/filebeat/module` on openSUSE (RPM based). This option implies filebeat is installed on the same server as logstash.
  - Online - Download the latest JSON from the beats repo.
  - Examples - Download the configuration examples. Being examples, they might not work as expected.
- Modify the logstash pipeline as necessary.
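As a sketch of the Elasticsearch route above, assuming a local cluster on port 9200 and the Ingest Converter that ships with logstash (the pipeline name and paths are examples, not fixed values):

```shell
# Download the ingest pipeline definition that filebeat created in ES.
# "filebeat-6.1.2-nginx-access-default" is an example pipeline name.
curl -s 'http://localhost:9200/_ingest/pipeline/filebeat-6.1.2-nginx-access-default' \
  > /tmp/nginx-access-pipeline.json

# Note: ES wraps the definition in an object keyed by the pipeline name;
# you may need to unwrap it to the bare {description, processors} JSON first.

# Convert it to a logstash pipeline skeleton with the bundled Ingest Converter
# (path assumes the RPM layout; adjust for tarball installs).
/usr/share/logstash/bin/ingest-convert.sh \
  --input file:///tmp/nginx-access-pipeline.json \
  --output file:///tmp/nginx-access-pipeline.conf
```

The generated `.conf` is only a starting point; processors without a logstash equivalent still need manual translation, which is why the "modify as necessary" step exists.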
Logstash with filebeat ingest pipelines
Using the filebeat ingest pipelines from logstash is significantly harder because you need to match each incoming message with the correct ingest pipeline. Multiple message types combined with multiple ingest pipelines means the configuration to manage multiplies quickly.
- Run `filebeat --setup` to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. Wait for filebeat to create the ingest pipeline by sending events to ES, then kill filebeat. This doesn't have to be done on the same server as logstash.
- Use the `pipeline` option in the `elasticsearch` output block to specify the ingest pipeline to use. `pipeline` is a string and accepts only one value.
- The `elasticsearch` output block does not accept conditionals.
- Due to the two previous points, conditionals need to be used to route each message to exactly one `elasticsearch` block that specifies the pipeline to use. If you have 5 ingest pipelines, you'll need 6 `elasticsearch` blocks: one for each pipeline, plus a 6th as a default.
- By default, the filebeat ingest pipeline name contains the beat name, beat version, module name and some other identifiers, e.g. "filebeat-6.1.2-nginx-access-default". An environment with multiple filebeat versions means even more conditionals to match the message to the right pipeline.
- If the pipeline doesn't exist, an error is thrown and the message is not added to ES.
- Variables can be used in the `pipeline` option to ease the configuration, but if something goes wrong, an error is logged and the message is thrown away.
- The `elasticsearch` filter plugin can query ES for the pipeline and possibly make the above bearable. The downside is that you're querying ES for every message.
- The filebeat version changes upon upgrade.
  - If you don't run `filebeat --setup` to update the ingest pipeline, the configuration mentioned above cannot rely on the filebeat version.
  - If you run `filebeat --setup`, the configuration will need to be updated to match the new version. This should be done for every major version to make sure the ingest pipeline matches the filebeat version.
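A minimal sketch of the routing described above in a logstash pipeline. The `[fileset][module]`/`[fileset][name]` fields are what filebeat 6.x modules set on events; the hosts and pipeline names are examples for filebeat 6.1.2 and must match your environment (the `index` option is omitted for brevity):

```
output {
  if [fileset][module] == "nginx" and [fileset][name] == "access" {
    elasticsearch {
      hosts    => ["localhost:9200"]
      # pipeline accepts a single string, hence one block per ingest pipeline
      pipeline => "filebeat-6.1.2-nginx-access-default"
    }
  } else if [fileset][module] == "system" and [fileset][name] == "syslog" {
    elasticsearch {
      hosts    => ["localhost:9200"]
      pipeline => "filebeat-6.1.2-system-syslog-pipeline"
    }
  } else {
    # default block: index without an ingest pipeline
    elasticsearch {
      hosts => ["localhost:9200"]
    }
  }
}
```

Because the beat version is baked into each pipeline name, every filebeat upgrade means touching every `pipeline =>` line here, which is the maintenance burden this issue is about.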
Conclusion
Using filebeat modules without logstash is a breeze. Using filebeat with logstash requires additional setup, but the documentation doesn't explain what that setup is. The logstash documentation has a section on working with Filebeat Modules but doesn't elaborate on how or why the examples are important. As a user, it was very frustrating trying to understand why the ingest pipelines weren't working when I was using logstash.
Versions
This issue is specific to filebeat since the other beats do not use ingest pipelines.
- openSUSE Leap 42.3
- Filebeat 6.1.2
- Logstash 6.1.2
- Elasticsearch 6.1.2