I asked about the difference between "filebeat setup" and "filebeat --setup", which led to the realization that once you add logstash to the equation, the setup and maintenance of filebeat ingest pipelines becomes significantly harder. I was asked to open an issue against both repos to improve the documentation.
Background
Filebeat's ingest pipelines in ES work great when filebeat sends data directly to an Elasticsearch node. Once logstash is introduced to the environment, the ingest pipelines are no longer used. The documentation hints that either ingest pipelines or logstash can be used to process data, but doesn't explain how to use both, or the consequences of implicitly not using ingest pipelines when logstash is used.
Logstash without filebeat ingest pipelines
The usual setup is to not use the ingest pipelines created by filebeat. Here are the steps needed to achieve the same processing in logstash.
- Run `filebeat setup` to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. This doesn't have to be done on the same server as logstash.
- Use the Ingest Converter to convert the ingest pipeline to a logstash pipeline. The ingest pipeline definition can be found in several locations.
  - Elasticsearch - Enable the filebeat module, run `filebeat --setup` with filebeat connected to ES and not logstash, wait for filebeat to create the ingest pipeline by sending events to ES, then run `GET /_ingest/pipeline/` in Kibana Console to see and download the pipeline.
  - Filesystem - The ingest pipeline JSON ships with the filebeat modules, under `/usr/share/filebeat/module` on openSUSE (RPM based). This option implies filebeat is installed on the same server as logstash.
  - Online - Download the latest JSON from the beats repo.
  - Examples - Download the configuration examples. Being examples, they might not work as expected.
- Modify the logstash pipeline as necessary.
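As a sketch of the Elasticsearch route above, assuming a local cluster on port 9200 and the Ingest Converter that ships with logstash (the pipeline name and paths are examples, not fixed values):

```shell
# Download the ingest pipeline definition that filebeat created in ES.
# "filebeat-6.1.2-nginx-access-default" is an example pipeline name.
curl -s 'http://localhost:9200/_ingest/pipeline/filebeat-6.1.2-nginx-access-default' \
  > /tmp/nginx-access-pipeline.json

# Note: ES wraps the definition in an object keyed by the pipeline name;
# you may need to unwrap it to the bare {description, processors} JSON first.

# Convert it to a logstash pipeline skeleton with the bundled Ingest Converter
# (path assumes the RPM layout; adjust for tarball installs).
/usr/share/logstash/bin/ingest-convert.sh \
  --input file:///tmp/nginx-access-pipeline.json \
  --output file:///tmp/nginx-access-pipeline.conf
```

The generated `.conf` is only a starting point; processors without a logstash equivalent still need manual translation, which is why the "modify as necessary" step exists.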
Logstash with filebeat ingest pipelines
Using the filebeat ingest pipelines from logstash is significantly harder because you need to match each incoming message with the correct ingest pipeline. Multiple message types combined with multiple ingest pipelines means the configuration to manage multiplies quickly.
- Run `filebeat --setup` to set up the environment in ES and Kibana. This loads the index templates, Kibana dashboards and ML jobs. Wait for filebeat to create the ingest pipeline by sending events to ES, then kill filebeat. This doesn't have to be done on the same server as logstash.
- Use the `pipeline` option in the `elasticsearch` output block to specify the ingest pipeline to use. `pipeline` is a string and accepts only one value.
- The `elasticsearch` output block does not accept conditionals.
- Due to the two previous points, conditionals need to be used to route each message to exactly one `elasticsearch` block that specifies the pipeline to use. If you have 5 ingest pipelines, you'll need 6 `elasticsearch` blocks: one for each pipeline, plus a 6th as a default.
- By default, the filebeat ingest pipeline name contains the beat name, beat version, module name and some other identifiers, e.g. "filebeat-6.1.2-nginx-access-default". An environment with multiple filebeat versions means even more conditionals to match the message to the right pipeline.
- If the pipeline doesn't exist, an error is thrown and the message is not added to ES.
- Variables can be used in the `pipeline` option to ease the configuration, but if something goes wrong, an error is logged and the message is thrown away.
- The `elasticsearch` filter plugin can query ES for the pipeline and possibly make the above bearable. The downside is that you're querying ES for every message.
- The filebeat version changes upon upgrade.
  - If you don't run `filebeat --setup` to update the ingest pipeline, the configuration mentioned above cannot rely on the filebeat version.
  - If you run `filebeat --setup`, the configuration will need to be updated to match the new version. This should be done for every major version to make sure the ingest pipeline matches the filebeat version.
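A minimal sketch of the routing described above in a logstash pipeline. The `[fileset][module]`/`[fileset][name]` fields are what filebeat 6.x modules set on events; the hosts and pipeline names are examples for filebeat 6.1.2 and must match your environment (the `index` option is omitted for brevity):

```
output {
  if [fileset][module] == "nginx" and [fileset][name] == "access" {
    elasticsearch {
      hosts    => ["localhost:9200"]
      # pipeline accepts a single string, hence one block per ingest pipeline
      pipeline => "filebeat-6.1.2-nginx-access-default"
    }
  } else if [fileset][module] == "system" and [fileset][name] == "syslog" {
    elasticsearch {
      hosts    => ["localhost:9200"]
      pipeline => "filebeat-6.1.2-system-syslog-pipeline"
    }
  } else {
    # default block: index without an ingest pipeline
    elasticsearch {
      hosts => ["localhost:9200"]
    }
  }
}
```

Because the beat version is baked into each pipeline name, every filebeat upgrade means touching every `pipeline =>` line here, which is the maintenance burden this issue is about.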
Conclusion
Using filebeat modules without logstash is a breeze. Using filebeat with logstash requires additional setup, but the documentation doesn't explain what that setup is. The logstash documentation has a section on working with Filebeat Modules but doesn't elaborate on how or why the examples are important. As a user, it was very frustrating trying to understand why the ingest pipelines weren't working when I was using logstash.
Versions
This issue is specific to filebeat since the other beats do not use ingest pipelines.
- openSUSE Leap 42.3
- Filebeat 6.1.2
- Logstash 6.1.2
- Elasticsearch 6.1.2