With the new indexing strategy, data sent from the Elastic Agent to Elasticsearch does not specify the ingest pipeline on the request; instead, each data stream already contains the ingest pipeline as a setting. Currently we use `index.default_pipeline`, but we are thinking about also using `index.final_pipeline` for some final processing of the events. In some cases multiple pipelines are chained together with the pipeline processor. There are two use cases here:
- "Routing" ingest pipeline that selects the correct pipeline for processing (json vs text for example)
- Multiple pipelines connected together
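A "routing" pipeline can be sketched as a wrapper whose pipeline processors run conditionally; the pipeline names and the `message` check below are hypothetical:

```json
PUT _ingest/pipeline/logs-routing
{
  "description": "Route events to the JSON or plain-text pipeline",
  "processors": [
    {
      "pipeline": {
        "if": "ctx.message != null && ctx.message.startsWith('{')",
        "name": "logs-json"
      }
    },
    {
      "pipeline": {
        "if": "ctx.message == null || !ctx.message.startsWith('{')",
        "name": "logs-text"
      }
    }
  ]
}
```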
I want to dive deeper into the use cases around connecting multiple pipelines together. In Ingest Manager we have at least three potential use cases for chaining multiple pipelines:
- Reuse the same pipeline in multiple places
- Let user inject their own enrichment pipeline
- Multiple teams want to add their own final ingest pipeline bits
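Today, supporting these cases means Ingest Manager has to generate a wrapper pipeline that invokes the others via the pipeline processor; a rough sketch with hypothetical pipeline names:

```json
PUT _ingest/pipeline/nginx-access-combined
{
  "processors": [
    { "pipeline": { "name": "shared-geoip" } },
    { "pipeline": { "name": "user-enrichment" } },
    { "pipeline": { "name": "team-final-bits" } }
  ]
}
```

Every change to the chain then requires rewriting this wrapper pipeline, which is what the proposal below avoids.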
Instead of Ingest Manager modifying ingest pipelines to chain multiple pipelines together, it would be nice if Elasticsearch supported an array of ingest pipelines and executed them in the order defined. Something like:
```yaml
index.default_pipelines:
  - "pipeline1"
  - "pipeline2"
```
The above would allow us to add a `pipeline3` without having to modify any pipeline itself, only the settings. The same logic would apply to `final_pipelines`, so multiple teams could add their own final pipeline without conflicting with each other.
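If such a setting existed, adding `pipeline3` would be a settings-only update. Note that `index.default_pipelines` is the proposed setting and does not exist in Elasticsearch today; the index name is hypothetical:

```json
PUT logs-nginx.access-default/_settings
{
  "index.default_pipelines": ["pipeline1", "pipeline2", "pipeline3"]
}
```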
The above might also help with #57968. The discussion there is about what happens if the target data stream changes. With the above, I would expect that the pipelines are simply added to the list and all of them are executed.