Setup: Add default pipeline to templates#14001
Conversation
libbeat/template/pipeline.go
Outdated
There was a problem hiding this comment.
This is cool. I wonder if event.ingested should be in ECS. CC @MikePaquette @webmat
libbeat/template/template.go
Outdated
There was a problem hiding this comment.
The pipeline itself requires the Geoip plugin, which is only bundled into ES since 6.7. Might be worth checking for that.
There was a problem hiding this comment.
Ah, makes sense - changing to 6.7.0. I confirmed it works out of the box with that Elasticsearch version.
|
Nice work @cwurm! I can't think of anything that might get broken due to this, so I'm supportive to have it enabled by default unless someone else thinks of something that breaks. |
|
Haven't reviewed in detail, but I love this! BTW we've been tracking ideas of what can be done, with events that are ECS formatted. Check out this issue elastic/ecs#181. You don't have to change anything here, but just in case it sparks more ideas of things general enough to include here eventually :-) |
andrewkroh
left a comment
There was a problem hiding this comment.
I'm just commenting on a few operational aspects. I didn't review the code.
Will beat setup --index-management load this pipeline?
The documentation for installing the index template when using a non-ES output will need to be updated to include how to install the pipeline. We might need an export default-pipeline sub-command to get a copy of the pipeline JSON.
libbeat/template/config.go
Outdated
There was a problem hiding this comment.
I think the pipeline name should have version number in it.
There was a problem hiding this comment.
And I like the idea of using the beat's name rather than "beats".
There was a problem hiding this comment.
I think the pipeline name should have version number in it.
Good catch, definitely. We'll need it when running multiple versions of Beats.
And I like the idea of using the beat's name rather than "beats".
👍 - now the default pipeline for all Beats would be the same, but in the future this might well diverge.
Yes
Good point, let me implement |
|
I've implemented Would like to do more testing on this, and need to add it in some places in the docs, but I think the code is ready to take a look at. |
faec
left a comment
There was a problem hiding this comment.
Thank you, this is a lovely change! I have only very minor comments on what's here, but please add some unit tests for the libbeat/template changes :-)
libbeat/template/pipeline.go
Outdated
libbeat/template/pipeline.go
Outdated
There was a problem hiding this comment.
Please either include the error result here as above, or add a comment why it is skipped in this case
There was a problem hiding this comment.
Thanks, I included the error
9055be8 to
7aba709
Compare
|
@faec Thanks, I've addressed your comments and added Let me know what you think. :-) |
|
Now that we have a If we change to What do others think? @urso @tsg @andrewkroh @faec? |
|
If we make it a Other than that, |
7aba709 to
5160f91
Compare
5160f91 to
986f3ac
Compare
986f3ac to
9ad7b8a
Compare
|
@tsg @urso @andrewkroh @faec I've updated this to set |
andrewkroh
left a comment
There was a problem hiding this comment.
This looks pretty good to me. The final_pipeline concept is very useful. I didn't scrutinize the code, but the pipeline looks good.
|
|
||
| "github.com/elastic/beats/libbeat/common" | ||
| "github.com/elastic/beats/libbeat/common/cfgwarn" | ||
| commonP "github.com/elastic/beats/libbeat/common/pipeline" |
There was a problem hiding this comment.
Package names should be lowercase.
|
APM Server currently has a mechanism to load a set of default pipelines to ES on startup and applying them by default on event ingestion. This current approach does not use index pipelines, therefore it is definitely interesting to investigate how to best deprecate and move over to this new handling. I can look into that in more detail. The pipelines partially overlap with the default pipelines introduced in this PR, but APM Server additionally createa a user-agent pipeline. Not necessarily a requirement for this PR, but have you thought about how a beat could extend the |
| if v[idx], err = fixYAMLMaps(value); err != nil { | ||
| return nil, err | ||
| } | ||
| } |
There was a problem hiding this comment.
let's add a default case erroring out, just in case.
| } | ||
|
|
||
| // LoadFinalPipeline loads the final pipeline. | ||
| func (l *ESLoader) LoadFinalPipeline(config TemplateConfig, info beat.Info) error { |
There was a problem hiding this comment.
I think we are missing a 'good' package for loading pipelines. Maybe we can introduce it to the idxmgmt package instead of template? The idxmgmt package configures lifecycle.name and lifecycle.rollover_alias when creating the TemplateConfig.
|
@simitt Would this idea fit apm-server needs? If so, do you think we can proceed here, and create a follow up issue to make the final pipeline more extensible? Depending on the capabilities of the pipeline we might need a version check, plus the pipeline contents. One option to make it somewhat extensible would be to pass the default final pipeline to All setup should rely on this type being passed where it is required. For pipeline handling in general we might want to introduce a package named |
|
@urso your suggestion generally sounds reasonable to me. Regarding moving forward with this PR - it is introducing some breaking changes to exported interfaces in the |
|
Hi! We're labeling this issue as |
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
1 similar comment
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
|
Hi! We're labeling this issue as |
|
Hi! |
Adds the
default_pipelineindex setting (ES docs) to the templates loaded onsetup.Enabled by default, Libbeat first loads the ingest pipeline, then adds it to the template and loads the template. The built-in ingest pipeline contains processors for GeoIP and ASN lookups (and renaming to ECS fields) and adds an ingest timestamp. These are especially relevant for the SIEM app, where several widgets (network map, top IP tables, IP details overview) depend on GeoIP and ASN data being present. The ingest timestamp will help with detecting events that arrive much later than they occur or with a wrong timestamp (more details in elastic/ecs#582).
This feature is governed by four settings:
setup.template.default_pipeline.enabled- if it is enabled at all (defaulttrue)setup.template.default_pipeline.overwrite- if an existing pipeline with the same name should be overwrittensetup.template.default_pipeline.name- allows specifying a custom pipeline name (the default isbeats_default_pipeline)setup.template.default_pipeline.file- instead of using the built-in default pipeline users can also load a custom pipeline from a file. I've re-used the unmarshalling logic from filesets so both JSON and YAML files can be used.A few other notes:
beats_default_pipelinepipeline for all Beats we could also do Beats-specific pipelines (packetbeat_default_pipeline,auditbeat_default_pipeline, etc.). I don't have a strong opinion.