Set pipeline name automatically in transform definitions #2973
mrodm merged 13 commits into elastic:main
Conversation
Allow setting an IngestPipelineName function in these templates in order to set the right ingest template filename. This filename must contain the package version (from the manifest) as a prefix.
jsoriano
left a comment
LGTM, some nitpicking and wondering if we should use text/template instead of html/template.
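To make the text/template versus html/template question concrete, here is a minimal sketch comparing how the two packages render the same template. The template string and the value are hypothetical and chosen only to show the escaping; they are not taken from elastic-package.

```go
package main

import (
	"bytes"
	"fmt"
	htmltemplate "html/template"
	texttemplate "text/template"
)

// renderText renders tmpl with text/template, which writes values verbatim.
func renderText(tmpl string, data any) (string, error) {
	t, err := texttemplate.New("transform").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// renderHTML renders the same template with html/template, which escapes
// values for an HTML context.
func renderHTML(tmpl string, data any) (string, error) {
	t, err := htmltemplate.New("transform").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical template and value, used only for illustration.
	const tmpl = "description: {{ . }}\n"
	const value = "logs & metrics"

	plain, err := renderText(tmpl, value)
	if err != nil {
		panic(err)
	}
	escaped, err := renderHTML(tmpl, value)
	if err != nil {
		panic(err)
	}
	fmt.Print(plain)   // the ampersand is written as-is
	fmt.Print(escaped) // the ampersand is written as &amp;
}
```

Since transform definitions are YAML rather than HTML, HTML escaping of rendered values is likely unwanted there, which is one argument for text/template.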
    _, err = os.Stat(filepath.Join(packageRootPath, "elasticsearch", "ingest_pipeline", pipelineFileName))
    if err != nil {
        return nil, TransformDefinition{}, fmt.Errorf("destination ingest pipeline file %s not found: %w", pipelineFileName, err)
    }
Did you test with integrations? Maybe some package fails with this 😬 Though if it does we should fix the package.
After building all packages in the integrations repository locally, I found one package that sets dest.pipeline to an ingest pipeline from a data stream:
dest:
  index: "aws_billing.cur-v1"
  pipeline: "metrics-aws_billing.cur-0.1.0-pipeline_extract_metadata"
  aliases:
    - alias: "aws_billing.cur_latest"
      move_on_creation: true
That fails with the current validation 🤔
That package (aws_billing) is failing this validation because:
- It's using a pipeline from a data stream.
- It sets an old version in the destination pipeline; it should be 0.2.0.
That pipeline from cur data stream looks like it is not used:
https://github.com/elastic/integrations/blob/34df8c08914a678dc3310bd000e7908a1af42ef6/packages/aws_billing/data_stream/cur/elasticsearch/ingest_pipeline/pipeline_extract_metadata.yml
At least from the default pipeline there is no pipeline processor to trigger that other pipeline:
https://github.com/elastic/integrations/blob/34df8c08914a678dc3310bd000e7908a1af42ef6/packages/aws_billing/data_stream/cur/elasticsearch/ingest_pipeline/default.yml
Here I've added the support to detect if the ingest pipeline exists on any data stream:
4f2756e
The pipeline in that package could probably be moved to the elasticsearch folder in the package root, but in any case it could be worth keeping the support for checking whether pipelines are located in data streams.
WDYT ?
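The check discussed here (look for the pipeline file at the package root, then fall back to any data stream) could be sketched roughly as follows. This is not the code from commit 4f2756e; the function name findIngestPipeline and the demo helper are hypothetical, and the directory layout is assumed from the paths linked above.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// findIngestPipeline (hypothetical name) returns the path of pipelineFileName,
// looking first in <root>/elasticsearch/ingest_pipeline and then in
// <root>/data_stream/*/elasticsearch/ingest_pipeline.
// Assumption: neither argument contains glob metacharacters.
func findIngestPipeline(packageRootPath, pipelineFileName string) (string, error) {
	rootPath := filepath.Join(packageRootPath, "elasticsearch", "ingest_pipeline", pipelineFileName)
	_, err := os.Stat(rootPath)
	if err == nil {
		return rootPath, nil
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return "", fmt.Errorf("checking %s: %w", rootPath, err)
	}

	// Fall back to the ingest pipelines defined in any data stream.
	pattern := filepath.Join(packageRootPath, "data_stream", "*", "elasticsearch", "ingest_pipeline", pipelineFileName)
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return "", fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	if len(matches) == 0 {
		return "", fmt.Errorf("destination ingest pipeline file %s not found", pipelineFileName)
	}
	return matches[0], nil
}

// demo builds a throwaway package layout with one pipeline inside the "cur"
// data stream and returns the path that was found, relative to the root.
func demo() (string, error) {
	root, err := os.MkdirTemp("", "package")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "data_stream", "cur", "elasticsearch", "ingest_pipeline")
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return "", err
	}
	file := filepath.Join(dir, "pipeline_extract_metadata.yml")
	if err := os.WriteFile(file, []byte("processors: []\n"), 0o644); err != nil {
		return "", err
	}

	found, err := findIngestPipeline(root, "pipeline_extract_metadata.yml")
	if err != nil {
		return "", err
	}
	rel, err := filepath.Rel(root, found)
	if err != nil {
		return "", err
	}
	return filepath.ToSlash(rel), nil
}

func main() {
	rel, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(rel)
}
```

Checking the package root first keeps the original behaviour as the default, with the data stream lookup acting only as a fallback.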
I thought that for those cases using ingest pipelines for data streams, the transform definition could use the function like this:
dest:
  index: "aws_billing.cur-v1"
  pipeline: "metrics-aws_billing.cur{{ ingestPipelineName "pipeline_extract_metadata"}}"
  aliases:
    - alias: "aws_billing.cur_latest"
      move_on_creation: true
Currently, this would be the minimum change required for the aws_billing package
elastic/integrations#15593 since the latest version of the package is 0.2.0.
Another option could be to move the ingest pipeline to the elasticsearch folder at the root of the package, but I'm not sure whether this could lead to some issue.
Currently, this would be the minimum change required for aws_billing package
elastic/integrations#15593 since the latest version of the package is 0.2.0.
Yep, we can apply this change for now, but using a pipeline from a data stream in a transform sounds wrong.
We can also use the placeholder you proposed in the previous comment, right?
pipeline: "metrics-aws_billing.cur{{ ingestPipelineName "pipeline_extract_metadata"}}"
We can also use the placeholder you proposed in the previous comment, right?
pipeline: "metrics-aws_billing.cur{{ ingestPipelineName "pipeline_extract_metadata"}}"
Exactly, but that placeholder can only be used there once the elastic-package version containing this feature is available in the integrations repository.
Co-authored-by: Jaime Soriano Pastor <jaime.soriano@elastic.co>
test integrations
Created or updated PR in integrations repository to test this version. Check elastic/integrations#15594
💚 Build Succeeded
cc @mrodm
…cally in transforms (#16175)

ti_google_threat_intelligence: Set destination pipeline name automatically in transforms

Due to an earlier limitation of elastic-package, which was unable to extract the integration version into the transform, the destination pipeline needed to be set and updated manually. Currently the integration requires users to manually add destination pipelines to the transform during package installation and upgrade. This is a significant inconvenience for users and is often error-prone.

elastic-package#2973 [1] removes the dest pipeline limitation, which now allows the pipeline names to be templated. This PR updates the transform definition to set the pipeline names using a template that renders the integration version.

The change involves updating all transforms with:

dest:
  pipeline: '{{ ingestPipelineName "<pipeline-name>"}}'

Also:
- With this, the transforms do not require any manual intervention, hence they are now auto-enabled.
- README is updated accordingly.

[1] elastic/elastic-package#2973
…s missing and set destination pipeline name automatically in transforms

The foreach processor in the ingest pipeline now uses ignore_missing: true to handle cases where threat_connect.indicator.associated_groups.data exists but lacks the attributes field, preventing pipeline failures during data processing.

elastic-package#2973 [1] removes the dest pipeline limitation, which now allows the pipeline names to be templated. This PR updates the transform definition to set the pipeline names using a template that renders the integration version.

[1] elastic/elastic-package#2973
Fixes elastic/package-spec#833
This PR adds support for automatically setting the pipeline name used in transforms, adding the current package version as a prefix.
To achieve that, elasticsearch/transform/*/transform.yml is managed by elastic-package as a template file (Golang text template). The template may use a function named ingestPipelineName that adds the current version of the package to the pipeline name.
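As a rough illustration of the mechanism described here, the sketch below registers an ingestPipelineName function on a Go text/template and renders a transform snippet with it. It is not the elastic-package implementation: renderTransform is a hypothetical helper, and the "<version>-<name>" output format is an assumption based on the pipeline names quoted in the conversation.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// renderTransform (hypothetical helper) renders the contents of a transform
// definition, exposing an ingestPipelineName function to the template.
// Assumption: the rendered name is "<package version>-<pipeline name>".
func renderTransform(contents, packageVersion string) (string, error) {
	funcs := template.FuncMap{
		"ingestPipelineName": func(name string) string {
			return packageVersion + "-" + name
		},
	}
	// Funcs must be registered before Parse so the parser knows the function.
	t, err := template.New("transform.yml").Funcs(funcs).Parse(contents)
	if err != nil {
		return "", fmt.Errorf("parsing transform template: %w", err)
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, nil); err != nil {
		return "", fmt.Errorf("rendering transform template: %w", err)
	}
	return buf.String(), nil
}

func main() {
	// Illustrative transform snippet; the pipeline name is only an example.
	const transform = `dest:
  pipeline: '{{ ingestPipelineName "pipeline_extract_metadata" }}'
`
	rendered, err := renderTransform(transform, "0.2.0")
	if err != nil {
		panic(err)
	}
	fmt.Print(rendered)
}
```

Capturing the version in a closure means the template author never has to write the version by hand, so the rendered name follows the manifest on every release.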
The template is rendered at several stages to ensure that it is always performed:
- elastic-package build
- elastic-package install
- elastic-package test system
Author's checklist
How to test this PR locally