LIBBEAT: Enhancement replace_string processor for replacing strings values of fields. #17342
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
|
Pinging @elastic/integrations-services (Team:Services) |
|
This seems very similar to https://www.elastic.co/guide/en/elasticsearch/reference/master/gsub-processor.html, could you please use the same signature? I think having |
| ReplaceWith string `config:"replace_with"` | ||
| type replaceConfig struct { | ||
| Field string `config:"field"` | ||
| Pattern string `config:"pattern"` |
@urso @exekias should https://github.com/elastic/beats/blob/master/libbeat/common/match/matchers.go be enhanced to support replacing strings, so that matcher.Match can be used here?
Using a plain regexp could be slower than the optimized matcher in libbeat.
For Pattern in the config, use Pattern *regexp.Regexp. Config unpacking will automatically try to compile the regex and, if compilation fails, return an error that includes the setting name.
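A minimal sketch of the idea, using only the Go standard library rather than the real config unpacker. The names rawReplaceConfig and unpack, the Replacement field, and the error wording are hypothetical and only illustrate compiling the pattern once at load time and naming the failing setting:

```go
package main

import (
	"fmt"
	"regexp"
)

// rawReplaceConfig is a hypothetical stand-in for the settings as written
// in the configuration file, before validation.
type rawReplaceConfig struct {
	Field       string
	Pattern     string
	Replacement string
}

// replaceConfig holds the validated settings; Pattern is already compiled.
type replaceConfig struct {
	Field       string
	Pattern     *regexp.Regexp
	Replacement string
}

// unpack compiles the pattern once and names the offending setting on failure.
// The real unpacking in libbeat does this automatically; this is only a sketch.
func unpack(raw rawReplaceConfig) (replaceConfig, error) {
	re, err := regexp.Compile(raw.Pattern)
	if err != nil {
		return replaceConfig{}, fmt.Errorf("invalid regular expression in setting 'pattern': %w", err)
	}
	return replaceConfig{Field: raw.Field, Pattern: re, Replacement: raw.Replacement}, nil
}

func main() {
	// An unterminated character class fails at load time, not per event.
	if _, err := unpack(rawReplaceConfig{Field: "file.path", Pattern: "([a-z"}); err != nil {
		fmt.Println("config error:", err)
	}

	cfg, err := unpack(rawReplaceConfig{Field: "file.path", Pattern: "/rootfs/", Replacement: "/"})
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(cfg.Pattern.ReplaceAllString("/rootfs/etc/hosts", cfg.Replacement))
}
```

Failing early keeps the per-event code path free of compilation and error handling for bad patterns.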
matcher.Match is an optimization for some custom cases only, but falls back to regexp if the case becomes more 'complicated'. Given the optimizations we have in matcher, I only see the case for a constant string match being helpful (which would become a sub-string search, or in some cases string-prefix/suffix comparison).
The matcher package also replaces capturing groups with non-capturing groups, which greatly reduces allocations. With patterns and replacements like gsub, do we want to allow users to reference capturing groups in the replacement in the future? E.g.
pattern: 'some (?P<important>[a-zA-Z]) string'
replace: 'found: {{important}}'
For now I would not enhance the matcher package, only if we figure out that this is indeed a common problem. Doing so might force us to remove some of the optimizations. If we find we really need to optimize, a separate type (e.g. matcher.Replacer) might give us better flexibility to apply the kind of optimizations this use case needs, without un-optimizing matcher.Matcher.
|
The implementation itself LGTM. Please fix the code format. The intake CI job fails already. See: https://travis-ci.org/github/elastic/beats/jobs/670699207 Please add some reference documentation as well. See Thank you! |
…beats into processor_replace_string
| fields: | ||
| - field: "file.path" | ||
| pattern: "/run/containerd/io.containerd.runtime.v1.linux/k8s.io/${data.kubernetes.container.id}/rootfs/" | ||
| replacement: "/" |
In the implementation, patterns cannot reference event contents. A more correct regexp would be /run/containerd/io.containerd.runtime.v1.linux/k8s.io/.+/rootfs/.
Maybe we can have a simpler example?
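A quick sketch of how the suggested regexp behaves with Go's standard regexp package. The helper trimRootfs and the container ID "0123abcd" are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// rootfsPrefix is the pattern suggested in the review. ".+" stands in for the
// container ID, since the pattern cannot reference event contents.
var rootfsPrefix = regexp.MustCompile(`/run/containerd/io.containerd.runtime.v1.linux/k8s.io/.+/rootfs/`)

// trimRootfs replaces the container rootfs prefix with a single "/".
func trimRootfs(path string) string {
	return rootfsPrefix.ReplaceAllString(path, "/")
}

func main() {
	// "0123abcd" is a made-up container ID.
	in := "/run/containerd/io.containerd.runtime.v1.linux/k8s.io/0123abcd/rootfs/etc/hosts"
	fmt.Println(trimRootfs(in))
}
```

Note that ".+" is greedy, so a path containing a second "/rootfs/" segment further along would be trimmed up to the last one.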
@urso I have updated the documentation to a simpler example. Can you please help fix the CI issue? It keeps failing with this error.
Error: copy failed: cannot stat source file ../../vendor/github.com/elastic/beats/libbeat/common/file: stat ../../vendor/github.com/elastic/beats/libbeat/common/file: no such file or directory
Thanks
|
Do not import |
|
jenkins run the tests please |
|
Appropriate tests have been run through on Travis. I triggered a (hopefully) final test run on Jenkins. |
…alues of fields. (elastic#17342) This PR is to add a replace processor. This processor takes in a field name, search string and replacement string. Searches field value for pattern and replaces it with replacement string. (cherry picked from commit 09fd4df)
…unbld
* upstream/master:
  ci: comment PRs with the build status (elastic#17971)
  Add domain state metricset to kvm module (elastic#17673)
  [Agent] Allow CLI paths override (elastic#17781)
  Fix generated metricbeat so create-metricset works. (elastic#18020)
  LIBBEAT: Enhancement replace_string processor for replacing strings values of fields. (elastic#17342)
  Update stale references to _xpack to refer to _license instead (elastic#18030)
  Review dependency patterns collection in Jenkins (elastic#18004)
|
Thanks a lot @urso |
…sor for replacing strings values of fields. (#18047)
|
Changes have been backported and will be released in 7.8.0. Thank you for contributing. |
What does this PR do?
This PR adds a replace processor. The processor takes a field name, a search pattern, and a replacement string. It searches the field's value for the pattern and replaces each match with the replacement string.
Why is it important?
This PR makes it possible to remove unwanted substrings from field values or to add extra text to them.
How to test this PR locally
Added unit test cases.
Use cases
When running Auditbeat on Kubernetes, we get the full path to a file inside the pod:
"/run/containerd/io.containerd.runtime.v1.linux/k8s.io/${data.kubernetes.container.id}/rootfs/etc/runit/runsvdir/default/mcelog/supervise/pid.new"
This PR helps trim the leading part of the string to get
"/etc/runit/runsvdir/default/mcelog/supervise/pid.new"
Using config below