Add drop and explicit tests to avoid duplicate ingest of elasticsearch logs #30440
Merged
matschaffer merged 8 commits into elastic:main on Feb 21, 2022
This pipeline already contained a drop to avoid duplicate logging.
This was partially guarded against in testing by the grok on `elasticsearch.slowlog`, but it is probably better to drop explicitly and avoid duplicate logging.
test-audit-docker.log also contains a case but it was overlooked in the expected file until elastic#30164 added the appropriate drop statements.
Pinging @elastic/stack-monitoring (Stack monitoring)
Pinging @elastic/integrations (Team:Integrations)
/test
The docs failure seems to be unrelated. If the above test doesn't fix it, I'll merge main on Monday.
@elasticmachine run elasticsearch-ci/docs
tetianakravchenko approved these changes Feb 17, 2022
Thank you for adding tests!
I've tried these pipeline adjustments, as well as the audit pipeline changes from #30164, in a k8s environment. Everything works, with no duplication.
This pull request is now in conflicts. Could you fix it? 🙏
mergify bot pushed a commit that referenced this pull request Feb 21, 2022
…h logs (#30440)
* Ensure we drop server logs that show up in deprecation pipeline
* Add note about deprecation dataset normalization
* Add test for mixed es server logs
  This pipeline already contained a drop to avoid duplicate logging.
* Ensure we drop server logs that show up in slowlog pipeline
  This was partially guarded against in testing due to the grok on `elasticsearch.slowlog` but probably better to explicitly drop and avoid duplicate logging.
* Add "mixed" test for elasticsearch audit logs
  test-audit-docker.log also contains a case but it was overlooked in the expected file until #30164 added the appropriate drop statements.
* Changelog entry
* Remove duplicated filebeat header
(cherry picked from commit 7b67384)
matschaffer added a commit that referenced this pull request Feb 21, 2022
… ingest of elasticsearch logs (#30487)
Co-authored-by: Mat Schaffer <mat@elastic.co>
matschaffer added a commit that referenced this pull request Feb 22, 2022
…h logs (#30440) (#30488)
* Ensure we drop server logs that show up in deprecation pipeline
* Add note about deprecation dataset normalization
* Add test for mixed es server logs
  This pipeline already contained a drop to avoid duplicate logging.
* Ensure we drop server logs that show up in slowlog pipeline
  This was partially guarded against in testing due to the grok on `elasticsearch.slowlog` but probably better to explicitly drop and avoid duplicate logging.
* Add "mixed" test for elasticsearch audit logs
  test-audit-docker.log also contains a case but it was overlooked in the expected file until #30164 added the appropriate drop statements.
* Changelog entry
* Remove duplicated filebeat header
(cherry picked from commit 7b67384)
Co-authored-by: Mat Schaffer <mat@elastic.co>
What does this PR do?
Adds a "drop" to the elasticsearch pipeline as well as explicit "mixed" test logs to ensure we won't attempt to ingest logs across mismatched pipelines.
Why is it important?
Without it, the elasticsearch slowlog pipeline will attempt to ingest logs from all 5 filesets shipped by the elasticsearch filebeat module.
Additionally, the test cases help guard against removal of the `drop` processors.
Checklist
* I have made corresponding changes to the documentation
* I have made corresponding change to the default configuration files
* I have added an entry in `CHANGELOG.next.asciidoc` or `CHANGELOG-developer.next.asciidoc`.
How to test this PR locally
See https://www.elastic.co/guide/en/beats/devguide/master/filebeat-modules-devguide.html#_test for setup instructions. Specify `TESTING_FILEBEAT_MODULES=elasticsearch` to test the elasticsearch module directly.
Related issues
Fixes #30428
Related #30164
Related #16540
Use cases
The Elasticsearch log4j2.properties config specifies that server, deprecation, slowlog and audit logs are all written to Console.
On the Kubernetes node where the elasticsearch container is running, all of those logs end up in one file:
/var/log/containers/__elasticsearch-.log
The Filebeat pod has the whole `/var/log` folder mounted and reads log files from `/var/log/containers/`.
The elasticsearch module has 5 filesets, so the same Kubernetes log is read 5 times and every message is shipped to each pipeline.
This works around the issue of duplicate log storage by dropping at the top of the pipeline.
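The effect described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the module's actual ingest pipelines: the `kind` field, the sample messages, and the `run_pipeline` and `ingest` helpers are hypothetical, and the fifth fileset name (`gc`) does not appear in this PR text. Real pipelines decide what to drop from the message content, and a mismatched line may fail parsing rather than be stored. The model just shows why a drop at the top of each pipeline keeps one copy of each message when every fileset reads the same file.

```python
# Illustrative model only: names and fields here are hypothetical,
# not taken from the Filebeat elasticsearch module.
FILESETS = ("server", "deprecation", "slowlog", "audit", "gc")


def run_pipeline(fileset, events, drop_mismatched):
    """Return the events that a fileset's pipeline would keep."""
    kept = []
    for event in events:
        # Model of a drop at the top of the pipeline: discard any
        # event that belongs to a different fileset.
        if drop_mismatched and event["kind"] != fileset:
            continue
        kept.append({**event, "pipeline": fileset})
    return kept


def ingest(events, drop_mismatched):
    """Every fileset reads the same container log file."""
    indexed = []
    for fileset in FILESETS:
        indexed.extend(run_pipeline(fileset, events, drop_mismatched))
    return indexed


# A "mixed" log: one file containing several kinds of messages.
mixed_log = [
    {"kind": "server", "message": "node started"},
    {"kind": "deprecation", "message": "setting is deprecated"},
    {"kind": "slowlog", "message": "slow query"},
    {"kind": "audit", "message": "access granted"},
]

without_drop = ingest(mixed_log, drop_mismatched=False)
with_drop = ingest(mixed_log, drop_mismatched=True)

print(len(without_drop))  # 4 messages x 5 filesets = 20
print(len(with_drop))     # each message kept once = 4
```

The final check mirrors the idea behind the "mixed" test logs: feed each pipeline a file containing several kinds of messages and confirm that it keeps only its own.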