[Filebeat] Prevent 'log' sending duplicate messages due to renaming#21911
Merged
kvch merged 1 commit intoelastic:masterfrom Oct 21, 2020
Merged
Conversation
Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file.
Contributor
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
1 similar comment
Contributor
|
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
|
💚 CLA has been signed |
Contributor
Contributor
|
Pinging @elastic/integrations-services (Team:Services) |
Contributor
|
jenkins run tests |
kvch
approved these changes
Oct 21, 2020
v1v
added a commit
to v1v/beats
that referenced
this pull request
Oct 21, 2020
…ter-commit * upstream/master: [Ingest Manager] Use ML_SYSTEM to detect if agent is running as a service (elastic#21884) Prevent log input from sending duplicate messages due to file renaming (elastic#21911)
v1v
added a commit
to v1v/beats
that referenced
this pull request
Oct 22, 2020
* upstream/master: (27 commits) chore: use ubuntu 18 as linux agent (elastic#22084) docs: Prepare Changelog for 7.9.3 (elastic#22073) (elastic#22075) docs: Prepare Changelog for 6.8.13 (elastic#22072) (elastic#22079) [build][packaging] Add resilience when docker build (elastic#22050) Fix the url of reviewdog (elastic#21981) revert WSS process reporting for windows (elastic#22055) Fix typo (elastic#19585) (elastic#22061) [Ingest Manager] Use ML_SYSTEM to detect if agent is running as a service (elastic#21884) Prevent log input from sending duplicate messages due to file renaming (elastic#21911) feat: package aliases for snapshots (elastic#21960) [DOC] Add firewall as possible troubleshooting issue (elastic#21743) [Filebeat] Add max_number_of_messages config parameter for S3 input (elastic#21993) [Elastic Agent] Fix missing elastic_agent event data (elastic#21994) Document auditbeat system process module config (elastic#21766) Update links (elastic#22012) dynamically find librpm (elastic#21936) Fix Istio docs (elastic#22019) [beats-tester][packaging] store packages in another location (elastic#21903) [Kubernetes] Remove redundant dockersock volume mount (elastic#22009) [Ingest Manager] Always try snapshot repo for agent upgrade (elastic#21951) ...
6 tasks
jsoriano
pushed a commit
to jsoriano/beats
that referenced
this pull request
Jan 28, 2021
elastic#21911) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32)
6 tasks
jsoriano
pushed a commit
to jsoriano/beats
that referenced
this pull request
Jan 28, 2021
elastic#21911) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32)
6 tasks
jsoriano
pushed a commit
to jsoriano/beats
that referenced
this pull request
Jan 28, 2021
elastic#21911) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32)
jsoriano
added a commit
that referenced
this pull request
Feb 1, 2021
#21911) (#23743) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32) Co-authored-by: William Deurwaarder <william.git@xs4all.nl>
jsoriano
added a commit
that referenced
this pull request
Feb 1, 2021
#21911) (#23744) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32) Co-authored-by: William Deurwaarder <william.git@xs4all.nl>
jsoriano
added a commit
that referenced
this pull request
Feb 1, 2021
#21911) (#23745) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit bb50d32) Co-authored-by: William Deurwaarder <william.git@xs4all.nl>
leweafan
pushed a commit
to leweafan/beats
that referenced
this pull request
Apr 28, 2023
elastic#21911) (elastic#23745) Input:Log: Reset TTL of registry state when a file is renamed. In some rare cases the registry state is marked for removal (TTL is set to 0) while the file is only renamed. Log detects the renaming of the file and updates the name of the file. As the file still exists it should also update the TTL of the renamed file. (cherry picked from commit 7d6c6fe) Co-authored-by: William Deurwaarder <william.git@xs4all.nl>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Input:Log: Reset TTL of registry state when a file is renamed. In some
rare cases the registry state is marked for removal (TTL is set to 0)
while the file is only renamed. Log detects the renaming of the file
and updates the name of the file. As the file still exists it should
also update the TTL of the renamed file.
filebeat.yml.txt
I tried to add label 'bug' but have no idea how to do that.
What does this PR do?
This PR prevents in some rare cases that messages are resent by filebeat:input:log. The scenarios are written in the 'why is it important'-section. It copies the TTL of the registry state of a renamed file from newState to oldState as is done with the name.
Why is it important?
In some rare cases a log-file is marked as removed (TTL is set 0) while in fact the file is renamed.
This issue has been noticed in two similar yet different scenarios:
Issue 1
In case renaming of the log-file happens after scanning but before removal-test, and the original file is not recreated yet, the registry state is marked for removal (TTL is set to 0) as the harvester is finished and the file pointed to by the source of the registry state (the original file) does not exist.
Issue 2
In this case the source of the registry state still points to the original file (updating the registry state has been postponed). In case the close-event of the harvester (per example inactive or timeout) happens after scanning but before removal-test, the registry state is marked for removal (TTL is set to 0) as the harvester is finished, and the file pointed by source of the registry state (original file) does not match the registry state itself (as the content points to the already renamed file).
In both cases will the renaming of the file be detected in the next 'run' and the source of the registry state will be updated. Unfortunately is TTL (= 0) not updated. This results in the deletion of the registry state. In the 'run' after that the renamed file will be detected again by filebeat and seen as a new file (no registry state exists) and all messages will be resent.
This can be fixed by (re)setting the TTL of the renamed file (next to the name of the source). Also from a technical point-of-view this makes sense as the file does exists as it is just found in the scan.
Checklist
CHANGELOG.next.asciidocorCHANGELOG-developer.next.asciidoc.How to test this PR locally
The easiest way to test this is adjusting the code and adding a sleep (per example 4 seconds) between scan and the removal-test (Run-method of input.go) as that increases the chance that this will happen tremendously. Let filebeat scan a directory where files are constantly created and rotated. In that way I can reproduce this in 100% of the cases. I have attached my configuration file used for this test.
Use cases
In some rare cases prevent filebeat:input:log to resend logs while files are renamed (rotated).