Skip to content

Check if file has disappeared sooner in Log reader#13907

Merged
ycombinator merged 11 commits intoelastic:masterfrom
ycombinator:close_removed+close_renamed_before_eof
Oct 7, 2019
Merged

Check if file has disappeared sooner in Log reader#13907
ycombinator merged 11 commits intoelastic:masterfrom
ycombinator:close_removed+close_renamed_before_eof

Conversation

@ycombinator
Copy link
Copy Markdown
Contributor

@ycombinator ycombinator commented Oct 3, 2019

Resolves #13889.

This PR teach the Log reader to check file stat related changes, e.g. a file being moved or renamed, more aggressively than before. As a result a slow or flaky output can no longer hold up these checks from being performed.

@ycombinator ycombinator requested a review from urso October 3, 2019 20:29
@ycombinator ycombinator added bug Filebeat Filebeat review v6.8.4 v7.4.1 v7.5.0 v8.0.0 needs_backport PR is waiting to be backported to other branches. labels Oct 3, 2019
@urso urso self-assigned this Oct 4, 2019
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately github doesn't allow me to comment in code.
With this change every single read is preceded by an extra syscall, to check the file state. Maybe we can reduce the check to be only run if CloseRemoved or CloseRenamed is set.

The CloseInactive check MUST NOT be executed here. This function is run before the next line is read. If the harvester was blocked by the outputs for some amount of time, we might trigger an ErrInactive here, although the file was still written to. The CloseInactive check is supposed to trigger only after we did reach EOF.

The truncate case is somewhat tricky. I think checking truncate after EOF should be ok, as you should get a return of 0bytes read + EOF if you read on a truncated file. The check (and any synchronous check alike) is always subject to false-negatives, if the output was blocked for such a long time that the file was allowed to grow past offset.

By moving truncate and CloseInactive checks back to errorCheck, we don't need to execute the stat syscall for everyone line we want to read if CloseRemoved or CloseRenamed are not set.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding this detail to the godoc: errorChecks determines the cause for EOF errors, and how the EOF event should be handled based on the config options.

Copy link
Copy Markdown

@urso urso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments on checkFileErrors

@urso
Copy link
Copy Markdown

urso commented Oct 4, 2019

Potentially related flaky tests:
#7690
#10606
#12037

@ycombinator
Copy link
Copy Markdown
Contributor Author

@urso Thanks for the review. I've addressed your feedback and this PR is ready for another round of review, when you get a chance.

@ycombinator ycombinator changed the title Check file stat related changes sooner in Log reader Check if file has disappeared sooner in Log reader Oct 4, 2019
@ycombinator
Copy link
Copy Markdown
Contributor Author

Failing CI job is unrelated to this PR. Merging.

@ycombinator ycombinator merged commit a37e8be into elastic:master Oct 7, 2019
@ycombinator ycombinator removed the needs_backport PR is waiting to be backported to other branches. label Oct 7, 2019
ycombinator added a commit that referenced this pull request Oct 8, 2019
* Adding comment about EOF and removing redundant logic

* Adding debug logging when close_remove and close_renamed situations are reached

* Check file stats related errors before adding to buffer

* Reordering checks to be same as before

* Add CHANGELOG entry

* Fixing issue # in CHANGELOG entry

* Move CloseInactive and truncate checks back to being after EOF check

* Don't perform stat call for CloseRemoved/CloseRenamed unless those settings are enabled

* Better comments / function godoc

* Renaming method to be more descriptive

* Fixing up CHANGELOG
ycombinator added a commit that referenced this pull request Oct 11, 2019
…3960)

* Check if file has disappeared sooner in Log reader (#13907)

* Adding comment about EOF and removing redundant logic

* Adding debug logging when close_remove and close_renamed situations are reached

* Check file stats related errors before adding to buffer

* Reordering checks to be same as before

* Add CHANGELOG entry

* Fixing issue # in CHANGELOG entry

* Move CloseInactive and truncate checks back to being after EOF check

* Don't perform stat call for CloseRemoved/CloseRenamed unless those settings are enabled

* Better comments / function godoc

* Renaming method to be more descriptive

* Fixing up CHANGELOG

* Fix CHANGELOG

* Fixed compile error

* [WIP] Trying to check if file is removed

* Adding Removed() to Source interface

* Initialize procGetFileInformationByHandleEx

* Adding missed import
@ycombinator ycombinator deleted the close_removed+close_renamed_before_eof branch December 25, 2019 11:08
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…lastic#13959)

* Adding comment about EOF and removing redundant logic

* Adding debug logging when close_remove and close_renamed situations are reached

* Check file stats related errors before adding to buffer

* Reordering checks to be same as before

* Add CHANGELOG entry

* Fixing issue # in CHANGELOG entry

* Move CloseInactive and truncate checks back to being after EOF check

* Don't perform stat call for CloseRemoved/CloseRenamed unless those settings are enabled

* Better comments / function godoc

* Renaming method to be more descriptive

* Fixing up CHANGELOG
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

close_removed and close_renamed only checked after EOF

2 participants