Add follow_inodes flag to tail files by inodes, not by names#1660
Add follow_inodes flag to tail files by inodes, not by names#1660OleksiiDuzhyi wants to merge 6 commits into
Conversation
|
Thanks for the patch! |
| } | ||
| excluded = @exclude_path.map { |path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path }.flatten.uniq | ||
| paths - excluded | ||
| excluded = @exclude_path.map {|path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path}.flatten.uniq |
| excluded = @exclude_path.map {|path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path}.flatten.uniq | ||
| to_return = paths - excluded | ||
| # filter out non existing files, so in case pattern is without '*' we don't do unnecessary work | ||
| to_return = to_return.select {|path| |
There was a problem hiding this comment.
(paths - excluded).to_return.select {|path| seems enough
| to_return = paths - excluded | ||
| # filter out non existing files, so in case pattern is without '*' we don't do unnecessary work | ||
| to_return = to_return.select {|path| | ||
| Pathname.new(path).exist? |
There was a problem hiding this comment.
FileTest#exist? is more lightweight because no object instantiation.
There was a problem hiding this comment.
Cool, thx, didn't know about that option
| to_return = to_return.select {|path| | ||
| Pathname.new(path).exist? | ||
| } | ||
| Hash[to_return.map {|path| |
There was a problem hiding this comment.
Long lines in Hash[] seems hard to read.
This map can be moved to above line. {}.select {}.map {}.
| end | ||
|
|
||
| def existence_path | ||
| Hash[@tails.keys.map {|path_ino| |
There was a problem hiding this comment.
With lots of files, e.g. 1000+ files, this map creates lots of object.
So avoiding map is better for the performance.
Of course, expand_paths's map is also same.
| end | ||
| end | ||
|
|
||
| class PathInodeTuple |
There was a problem hiding this comment.
I didn't check eql? and other method behaviour of struct object but can't use struct instead of class for this case?
There was a problem hiding this comment.
I've checked eql? of struct - it work as expected, will change to Struct
|
BTW, all most tests are failed. |
|
I can't figure why tests are failing, maybe you could give me a hand in this? In my test I do the following: cleanup_file code: But for some reason I can't open a file for writing after I've deleted it. Even though I was able to write this file few lines above. I got this exception:
I have no clue why I can't reopen file when I've deleted it. Should I remove file in a different way? |
|
Will this be merged anytime soon? |
|
@esselius in_tail plugin is widely used so we want to test this patch on several environment before merging. |
|
We decided to move away from fluentd for log collection, also relying on inodes - is not 100% reliable, inodes could be reused which could cause issues |
What the tool do you use? |
|
@repeatedly As input to Flume we wrote a program in C which does what I tried to implement in this PR - tracks inodes, keep state of inode, etc. |
|
@OleksiiDuzhyi Thanks! So inode issue is still remained but flume resolves CPU concern. |
|
@repeatedly |
|
@repeatedly Hello, I have this problem too, is there any plan to merge this patch into master? |
|
@repeatedly , is this patch still in consideration or does fluentd already has any such feature to address this issue ? |
Current patch is incomplete because using only |
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.
Based on closed PR #1660 which is posted by OleksiiDuzhyi because of incomplete test cases. Co-authored-by: alexeyd <alexeyd@wix.com> Signed-off-by: Hiroshi Hatake <hatake@clear-code.com>
Based on closed PR #1660 which is posted by OleksiiDuzhyi because of incomplete test cases. Co-authored-by: alexeyd <alexeyd@wix.com> Signed-off-by: Hiroshi Hatake <hatake@clear-code.com>
Use case
We have a lot of servers with up to 300 application deployed on each server (mostly Java/Scala). Each application writes log files using log4j library. Each log file is rotated daily in the same directory, with up to 10 back up files possible. Events are fed to realtime processing system(critical to time delays up to 1 second) and to HBase. We want to replace current heavy log collector with lightweight solution.
Problem
We tried to use default tail plugin. Most basic configuration will look like this:
According to fluentd in_tail plugin documentation - every time file is rotated for such configuration, it will be treated by in_tail plugin as new one, and will be read from it's head, even though file inode has not changed, and in position file we have latest read position (but just for last line before rotation; during
rotate_waittime position stored in memory only - https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_tail.rb#L554)This doesn't suit our needs, because:
One of the possible solution proposed by @repeatedly in #1613 is to avoid '*' in file pattern discovery. This also doesn't suit us, because we'd like to reliably store events with no or minimum loss because they are very crucial for business analysis. Possible data loss scenario:
Proposed solution
Add ability to track inodes inside fluentd in_tail plugin to enable combination of '*' in
pathconfiguration with rotation inside same directory andread_from_headset to true. In proposed solution we addfollow_inodesflag (set tofalseby default to preserve current behavior). Whenfollow_inodesis set totrueand file rotation is detected - we do not reread file, instead we are checking if such inode is present inside pos file - if yes, we don't recreate TailWatcher. TailWatcher is closed afterrotate_waitperiod and file marked as unwatched inside position file afterlimit_recently_modifiedtime passed. Pos file is cleaned on startup as it was before, but also delete entries for non-existing files.Please review, and let me know what do you think of such approach?