Skip to content

Fix race condition leading to duplicate tailing#181

Merged
snorecone merged 6 commits intopapertrail:masterfrom
Originate:mb-fix-glob-race-condition
Oct 31, 2016
Merged

Fix race condition leading to duplicate tailing#181
snorecone merged 6 commits intopapertrail:masterfrom
Originate:mb-fix-glob-race-condition

Conversation

@Bowbaq
Copy link
Contributor

@Bowbaq Bowbaq commented Oct 21, 2016

Expected

If a file is matched by two (or more) globs, remote_syslog2 matches the first glob in the list, and ignores subsequent matches

Actual

Sometimes, due to a race condition, a log file ends up being tailed twice. I encountered this today, we're using 0.18. Here's a trace. Notice that the some files are forwarded twice (line 7 and 29 for example).

Fix

Adding the file to the registry synchronously seems to fix the problem. I've added a test (not necessarily a fan of hijacking the logs but ...). Running that test on the current master fails most of the time.

@snorecone
Copy link
Contributor

Hi @Bowbaq, great catch.

About the test that is hijacking logs, would you consider instead making the *WorkerRegistry an interface instead (that responds to Add, Exists, etc., and using a testRegistry{} to count how many times a files is added? It seems like it would be less brittle that way.

@Bowbaq
Copy link
Contributor Author

Bowbaq commented Oct 31, 2016

@snorecone converted the tests. The build was still failing due to a race condition, but it's a different one. This one is because this read is not protected and the value can be written to concurrently. The last commit should fix

@snorecone
Copy link
Contributor

@Bowbaq thanks, this is great.

Sorry about the race in the test, I'll probably want to revisit the shutdown code later so that mutexes aren't used for every write.

@snorecone snorecone merged commit f4963e9 into papertrail:master Oct 31, 2016
@Bowbaq
Copy link
Contributor Author

Bowbaq commented Oct 31, 2016

re: mutex I haven't benched it but my guess is, in the absence of contention, acquiring a read lock should be very cheap

@snorecone
Copy link
Contributor

Yeah, I agree. No need to dig into it unless there's a benchmark.

@Bowbaq
Copy link
Contributor Author

Bowbaq commented Oct 31, 2016

Do you know when the next release will be? We're being billed up double for our logs until we deploy this

@snorecone
Copy link
Contributor

@Bowbaq
Copy link
Contributor Author

Bowbaq commented Nov 1, 2016

Sweet, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants