vendor: github.com/fluent/fluent-logger-golang v1.9.0#43100
vendor: github.com/fluent/fluent-logger-golang v1.9.0#43100cpuguy83 merged 2 commits intomoby:masterfrom conorevans:conorevans/update-fluent
Conversation
thaJeztah
left a comment
There was a problem hiding this comment.
Thanks for contributing! I left some comments inline. I was also "coding along" while I was writing my comments; let me open a pull request against your fork with my suggestions
|
|
||
| asyncReconnectInterval := 0 | ||
| if cfg[asyncReconnectIntervalKey] != "" { | ||
| ari64, err := strconv.ParseUint(cfg[asyncReconnectIntervalKey], 10, strconv.IntSize) |
There was a problem hiding this comment.
Perhaps we should consider to parse the value as a time.Duration (time.ParseDuration()) to make it easier (and less error-prone) for the user to set the right interval. Having to specify the value in ms works "ok(ish)" if you want to, say, specify a second (1000 ms), but can become a bit awkward after that. It's also error-prone, for example, if a user wants to specify 1 second, but specifies 1 and ends up with 1 ms.
And looking at the code, I see we already use duration strings for fluentd-retry-wait, so it's a bit more consistent.
So I would suggest to;
- use
time.ParseDuration() - validate the no negative values are specified
- perhaps also a minimum / maximum (not sure what reasonable values are? between 100ms and 10s ? (open to suggestions)
Would be good to have a minimal unit test (I see there's currently no tests in this package, which is a bit unfortunate, but let's not continue that "tradition" 😅)
There was a problem hiding this comment.
Thanks for the suggestion -- that sounds good to me. I'll have a look at your suggested changes on my fork and merge them in.
Would be good to have a minimal unit test (I see there's currently no tests in this package, which is a bit unfortunate, but let's not continue that "tradition" 😅)
Yeah, I saw that and I guess lazily thought "ah, it's Christmas, time of tradition" 😉 -- I'll add some tests for this MR and probably cover the other test-cases in parseConfig while I'm at it 👍
There was a problem hiding this comment.
MR and probably cover the other test-cases in parseConfig while I'm at it
Thanks! Probably good to do that in a follow-up / separate PR to keep this PR "on topic"
| } | ||
|
|
||
| if f.AsyncReconnectInterval > 0 { | ||
| if time.Since(f.latestReconnectTime) > time.Duration(f.AsyncReconnectInterval)*time.Millisecond { |
There was a problem hiding this comment.
Didn't look too closely if it could be a problem, but wondering if the usage of f.latestReconnectTime also needs a lock here
(This is in vendored code, so if it needs fixing, needs to be done in the upstream project)
There was a problem hiding this comment.
I don't think it needs one, at least in the current code. The only place that calls connect (which assigns latestReconnectTime is this if loop (if AsyncReconnectInterval is enabled, otherwise it's the writeWithRetry below) -- concurrent calls to connect are not made). If you want to take a second look, I'm all ears. I'm definitely a less experienced Gopher than you, so perhaps I am missing something!
There was a problem hiding this comment.
concurrent calls to connect are not made
Ah, right, yes, in that case it won't be needed. I saw a mention of the lock in one of the functions, and wondered if things could be called concurrently.
I'm definitely a less experienced Gopher than you, so perhaps I am missing something!
I still make mistakes and miss things as anyone! 😂 . What I did learn over the years is that it usually doesn't hurt to leave a comment (even if wrong). I've been bitten by cases where I wasn't sure and "assumed" other participants on the PR would've noticed if it was wrong (because they are more experienced than I, so for sure they would see a mistake??) and it turned out there was an issue after all, but everyone missed it.
| return | ||
| } | ||
|
|
||
| if f.AsyncReconnectInterval > 0 { |
There was a problem hiding this comment.
Per my comment above;
So perhaps we should also have acceptable minimal and maximal intervals?
I'm actually wondering if the fluentd code itself should also define a minimum to prevent a close retry loop. Perhaps worth raising an issue for that and/or a PR to add at require a minimum value? 🤔
There was a problem hiding this comment.
I'm actually wondering if the fluentd code itself should also define a minimum to prevent a close retry loop. Perhaps worth raising an issue for that and/or a PR to add at require a minimum value?
I considered that when I submitted the PR there but thought I'd leave it to fluent maintainers to suggest values if they believed it necessary. I do also think that it can be left to downstream users to decide for themselves what those safe values are. I totally see it being appropriate in this project. I think 100ms/10s are fair min/maxes. We can go with that to start with and a second reviewer (I believe those are necessary for most/external changes) can perhaps suggest an alternative if they feel differently.
There was a problem hiding this comment.
I think 100ms/10s are fair min/maxes. We can go with that to start with
Agreed; we can always relax the range if users complain (and if there's valid reasons to have shorter or longer intervals)
and a second reviewer (I believe those are necessary for most/external changes) can perhaps suggest an alternative if they feel differently.
Yes, general rule is to have a minimum of two reviewers (irregardless of who opened the PR) to reduce the risk of things being missed. In some rare exceptions we may merge with one review (trusting the maintainer that merged to have outweighed the risks).
|
I opened https://github.com/conorevans/moby/pull/1 against your branch; you can merge those changes if they look good to you (please squash the commits after that, as I don't think they need to be in separate commits 😅) |
|
CI Failure is unrelated, but looks like this new test is flaky; (merged in #43043) |
Updates the fluent logger library to v1.9.0. The update includes the following commit: * [Add periodic reconnection functionality](fluent/fluent-logger-golang@1c05506) See https://github.com/fluent/fluent-logger-golang/compare/v1.8.0..v1.9.0 Signed-off-by: Conor Evans <coevans@tcd.ie>
| return config, errors.Errorf("invalid value for %s: value (%q) must be between %s and %s", | ||
| asyncReconnectIntervalKey, interval, minReconnectInterval, maxReconnectInterval) | ||
| } | ||
| asyncReconnectInterval = int(interval) |
There was a problem hiding this comment.
this randomly popped into my head @thaJeztah -- I don't think this would function as expected, since the interval is a time.Duration which will be in nanoseconds. The Fluent library expects an interval in milliseconds:
### AsyncReconnectInterval
When async is enabled, this option defines the interval (ms) at which the connection
to the fluentd-address is re-established. This option is useful if the address
may resolve to one or more IP addresses, e.g. a Consul service address.
int(interval.Milliseconds()) would fix this -- https://go.dev/play/p/TJfb0zQ7l3E
There was a problem hiding this comment.
Ohhh! Great catch. For some reason I was thinking time.Interval was in Milliseconds already 🤦 😊😬
Let me know if you want to update it, or want me to push 👍
There was a problem hiding this comment.
I figured out the rebasing 🙂 -- updated.
There was a problem hiding this comment.
At least.. it looks OK 😅 -- it doesn't have both of us as committers for the second commit, but includes a sign-off for for both of us. Was there an option I missed that would have included you as an author of the commit as well as in the message itself?
There was a problem hiding this comment.
Read up a bit more, I remembered the co-authored-by field, so I fixed that! Should be good to go again 🙂
* When async is enabled, this option defines the interval (ms) at which the connection to the fluentd-address is re-established. This option is useful if the address may resolve to one or more IP addresses, e.g. a Consul service address. While the change in #42979 resolves the issue where a Docker container can be stuck if the fluentd-address is unavailable, this functionality adds an additional benefit in that a new and healthy fluentd-address can be resolved, allowing logs to flow once again. This adds a `fluentd-async-reconnect-interval` log-opt for the fluentd logging driver. Signed-off-by: Sebastiaan van Stijn <github@gone.nl> Signed-off-by: Conor Evans <coevans@tcd.ie> Co-authored-by: Sebastiaan van Stijn <github@gone.nl> Co-authored-by: Conor Evans <coevans@tcd.ie>
|
@cpuguy83 ptal |
|
Hey @thaJeztah just wondering, is there an expected timeline for 21.xx release? |
- What I did
Updated fluent-logger-golang dependency to v1.9.0 and added support in the logger for the new
AsyncReconnectIntervalparameter.- How to verify it
Run a Docker container with the fluentd logging driver here and set the
fluentd-addresslog-opt to an address with a host that could resolve to multiple IPs, e.g.fluentd.service.consul:24224. Set thefluentd-async-reconnect-intervalto a value greater than 0 (value in ms). Observe that the container will periodically reconnect to thefluentd-address. You can force a reconnection by removing one of the fluent hosts from the service pool (or equivalent).- Description for the changelog
Support AsyncReconnectInterval parameter in fluentd logging driver. This option is useful if the address
may resolve to one or more IP addresses, e.g. a Consul service address.
- A picture of a cute animal (not mandatory but encouraged)