fix(inputs.mongodb): add an option to bypass connection errors on start#11629
fix(inputs.mongodb): add an option to bypass connection errors on start#11629sspaink merged 27 commits intoinfluxdata:masterfrom papey:fix/mongodb-unreachable-hosts
Conversation
Hipska
left a comment
There was a problem hiding this comment.
My comments about moving the connect to mongo from the Init to the Start method still counts.
Oh yes, sorry. Will fix. Edit: fixed |
|
Here is a refactor, I did not find an example of the |
|
This is the interface: input.go And these are some sample implementations: telegraf/plugins/inputs/sql/sql.go Lines 354 to 392 in b07e94b telegraf/plugins/inputs/mqtt_consumer/mqtt_consumer.go Lines 151 to 166 in b07e94b |
Sounds good to me but how do we want that ? In the current implementation on master if one of the Mongo servers becomes unreachable after init/start errors will be written in the logs. IMO, it's ok to attach the behavior you describe to the new option, WDYT ? |
|
Yeah, so I was thinking, if this is enabled, then first ping the device and then only try to gather all the metrics if it does respond ok? Does that sounds reasonable? |
|
LGTM, i will implement it asap ! |
|
There is just one behavior left we may want to discuss : When Sorry for the the review requests mess, my internet was lagging and Github just goes crazy for no reason |
Hipska
left a comment
There was a problem hiding this comment.
Do you mean this is the current default behaviour? I think that's good, they can change to retry when they don't want that. But when thinking about that, "retry" might give wrong impression. I would expect telegraf to retry in the same poll interval. Maybe the value for this option should be "skip"?
Also, I don't get the new logic in Gather when behaviour is "retry". So seems like it returns and skips collection for the server if there is no error? That sounds a bit strange.
Sounds good to me to work on that in a future PR. |
|
@Hipska I changed the wording from "retry" to "skip", I also find it more clear. Thanks. |
Sorry I didn't get your point, if behavior is skip, we ping, if ping fails, we drop fetch from the current server : if m.DisconnectedServersBehavior == "skip" {
if err := srv.ping(); err != nil {
return
}
} |
Nevermind, I was indeed looking wrong. Maybe we could add a debug log message with the content of the error before we return? |
|
Download PR build artifacts for linux_amd64.tar.gz, darwin_amd64.tar.gz, and windows_amd64.zip. 🥳 This pull request decreases the Telegraf binary size by -1.04 % for linux amd64 (new size: 150.2 MB, nightly size 151.7 MB) 📦 Click here to get additional PR build artifactsArtifact URLs |
The start method handler did not match the interface, nor was there a stop function. As a result, start was never called and the plugin was never setting up the servers to connect to and collect from correctly. This was introduced in influxdata#11629. fixes: influxdata#11830
Required for all PRs
resolves #10078
This PR replaces #10086.
Summary of changes :
ignore_unreachable_hoststo not return an error on init if a ping fails