-
-
Notifications
You must be signed in to change notification settings - Fork 4.9k
fix(connections): add Prometheus labels for folders and devices (fixes #9519) #10074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks, @rasa for the review. I have incorporated your suggested changes into the latest rebased push. I wasn't really sure how to express the sense properly. I also updated the folder function name for symmetry and searchability. Also, thank you to @tomasz1986 for fixing the PR title to match conventions. I saw that a fixing commit should have such in the short message, but my fix spanned 2 commits, so I floundered. |
| info_gauge := prometheus.NewGaugeFunc( | ||
| prometheus.GaugeOpts{ | ||
| Namespace: "syncthing", | ||
| Name: "device_info_" + did, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally one wouldn't do this as separate metrics, but one metric with multiple label values, e.g.
syncthing_device_info{id="aaa",name="bbb"} 1
syncthing_device_info{id="ccc",name="ddd"} 1
etc, otherwise it will be very difficult to use in joins I think. Probably the easiest way to do this would be to have a custom collector that has the config in entirety and generates the metrics when requested to.
(Doing it by just setting a value to one here in the register method wouldn't account for removing outdated values when a device is renamed etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, cool, I will look into a custom collector.
Is it expected that most syncthing users will be using Grafana for observability monitoring? As I progress, I would like to base my ergonomic/usability testing on whatever is most likely to benefit actual users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you're asking really; I expect a very tiny minority of Syncthing users to use any form of observability. Of those that do, I think Grafana on either Prometheus or VictoriaMetrics etc is the absolutely most likely combination, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've given me enough on which to work. I want to recreate a "common" setup and confirm that my final code "works" ergonomically to expose the information intended.
Sorry for the delay in replying. We had a death in the family which has kept me busy throughout this week. I expect to be able to work on a custom collector early next week.
This includes the information most likely to make it possible for a human to correlate the ids to something meaningful. There were a couple of approaches possible, but adjusting `metrics.go` seems like the least invasive and likely to cause difficulty. A function on `folder` would require expanding the surface area of `prometheus` which feel worse than expanding syncthing's own `config` type.
This very much mirrors the approach in doing the folder_info gauge
- `human-readable` from `humane` - `registerFolderInfoGauge` from `registerInfoGauge`
* fix(protocol): avoid deadlock with concurrent connection start and close (syncthing#10140) * build: properly propagate build tags to Debian build (syncthing#10144) Previously all were ignored except noupgrade which was hard coded... * chore(protocol): don't start connection routines a second time (syncthing#10146) * chore(protocol): only allow enc. password changes on cluster config (syncthing#10145) In practice we already always call SetPassword and ClusterConfig together. However it's not just "sensible" to do that, it's required: If the passwords change, the remote device needs to know about that to check that the enc. setup is valid/consistent (e.g. tokens match, folder-type is appropriate, ...). And with the passwords set later, there's no point in adding them as part of creating a new connection. This is a "followup" (if one can call it that 4 years later :) ) to resp. fix for the following commit: 924b968 Co-authored-by: Jakob Borg <jakob@kastelo.net> * build: remove schedule from PR metadata job It shouldn't have touched non-PR issues, but it did * chore: add issue types to GitHub issue templates * feat(config): expose folder and device info as metrics (fixes syncthing#9519) (syncthing#10148) Tihs makes it easier to use metrics based on device and folder labels, names, and other attributes. Other metrics which are based on folder or device ID can be joined with these info metrics to enrich their label sets. ``` # HELP syncthing_config_device_info Provides additional information labels on devices # TYPE syncthing_config_device_info gauge syncthing_config_device_info{device="I6KAH76-66SLLLB-5PFXSOA-UFJCDZC-YAOMLEK-CP2GB32-BV5RQST-3PSROAU",introducer="false",name="s1",paused="false",untrusted="false"} 1 # HELP syncthing_config_folder_info Provides additional information labels on folders # TYPE syncthing_config_folder_info gauge syncthing_config_folder_info{folder="default",label="The default folder",path="s2",paused="false",type="sendreceive"} 1 ``` With this you can e.g. query for ``` syncthing_connections_active * on(device) group_left syncthing_config_device_info ``` Fixes syncthing#9519 Closes syncthing#10074 Closes syncthing#10147 * chore(gui, man, authors): update docs, translations, and contributors * build: use own script instead of svu We use a slightly different handling of features between prereleases. * build: include "v" prefix in version tags... * chore: copyright in next-version script * fix(syncthing): avoid writing panic log to nil fd (syncthing#10154) ### Purpose This change fixes a logical bug in the panic log writing where we could end up writing to a uninitialized file descriptor. On the very first iteration, `panicFd` is nil. We enter the if `panicFd == nil { … }` block, check for “panic:” or “fatal error:”, and if neither matches, we skip instantiating `panicFd` altogether. However, immediately after, still within `if panicFd == nil { … }`, we call `panicFd.WriteString("Panic at ...")`. But `panicFd` would in this case be `nil`, which will cause a run‐time panic. It's not clear to me why panicFd is only initialized if the lines start with "panic:" or "fatal error:" so I've left that logic untouched. With this change we at least avoid the risk of writing to a nil filedescriptor. ## Authorship Your name and email will be added automatically to the AUTHORS file based on the commit metadata. --------- Co-authored-by: Jakob Borg <jakob@kastelo.net> * fix(stupgrades): return latest stable & pre for each major * build: also create relaysrv and discosrv releases * docs: link to Docker image, APT, in release notes * build: more resilient pushes to releases * fix(watchaggregator): properly handle sub-second watch durations (fixes syncthing#9927) (syncthing#10179) I'll let Audrius words from the ticket explain this :) > I'm a bit lost, time.Duration is an int64, yet watcher delay is float, > anything sub 1s gets rounded down to 0, so you just end up going into an > infinite loop. syncthing#9927 (comment) * fix(protocol): slightly loosen/correct ownership comparison criteria (fixes syncthing#9879) (syncthing#10176) Only Require either matching UID & GID OR matching Names. If the 2 devices have a different Name => UID mapping, they can never be totaly equal. Therefore when syncing we try matching the Name and fall back to the UID. However when scanning for changes we currently require both the Name & UID to match. This leads to forever having out of sync files back and forth, or local additions when receive only. This patch does not change the sending behavoir. It only change what we decide is equal for exisiting files with mismapped Name => UID, The added testcases show the change: Test 1,5,6 are the same as current. Test 2,3 Are what change with this patch (from false to true). Test 4 is a subset of test 2 they is currently special cased as true, which does not chnage. Co-authored-by: Jakob Borg <jakob@kastelo.net> * build: import release workflow changes from main * build: fix detection of next rc version --------- Co-authored-by: Jakob Borg <jakob@kastelo.net> Co-authored-by: Simon Frei <freisim93@gmail.com> Co-authored-by: Syncthing Release Automation <release@syncthing.net> Co-authored-by: ardevd <ardevd@users.noreply.github.com> Co-authored-by: yparitcher <y@paritcher.com>
Purpose
This is intended to address concerns raised in #9519. The intent is to add an info gauge (as suggested by @calmh) to ease human cross-referencing of stats as stored by IDs and meaningful strings.
Testing
I have run the automated tests. I did not see where these tests exercise the metrics reporting in particular.
As such, I have not added any new tests. An interested party could use their favorite metrics explorer to find the new gauges and correlate them with other labeled metrics.
Documentation
This change will be visible to users in their metrics databases, but not within any interface of syncthing to my knowledge. I am happy to document the new metric gauges if there is a reasonable place to do so.
Additional Info
This is my first attempt to contribute to
syncthing. Process errors or omissions are somewhat expected. I am happy to iterate on this work and make such adjustments as help the project.