feat: export rcmgr metrics to prometheus #8785
vyzo
left a comment
looks fine to me, but I don't know how this prometheus business works in ipfs.
5a98324 to d23a188
f9ff9d1 to 18b9d31
@marten-seemann : just checking my understanding, but it looks like one sharness test is not passing: https://app.circleci.com/pipelines/github/ipfs/go-ipfs/6284/workflows/1dc038d2-dd08-442e-93e5-30512e10193d/jobs/68339 I assume:
Yes, that test checks all exported metrics against a list, and fails if any is missing / not expected. I was planning to fix this later. Note: once libp2p gets a coherent metrics story (see libp2p/go-libp2p#1356), this test should probably be modified to exclude libp2p metrics. I'll leave that decision to the IPFS stewards though.
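For readers unfamiliar with this kind of check, here is a minimal Go sketch of the comparison described above: take the list of expected metric names and the names actually exported, then report anything missing or unexpected. This is not the actual sharness test (which is a shell script in the go-ipfs repo); `diffMetrics` and the metric names in `main` are hypothetical and only illustrate the idea.

```go
package main

import (
	"fmt"
	"sort"
)

// diffMetrics is a hypothetical helper: it compares the metric names a node
// exports (actual) against an expected list and returns the names that are
// missing from the export and the names that were not expected.
func diffMetrics(expected, actual []string) (missing, unexpected []string) {
	exp := make(map[string]bool, len(expected))
	for _, name := range expected {
		exp[name] = true
	}

	act := make(map[string]bool, len(actual))
	for _, name := range actual {
		if act[name] {
			continue // report each exported name only once
		}
		act[name] = true
		if !exp[name] {
			unexpected = append(unexpected, name)
		}
	}

	for name := range exp {
		if !act[name] {
			missing = append(missing, name)
		}
	}

	// Sort for stable, readable output.
	sort.Strings(missing)
	sort.Strings(unexpected)
	return missing, unexpected
}

func main() {
	// Placeholder metric names, not taken from the real expected list.
	expected := []string{"example_metric_a", "example_metric_b"}
	actual := []string{"example_metric_b", "libp2p_rcmgr_example"}

	missing, unexpected := diffMetrics(expected, actual)
	fmt.Println("missing:", missing)
	fmt.Println("unexpected:", unexpected)
	// A check like the one described would fail if either slice is non-empty.
}
```

Under this model, a newly exported `libp2p_rcmgr_*` metric shows up as unexpected until the expected list is updated, which is consistent with the failure discussed here.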
33fbdbf to 26b8c42
@marten-seemann : please comment/ping when the test is updated/passing. Also, I assume we need to update so that this PR only shows the incremental diff on top of #8680? I'll make sure we then get reviewer eyes to land this.
26b8c42 to 859e648
* update go-libp2p to v0.18.0
* initialize the resource manager
* add resource manager stats/limit commands
* load limit file when building resource manager
* log absent limit file
* write rcmgr to file when IPFS_DEBUG_RCMGR is set
* fix: mark swarm limit|stats as experimental
* feat(cfg): opt-in Swarm.ResourceMgr
This ensures we can safely test the resource manager without impacting
default behavior.
- Resource manager is disabled by default
- Default for Swarm.ResourceMgr.Enabled is false for now
- Swarm.ResourceMgr.Limits allows user to tweak limits per specific
scope in a way that is persisted across restarts
- 'ipfs swarm limit system' outputs human-readable json
- 'ipfs swarm limit system new-limits.json' sets new runtime limits
(but does not change Swarm.ResourceMgr.Limits in the config)
Conventions to make libp2p devs life easier:
- 'IPFS_RCMGR=1 ipfs daemon' overrides the config and enables resource manager
- 'limit.json' overrides implicit defaults from libp2p (if present)
* docs(config): small tweaks
* fix: skip libp2p.ResourceManager if disabled
This ensures 'ipfs swarm limit|stats' work only when enabled.
* fix: use NullResourceManager when disabled
This reverts commit b19f7c9.
after clarification feedback from
#8680 (comment)
* style: rename IPFS_RCMGR to LIBP2P_RCMGR
preexisting libp2p toggles use LIBP2P_ prefix
* test: Swarm.ResourceMgr
* fix: location of opt-in limit.json and rcmgr.json.gz
Places these files inside of IPFS_PATH
* Update docs/config.md
* feat: expose rcmgr metrics when enabled (#8785)
* add metrics for the resource manager
* export protocol and service name in Prometheus metrics
* fix: expose rcmgr metrics only when enabled
Co-authored-by: Marcin Rataj <lidel@lidel.org>
* refactor: rcmgr_metrics.go
* refactor: rcmgr_defaults.go
This file defines implicit limit defaults used when Swarm.ResourceMgr.Enabled
is true. We keep a vendored copy to ensure go-ipfs is not impacted when
go-libp2p decides to change defaults in any of the future releases.
* refactor: adjustedDefaultLimits
Cleans up the way we initialize defaults and adds a fix for case
when connection manager runs with high limits.
It also hides `Swarm.ResourceMgr.Limits` until we have a better
understanding of what syntax makes sense.
* chore: cleanup after a review
* fix: restore go-ipld-prime v0.14.2
* fix: restore go-ds-flatfs v0.5.1
Co-authored-by: Lucas Molas <schomatis@gmail.com>
Co-authored-by: Marcin Rataj <lidel@lidel.org>
Part of #8761
Adds basic metrics under libp2p_rcmgr_*
Demo sample