Skip to content

Conversation

@hanabi1224
Copy link
Contributor

@hanabi1224 hanabi1224 commented Sep 15, 2025

Summary of changes

Changes introduced in this pull request:

  • fix an issue that some metrics get duplicated (re-registered) after GC

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • Bug Fixes
    • Metrics are now cleanly reinitialized when the daemon starts, improving the reliability and consistency of Prometheus metrics after restarts. This standardizes metrics setup across services at launch and helps ensure accurate reporting from the outset. No configuration changes or user action are required; behavior is automatic on start-up.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 15, 2025

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The start_services function in src/daemon/mod.rs now begins by resetting the default Prometheus metrics registry to Default::default() via crate::metrics::default_registry() before initializing services. No public APIs or other logic were changed.

Changes

Cohort / File(s) Summary of Changes
Daemon startup and metrics reset
src/daemon/mod.rs
Added a pre-initialization step in start_services to reset the default Prometheus metrics registry (assign Default::default() to crate::metrics::default_registry()) before service initialization.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Daemon as Daemon::start_services
    participant Metrics as Metrics Registry
    participant Services as Service Initializer

    Caller->>Daemon: start_services()
    Daemon->>Metrics: reset to Default::default()
    note right of Metrics: Clear previous metrics state
    Daemon->>Services: initialize services
    Services-->>Daemon: services started
    Daemon-->>Caller: return
Loading

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested reviewers

  • LesnyRumcajs
  • sudo-shashank

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "fix: some metrics get duplicated after GC" succinctly and accurately describes the primary issue this PR addresses—the duplication of metrics after garbage collection—and aligns with the change that resets the metrics registry to prevent re-registration. It is concise, uses a conventional "fix:" prefix, and maps directly to the PR objectives. A reviewer scanning history can understand the main intent without inspecting the full diff.

Comment @coderabbitai help to get the list of available commands and usage tips.

@hanabi1224 hanabi1224 marked this pull request as ready for review September 15, 2025 14:39
@hanabi1224 hanabi1224 requested a review from a team as a code owner September 15, 2025 14:39
@hanabi1224 hanabi1224 requested review from LesnyRumcajs and akaladarshi and removed request for a team September 15, 2025 14:39
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/daemon/mod.rs (1)

536-538: Minor: prefer mem::take for clearer intent.

Slightly more idiomatic and avoids constructing a temporary explicitly.

-    // Cleanup the default prometheus metrics registry
-    *crate::metrics::default_registry() = Default::default();
+    // Cleanup the default prometheus metrics registry
+    let _ = std::mem::take(&mut *crate::metrics::default_registry());
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b61c18 and b7814c0.

📒 Files selected for processing (1)
  • src/daemon/mod.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: hanabi1224
PR: ChainSafe/forest#5930
File: build.rs:64-77
Timestamp: 2025-08-13T09:43:20.301Z
Learning: hanabi1224 prefers hard compile-time errors in build scripts rather than runtime safeguards or collision detection, believing it's better to fail fast and fix root causes of issues like malformed snapshot names.
🧬 Code graph analysis (1)
src/daemon/mod.rs (1)
src/metrics/mod.rs (1)
  • default_registry (26-28)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build forest binaries on Linux AMD64
  • GitHub Check: tests
  • GitHub Check: tests-release
🔇 Additional comments (1)
src/daemon/mod.rs (1)

536-538: Good fix: reset registry before (re)registering metrics.

This prevents duplicate collector registration across service restarts triggered by GC. Placement is early enough to be contention-free and happens before any services spawn.

Comment on lines 536 to 538
// Cleanup the default prometheus metrics registry
*crate::metrics::default_registry() = Default::default();
let mut services = JoinSet::new();
Copy link
Contributor

@coderabbitai coderabbitai bot Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify no early/static registrations are lost.

If any metrics are registered in global/lazy statics before start_services runs, they’ll be dropped by this reset and never re-registered. Please confirm none exist (or re-register them after this reset).


🏁 Script executed:

#!/bin/bash
# Find all places that touch the default registry and potential static/lazy initializers.
rg -nP --type rust -C3 'default_registry\s*\('
rg -nP --type rust -n 'lazy_static!|OnceCell|OnceLock|static\s+ref'

Length of output: 10697


Do not reset the global/default Prometheus registry — it will drop statically-registered metrics.

Search found many modules that register into crate::metrics::default_registry() (examples: src/metrics/mod.rs, src/chain_sync/metrics.rs, src/libp2p/metrics.rs, src/libp2p/service.rs, src/libp2p/behaviour.rs, src/message_pool/msgpool/metrics.rs, src/utils/cache/lru.rs, plus a collector registered earlier in src/daemon/mod.rs). The reset at src/daemon/mod.rs:536–537 will remove those registrations. Remove or move the reset, or re-register/access those statics after the reset (e.g., call their init/touch functions) so metrics are not lost.

🤖 Prompt for AI Agents
In src/daemon/mod.rs around lines 536–538, the code resets the global Prometheus
registry which will drop statically-registered metrics; remove the global reset
(*crate::metrics::default_registry() = Default::default();) or move it only to a
place where you will immediately re-register all static collectors. Fix by
either deleting those two lines so existing metrics remain registered, or if a
reset is required, call the init/touch functions for every module that registers
metrics (e.g., metrics::init or the specific init functions in
src/metrics/mod.rs, src/chain_sync/metrics.rs, src/libp2p/*,
src/message_pool/msgpool/metrics.rs, src/utils/cache/lru.rs and the earlier
collector registration) immediately after the reset to restore static
registrations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hanabi1224 is this comment valid ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Fixed by using a dedicated metrics registry for collectors

LesnyRumcajs
LesnyRumcajs previously approved these changes Sep 15, 2025
@hanabi1224 hanabi1224 disabled auto-merge September 16, 2025 06:14
@hanabi1224 hanabi1224 marked this pull request as draft September 16, 2025 06:43
@hanabi1224 hanabi1224 marked this pull request as ready for review September 16, 2025 11:38
LesnyRumcajs
LesnyRumcajs previously approved these changes Sep 16, 2025
@LesnyRumcajs LesnyRumcajs dismissed their stale review September 16, 2025 11:59

no green checkmark!

@hanabi1224
Copy link
Contributor Author

@LesnyRumcajs CI now passed with retry

@LesnyRumcajs LesnyRumcajs added this pull request to the merge queue Sep 16, 2025
Merged via the queue into main with commit 8fb20fc Sep 16, 2025
87 of 90 checks passed
@LesnyRumcajs LesnyRumcajs deleted the hm/fix-duplicate-metrics-after-gc branch September 16, 2025 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants