Skip to content

feat(multipath): add back basic metrics#3672

Merged
Frando merged 7 commits intofeat-multipathfrom
Frando/mp-metrics-basics
Dec 9, 2025
Merged

feat(multipath): add back basic metrics#3672
Frando merged 7 commits intofeat-multipathfrom
Frando/mp-metrics-basics

Conversation

@Frando
Copy link
Copy Markdown
Member

@Frando Frando commented Nov 17, 2025

Description

This adds back basic metrics to feat-multipath.

  • Refactor the conn_ metrics (they were all currently unused on feat-multipath) into something meaningful: Track the total number of opened and closed connections - their diff at any time is the number of currently-active conne
  • Adds transport_ip_paths_added, transport_ip_paths_removed, transport_relay_paths_added, transport_relay_paths_removed counters
  • Removes metrics that are unused and - IMO - not going to come back
  • Comments out metrics that are unused and likely to come back

Breaking Changes

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

pub actor_tick_other: Counter,

/// Number of endpoints we have attempted to contact.
pub endpoints_contacted: Counter,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we ever could do this in a good way, because we don't keep lists of endpoint ids over restarts, so this was already a bad metric for a while - removed it.

/// Number of endpoints we have attempted to contact.
pub endpoints_contacted: Counter,
/// Number of endpoints we have managed to contact directly.
pub endpoints_contacted_directly: Counter,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, I don't think there's a good way to do it meaningfully.

/// Number of connections with a successful handshake.
pub connection_handshake_success: Counter,
/// Number of connections with a successful handshake that became direct.
pub connection_became_direct: Counter,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two are better handled by the new conn_ metrics above

@github-actions
Copy link
Copy Markdown

github-actions bot commented Nov 17, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3672/docs/iroh/

Last updated: 2025-12-09T11:27:18Z

@Frando Frando linked an issue Nov 17, 2025 that may be closed by this pull request
@n0bot n0bot bot added this to iroh Nov 17, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Nov 17, 2025
@Frando Frando requested a review from flub November 18, 2025 09:59
@github-actions
Copy link
Copy Markdown

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 6516d6f

Comment on lines +1463 to +1469
metrics.num_conns_transport_ip_only.inc();
}
TransportSummary::RelayOnly => {
metrics.num_conns_transport_relay_only.inc();
}
TransportSummary::IpAndRelay => {
metrics.num_conns_transport_ip_and_relay.inc();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Talked with @Arqu and we agreed to remove these 3 metrics and instead have these 4:

  • ip_transport_paths_added: counter
  • ip_transport_paths_removed: counter
  • relay_transport_paths_added: counter
  • relay_transport_paths_removed: counter

That way we can increment counters when paths are opened and closed. It doesn't give you full view into what connections have been doing, but neither does this, because typically nearly everything would have been num_conns_transport_ip_and_relay.

@Frando
Copy link
Copy Markdown
Member Author

Frando commented Dec 8, 2025

I rebased this branch, removed the TransportSummary metrics and added the paths counters. I named the counters to transport_paths_ip_added etc, so that they share a prefix and are easily correlated when having a sorted list of metrics - lmk if you prefer your naming @flub.

@Frando Frando requested a review from flub December 8, 2025 12:47
Copy link
Copy Markdown
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM though would appreciate fixing the nits.

Thanks for getting back to this!

&mut self,
addr: transports::Addr,
source: Source,
metrics: &MagicsockMetrics,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to clone MagicsockMetrics into here, it exists for the same duration as the actor anyway. But this is optional, as you wish.

metrics: &MagicsockMetrics,
) {
match addr {
transports::Addr::Ip(_) => metrics.transport_paths_ip_added.inc(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Names are pretty subjective, I think I'll argue for transport_{ip|relay}_paths_added. Reason being that paths_added is the counter unit and tranport_ip belongs together as ip is the qualifier to transports. Good call to put transports at the front though.

Now what I really want is labels so we can do paths_added{transport=ip} but hey.

// pub path_rtt_variance_ms: Histogram,
// /// Histogram of path quality scores (0.0-1.0).
// #[default(Histogram::new(vec![0.0, 0.3, 0.5, 0.7, 0.85, 0.95, 1.0]))]
// pub path_quality_score: Histogram,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we remove the commented out stuff? I think we can come up with this again if we do end up needing something like it (well, I'm hoping we'd come up with something better 😉 )

@Frando Frando merged commit faa2119 into feat-multipath Dec 9, 2025
16 of 28 checks passed
@github-project-automation github-project-automation bot moved this from 🏗 In progress to ✅ Done in iroh Dec 9, 2025
@Arqu Arqu deleted the Frando/mp-metrics-basics branch December 9, 2025 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

Multipath metrics

3 participants