Skip to content

feat(iroh): proper span for endpoints#3988

Merged
Frando merged 4 commits intomainfrom
Frando/endpoint-spans
Mar 3, 2026
Merged

feat(iroh): proper span for endpoints#3988
Frando merged 4 commits intomainfrom
Frando/endpoint-spans

Conversation

@Frando
Copy link
Copy Markdown
Member

@Frando Frando commented Mar 2, 2026

Description

  • Adds a endpoint{id=id_short} info-level span when binding the endpoint
  • Use this span as parent span for RemoteStateActor tasks and the main socket actor
  • Alter a few logging statements here and there that stood out while taking a good look at debug logs for the echo example. There's still more work here of course.

Breaking Changes

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.
    • List all breaking changes in the above "Breaking Changes" section.
    • Open an issue or PR on any number0 repos that are affected by this breaking change. Give guidance on how the updates should be handled or do the actual updates themselves. The major ones are:

Probe::QadIpv4 | Probe::QadIpv6 => unreachable!("must not be used"),
}
};
debug!(?report, "probe finished");
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're logging "starting probe" above, so I think a finished log at same level makes sense, otherwise the span is kinda pointless.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm dubious on this one. Isn't the probe report already logged or something?

Copy link
Copy Markdown
Member Author

@Frando Frando Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We print the final report from all probes, but not the results of individual probes, this we so far only do for QAD probes if I'm not missing something. Changed it so that the "probe started" are trace logs and the "probe finished" ones remain debug.

{
Poll::Ready(Ok(())) => Poll::Ready(Ok(())),
Poll::Ready(Ok(())) => {
trace!(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we log received datagrams at trace level, but didn't log sent transmits at all. this adds a trace log to mirror the receive side.

}

impl<'a> Transmit<'a> {
pub(crate) fn datagram_count(&self) -> usize {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used in the new trace log below

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be pub(crate) though, does it?

@n0bot n0bot bot added this to iroh Mar 2, 2026
@github-project-automation github-project-automation bot moved this to 🚑 Needs Triage in iroh Mar 2, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 2, 2026

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3988/docs/iroh/

Last updated: 2026-03-03T09:22:52Z


impl Socket {
/// Creates a [`Socket`] listening.
pub(crate) async fn spawn(opts: Options) -> Result<EndpointInner, BindError> {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not strictly related to this PR, but Socket::spawn -> EndpointInner is weird, isn't it. moved it to impl EndpointInner

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I assume this code structure made more sense many refactors ago :)

@Frando Frando force-pushed the Frando/endpoint-spans branch from 1d72884 to 08ceaac Compare March 2, 2026 16:23
@Frando Frando marked this pull request as ready for review March 2, 2026 16:23
@Frando Frando requested review from dignifiedquire, flub and matheus23 and removed request for flub March 2, 2026 16:23
@Frando Frando force-pushed the Frando/endpoint-spans branch from c6eadfe to d0e10be Compare March 2, 2026 16:26
@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 2, 2026

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 5a7aea7

Copy link
Copy Markdown
Contributor

@flub flub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for picking this up!

.instrument(Span::current())
.await?;
debug!(
id = %inner.static_config.tls_config.secret_key.public(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come id needs to be added here? Is this a different one than the one from the endpoint span you just added?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it's printed in full, not just the first 8 chars from fmt_short. I thought it would be good to somewhere log the full endpoint id, and this seemed like a fitting place.

Probe::QadIpv4 | Probe::QadIpv6 => unreachable!("must not be used"),
}
};
debug!(?report, "probe finished");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm dubious on this one. Isn't the probe report already logged or something?

///
/// [`Poll::Pending`]: std::task::Poll::Pending
#[instrument(skip_all)]
#[instrument(skip_all, parent = self.sock.span.clone())]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, i wonder if this should be on more of the Endpoint public API. Anyway, that'd be fore another PR.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's the one that stood out when looking at the logs. We can add more once we spot them.


impl Socket {
/// Creates a [`Socket`] listening.
pub(crate) async fn spawn(opts: Options) -> Result<EndpointInner, BindError> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I assume this code structure made more sense many refactors ago :)

}

impl<'a> Transmit<'a> {
pub(crate) fn datagram_count(&self) -> usize {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't need to be pub(crate) though, does it?

@Frando Frando enabled auto-merge March 3, 2026 09:24
@Frando Frando added this pull request to the merge queue Mar 3, 2026
Merged via the queue into main with commit e23f2f3 Mar 3, 2026
28 of 29 checks passed
@github-project-automation github-project-automation bot moved this from 🚑 Needs Triage to ✅ Done in iroh Mar 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: ✅ Done

Development

Successfully merging this pull request may close these issues.

2 participants