Always run Telemetry on a blocking thread#7269

Merged
agourlay merged 3 commits into dev from always-run-telemetry-blocking-pool
Sep 19, 2025

@agourlay agourlay commented Sep 18, 2025

The collection telemetry acquires several synchronous locks under the hood.

e.g.:

For context, we did make an effort to encapsulate such usage in dedicated blocking tasks in the past:

But the number of locks is difficult to track, and the call sites are not necessarily trivial to migrate.

This PR proposes to tackle the issue higher in the stack and run the whole collection telemetry on a dedicated thread.

Hopefully this will be enough to avoid starving the Actix runtime when the collection telemetry is gathered by the telemetry and metrics endpoints.
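
For readers unfamiliar with the pattern, here is a minimal sketch of the idea using only the standard library. It is not the code from this PR: `Collection`, `Telemetry`, `collect_telemetry` and `spawn_telemetry` are hypothetical names invented for illustration. The point is that the lock-taking work is moved as a whole onto its own thread, instead of wrapping each lock acquisition separately. In an async service the same shape would presumably be expressed with a runtime's blocking-task facility (for example Tokio's `spawn_blocking`) rather than a raw thread.

```rust
use std::collections::HashMap;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for per-collection state guarded by a synchronous lock.
pub struct Collection {
    pub points: Mutex<usize>,
}

/// Hypothetical telemetry snapshot: collection name -> point count.
pub type Telemetry = HashMap<String, usize>;

/// Synchronous collection step. It takes every collection lock in turn,
/// so it may block the calling thread for an unbounded amount of time.
pub fn collect_telemetry(collections: &HashMap<String, Arc<Collection>>) -> Telemetry {
    collections
        .iter()
        .map(|(name, c)| (name.clone(), *c.points.lock().unwrap()))
        .collect()
}

/// Runs the whole collection step on a dedicated thread and hands the result
/// back through a channel, so the caller's thread never waits on the locks
/// until it chooses to receive.
pub fn spawn_telemetry(
    collections: Arc<HashMap<String, Arc<Collection>>>,
) -> mpsc::Receiver<Telemetry> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let snapshot = collect_telemetry(&collections);
        // The receiver may already be gone; ignore the send error in that case.
        let _ = tx.send(snapshot);
    });
    rx
}
```

A caller would build the map of collections, call `spawn_telemetry`, and later call `recv()` on the returned receiver to obtain the snapshot. Offloading at this level means any new lock added inside `collect_telemetry` is covered automatically.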

@agourlay agourlay marked this pull request as ready for review September 18, 2025 11:00
@agourlay agourlay requested a review from timvisee September 18, 2025 13:27
@qdrant qdrant deleted a comment from coderabbitai bot Sep 19, 2025
Member

The

&self.actix_telemetry_collector.lock(),
&self.tonic_telemetry_collector.lock(),

also pique my interest, though I haven't seen these causing issues yet.

Member Author

Yes, I will definitely look for those in potential future traces.

@agourlay agourlay merged commit 8a31160 into dev Sep 19, 2025
16 checks passed
@agourlay agourlay deleted the always-run-telemetry-blocking-pool branch September 19, 2025 08:28
timvisee pushed a commit that referenced this pull request Sep 29, 2025
* Always run Telemetry on a blocking thread

* run Telemetry on general runtime instead of the blocking Actix one

* Polish test
@timvisee timvisee mentioned this pull request Sep 29, 2025