Skip to content

Currently running optimizers in Metrics#7316

Merged
JojiiOfficial merged 4 commits intodevfrom
metrics-currently-running-optimizers
Oct 23, 2025
Merged

Currently running optimizers in Metrics#7316
JojiiOfficial merged 4 commits intodevfrom
metrics-currently-running-optimizers

Conversation

@JojiiOfficial
Copy link
Contributor

@JojiiOfficial JojiiOfficial commented Sep 26, 2025

Supersedes #7275

Implements the optimizer_running_processes metrics as discussed:

# HELP optimizer_running_processes number of optimization processes running in total
# TYPE optimizer_running_processes gauge
optimizer_running_processes 1

@JojiiOfficial JojiiOfficial changed the base branch from master to dev September 26, 2025 08:07
@JojiiOfficial JojiiOfficial force-pushed the metrics-currently-running-optimizers branch from 098ce61 to 4c73996 Compare September 26, 2025 08:13
@qdrant qdrant deleted a comment from coderabbitai bot Sep 26, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
src/common/metrics.rs (2)

178-189: Type clarity: make the accumulator explicitly usize

Avoid relying on numeric literal inference; make the intent unambiguous.

-        let mut total_optimizations_running = 0;
+        let mut total_optimizations_running: usize = 0;

191-196: Prometheus naming: avoid “total” in a Gauge name

“_total” is reserved for monotonically increasing counters. This metric is a gauge, so prefer a name like optimizer_processes_running and adjust the help accordingly.

-        metrics.push(metric_family(
-            "optimizer_total_processes_running",
-            "number of optimization processes running in total",
+        metrics.push(metric_family(
+            "optimizer_processes_running",
+            "current number of optimization processes running",
             MetricType::GAUGE,
             vec![gauge(total_optimizations_running as f64, &[])],
         ));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d7f8a3 and 4c73996.

📒 Files selected for processing (6)
  • lib/collection/src/shards/local_shard/telemetry.rs (1 hunks)
  • lib/collection/src/telemetry.rs (2 hunks)
  • lib/common/common/src/types.rs (2 hunks)
  • src/actix/api/service_api.rs (2 hunks)
  • src/common/metrics.rs (1 hunks)
  • src/common/telemetry_reporting.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/telemetry.rs
  • lib/common/common/src/types.rs
  • src/common/metrics.rs
  • src/actix/api/service_api.rs
  • lib/collection/src/shards/local_shard/telemetry.rs
  • src/common/telemetry_reporting.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/telemetry.rs
  • lib/common/common/src/types.rs
  • src/common/metrics.rs
  • src/actix/api/service_api.rs
  • lib/collection/src/shards/local_shard/telemetry.rs
  • src/common/telemetry_reporting.rs
🧬 Code graph analysis (1)
src/common/metrics.rs (1)
src/actix/api/service_api.rs (1)
  • metrics (68-98)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: lint
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
🔇 Additional comments (5)
lib/collection/src/shards/local_shard/telemetry.rs (1)

73-74: LGTM: gated optimizer logs via either Level4 or optimizer_logs flag

This enables metrics collection without bumping overall detail level. Nice.

src/actix/api/service_api.rs (2)

46-50: Telemetry endpoint: explicit optimizer_logs=false

Clear and consistent with privacy posture for standard telemetry.


82-86: Metrics endpoint: enable optimizer logs for aggregation

Correctly sets optimizer_logs=true at Level3 to support the new metric.

src/common/telemetry_reporting.rs (1)

12-16: Reporter DETAIL updated correctly

Keeping optimizer_logs=false here is appropriate.

lib/common/common/src/types.rs (1)

36-36: All TelemetryDetail initializers include optimizer_logs Verified constant DETAIL, Default impl, and all struct literals explicitly set optimizer_logs.

@qdrant qdrant deleted a comment from coderabbitai bot Sep 26, 2025
@JojiiOfficial JojiiOfficial force-pushed the metrics-currently-running-optimizers branch from 1ea9600 to 3977b28 Compare September 26, 2025 08:28
@qdrant qdrant deleted a comment from coderabbitai bot Sep 26, 2025
Comment on lines +56 to +68
/// Amount of optimizers currently running.
///
/// Note: A `DetailsLevel` of 4 or setting `telemetry_detail.optimizer_logs` to true is required.
/// Otherwise, this function will return 0, which may not be correct.
pub fn count_optimizers_running(&self) -> usize {
self.shards
.iter()
.flatten()
.filter_map(|replica_set| replica_set.local.as_ref())
.flat_map(|local_shard| local_shard.optimizations.log.iter().flatten())
.filter(|log| log.status == TrackerStatus::Optimizing)
.count()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When reading the PR description I thought this could be done with counting with some atomic number and a guard.

But I think your approach is better, where we use state that we already have. 👍

@JojiiOfficial JojiiOfficial force-pushed the metrics-currently-running-optimizers branch from df33643 to c4bfa71 Compare October 22, 2025 13:27
@qdrant qdrant deleted a comment from coderabbitai bot Oct 22, 2025
@JojiiOfficial JojiiOfficial merged commit f2c0127 into dev Oct 23, 2025
15 checks passed
@JojiiOfficial JojiiOfficial deleted the metrics-currently-running-optimizers branch October 23, 2025 08:46
timvisee pushed a commit that referenced this pull request Nov 14, 2025
* Currently running optimizer count in metrics

* Clearly state the prerequisites of count_optimizers_running()

* Minor improvements

* improve metric naming
@timvisee timvisee mentioned this pull request Nov 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants