Conversation
📝 Walkthrough## Walkthrough
The changes span multiple modules and test files. In the telemetry module, the `LocalShardTelemetry` struct's `segments` field was changed from a non-optional vector to an optional `Option<Vec<SegmentTelemetry>>` with a serde attribute to skip serialization if `None`. Correspondingly, code initializing or assigning this field was updated to use `None` when no segments are present instead of an empty vector, including in the `DummyShard` and `LocalShard` telemetry data methods. The OpenAPI schema for `LocalShardTelemetry` was updated to make the `segments` property nullable and no longer required. In the operation time statistics module, the `Add` trait implementation for `OperationDurationStatistics` was simplified by replacing explicit pattern matching with concise logic for merging optional fields, preserving original behavior. In the storage types module, the anonymization function used on the `peers` field of the `ClusterInfo` struct was changed to one designed for collections with u64 hashable keys. In the telemetry module, the `count_vectors` method was simplified to sum vector counts directly from shard telemetry. In the test suite, the telemetry endpoint test was refactored from a single test function into a parameterized test running over multiple `details_level` values, verifying the structure and content of the telemetry response with increasing detail and nested data checks. No public API signatures were modified except for the renaming and parameterization of the test function.
## Possibly related PRs
- qdrant/qdrant#6390: Introduces optional telemetry fields and nullable properties in `LocalShardTelemetry`, including telemetry detail level refinements and anonymization improvements for shard telemetry data.
## Suggested reviewers
- coszio
- timviseeTip ⚡💬 Agentic Chat (Pro Plan, General Availability)
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (11)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
lib/collection/src/shards/telemetry.rs(1 hunks)lib/segment/src/common/operation_time_statistics.rs(2 hunks)lib/storage/src/types.rs(2 hunks)tests/openapi/test_service.py(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/openapi/test_service.py (1)
tests/openapi/helpers/helpers.py (1)
request_with_validation(39-93)
🪛 Ruff (0.8.2)
tests/openapi/test_service.py
65-65: Pointless comparison. Did you mean to assign a value? Otherwise, prepend assert or remove it.
(B015)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests
- GitHub Check: lint
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: storage-compat-test
🔇 Additional comments (7)
lib/segment/src/common/operation_time_statistics.rs (2)
122-122: Improved pattern matching with more concise codeThe change simplifies the pattern matching by using
Option::ormethod instead of explicit pattern matching for None/Some cases. This makes the code more readable while maintaining the same behavior.
142-142: Consistent simplification of pattern matchingSimilar to the change above, this uses the more idiomatic
Option::ormethod to handle the option combination instead of explicit pattern matching. The behavior remains equivalent but the code is now more concise.lib/collection/src/shards/telemetry.rs (1)
61-61: Good optimization for telemetry serializationAdding
skip_serializing_if = "Vec::is_empty"will omit thesegmentsfield from serialization when it's empty, reducing the size of telemetry payloads. This aligns well with the parameterized tests that verify telemetry response structure at different detail levels.lib/storage/src/types.rs (2)
18-18: Updated import for more appropriate anonymization utilityChanged from
anonymize_collection_valuesto the more specificanonymize_collection_with_u64_hashable_key, which better aligns with anonymizing thepeersHashMap that's keyed byPeerId.
207-207: Using more appropriate anonymization function for PeerId-keyed HashMapThis change uses a more specific anonymization function that's designed for collections with
u64hashable keys, which is appropriate for thepeersHashMap keyed byPeerId.tests/openapi/test_service.py (2)
47-48: Good test improvement with parameterizationReplacing a single test with a parameterized test provides better coverage of the telemetry API across different detail levels. This is a good testing practice.
75-102: Well-structured test assertions for different detail levelsThe test now properly verifies the response structure at each level of detail, checking for presence of expected fields based on the
details_levelparameter. This aligns well with the behavior of skipping empty segments in serialization introduced in the telemetry module.
| /// Do NOT rely on this number unless you know what you are doing | ||
| #[serde(skip_serializing_if = "Option::is_none")] | ||
| pub num_vectors: Option<usize>, | ||
| #[serde(skip_serializing_if = "Vec::is_empty")] |
There was a problem hiding this comment.
Empty vectors will break OpenAPI validation - it should be Option<Vec<SegmentTelemetry>> instead
| pub peer_id: PeerId, | ||
| /// Peers composition of the cluster with main information | ||
| #[anonymize(with = anonymize_collection_values)] | ||
| #[anonymize(with = anonymize_collection_with_u64_hashable_key)] |
tests/openapi/test_service.py
Outdated
|
|
||
| collection = result['collections']['collections'][0] | ||
|
|
||
| if level == 1: | ||
| assert list(collection.keys()) == ['vectors', 'optimizers_status', 'params'] | ||
| elif level == 2: | ||
| assert list(collection.keys()) == ['id', 'init_time_ms', 'config'] | ||
| elif level >= 3: | ||
| assert list(collection.keys()) == ['id', 'init_time_ms', 'config', 'shards', 'transfers', 'resharding'] | ||
|
|
||
| if level >= 3: | ||
| shard = collection['shards'][0] | ||
| assert list(shard.keys()) == ['id', 'key', 'local', 'remote', 'replicate_states'] | ||
|
|
||
| local_shard = shard['local'] | ||
|
|
||
| if level == 3: | ||
| assert list(local_shard.keys()) == [ | ||
| 'variant_name', 'status', 'total_optimized_points', 'vectors_size_bytes', | ||
| 'payloads_size_bytes', 'num_points', 'num_vectors', 'optimizations', 'async_scorer' | ||
| ] | ||
| elif level > 3: | ||
| assert list(local_shard.keys()) == [ | ||
| 'variant_name', 'status', 'total_optimized_points', 'vectors_size_bytes', | ||
| 'payloads_size_bytes', 'num_points', 'num_vectors', 'segments', 'optimizations', 'async_scorer' | ||
| ] | ||
|
|
||
| if level >= 4: | ||
| segment = local_shard['segments'][0] | ||
| assert list(segment.keys()) == ['info', 'config', 'vector_index_searches', 'payload_field_indices'] |
There was a problem hiding this comment.
I am not sure we care that much about those levels. It makes this test kind of fragile. Also, .keys() ordering is not guaranteed
There was a problem hiding this comment.
Also, .keys() ordering is not guaranteed
It returns ordered_dict and Qdrant always returns them in same order. there's nothing flaky about this and we don't change it often.
I am not sure we care that much about those levels
yeah it's not critical piece of code. Better to have tests than not have them :)
There was a problem hiding this comment.
It returns ordered_dict and Qdrant always returns them in same order.
nothing really enforces that
There was a problem hiding this comment.
Using set to avoid order change issues now.
* Improve telemetry logic and test * Parametrize telemetry test * Consistency hash peeer ID across telemetry * clean test * Use Option in segments telemetry * updat openapi spec * Avoid test failure on change in order of params
Some improvements on top of #6390.
Also checked latency of telemetry requests with level 3 vs 10 in chaos testing.
Before: (level 10)
After: (level 3)
That's a 40% speed up in this case :)
All Submissions:
devbranch. Did you create your branch fromdev?