Fix UnindexedField infra to handle lookup&range index requirements#6496
Fix UnindexedField infra to handle lookup&range index requirements#6496
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
|
This looks hacky. It seems like we are missing one abstraction. We have:
Just because we have customizable integer index, I feel like we should have one which only captures functionality, following the internal enum FieldIndexType {
Int,
Datetime,
IntMap,
Keyword,
Float,
Geo,
FullText,
Bool,
Uuid,
UuidMap,
Null
}This way we should be able to select one of these when checking the filter, and then validate against schema if it is fulfilled or not. If it is not, it should be trivial to implement a conversion to a payload schema. |
369c2e2 to
a3443ac
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
lib/collection/src/problems/unindexed_field.rs (2)
413-427: Fix documentation comment forupdate_needs_range.The documentation comment for
update_needs_rangeincorrectly refers to "Match capability" instead of "Range capability".-/// Capture that an index with Match capability is required. +/// Capture that an index with Range capability is required. /// Does not toggle off if has been set once already. fn update_needs_range(&mut self, is_range: bool) {
415-419: Consider optimizing the conditional logic in update methods.Both helper methods have a conditional that checks if the flag is false before setting it based on the input. Since these methods are only called to potentially set the flag to true (never to set it back to false), you could simplify them:
fn update_needs_match(&mut self, is_match: bool) { - if !self.needs_match_index { - self.needs_match_index = is_match - } + self.needs_match_index |= is_match; } fn update_needs_range(&mut self, is_range: bool) { - if !self.needs_range_index { - self.needs_range_index = is_range - } + self.needs_range_index |= is_range; }This uses the bitwise OR assignment operator to set the flag to true if either the current value is true or
is_match/is_rangeis true, which achieves the same result more concisely.Also applies to: 424-427
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
lib/collection/src/problems/unindexed_field.rs(5 hunks)lib/segment/src/types.rs(1 hunks)tests/openapi/test_strictmode.py(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/openapi/test_strictmode.py
- lib/segment/src/types.rs
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: integration-tests
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests-consensus
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: lint
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: storage-compat-test
🔇 Additional comments (4)
lib/collection/src/problems/unindexed_field.rs (4)
312-313: Good addition of capability tracking flags.These new boolean fields explicitly track whether the query needs indexes that support match or range operations, which aligns with the PR objective of capturing specific indexing requirements.
373-374: LGTM! Well-placed capability detection.You're correctly updating the match and range requirements based on condition predicates. This approach is more explicit and robust than trying to infer requirements solely from schema type comparisons.
432-439: Good implementation of capability-based index requirement detection.This is a key improvement that addresses the PR objective. The code now explicitly checks if the current index supports the required capabilities before falling back to comparing schema types.
440-440: Variable renaming improves code clarity.Renaming the variable to
index_info_kindmakes it clearer that this represents the type/kind of the index being evaluated.
|
Thanks for the feedback 👍 I agree that my solution is hacky, I will explore your idea to make the solution more generic. |
a3443ac to
af248be
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
578-605: Consider a more generic approach in a future refactoring.While this implementation addresses the immediate needs, consider exploring the abstraction mentioned in the PR comments for a future refactoring. A proper abstraction could make the code more maintainable and reduce the need for explicit conversions between types.
A future improvement could define a trait that represents index functionality and implements it for both
PayloadFieldSchemaandFieldIndexType, allowing for more flexible comparisons without explicit conversions. This would align with the suggestion in the PR comments about introducing a new abstraction focused solely on capturing the functionality of the index.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/collection/src/problems/unindexed_field.rs(11 hunks)
🔇 Additional comments (12)
lib/collection/src/problems/unindexed_field.rs (12)
11-11: Good addition of required imports.The new imports for
IntegerIndexParamsandEnumItertrait are necessary for the refactored code to handle parametrized integer indexes and iterate over the newFieldIndexTypeenum.Also applies to: 18-18
165-167: LGTM: Refactored to use new FieldIndexType enum.The function now returns an iterator of the new
FieldIndexTypeenum instead ofPayloadFieldSchema, which better represents specific index capabilities rather than complete schema definitions.
169-189: Improved index inference logic with explicit capability types.The function now correctly returns specific index capabilities (
FieldIndexTypevariants) based on the match value type, which creates a clearer separation between index functionality requirements and schema definitions.
191-211: Well-structured index type inference from AnyVariants.Similar to the previous function, this now correctly maps value types to their required index capabilities using the new
FieldIndexTypeenum, maintaining consistent behavior with the refactored design.
213-256: Comprehensive refactoring of index type inference from field conditions.This function properly maps various field conditions to their required index capabilities, taking into account different query types like matches, ranges, geo queries, etc. The changes align well with the overall refactoring approach.
349-392: Updated condition handling to use the new FieldIndexType.The function now properly collects required indexes as
FieldIndexTypevalues and converts them toPayloadFieldSchemaonly when updating the unindexed schema map, which maintains a cleaner separation of concerns.
395-408: Enhanced index capability checking.The
needs_indexmethod now correctly checks if the existing schema supports any of the required index capabilities, rather than making direct schema comparisons. This addresses the issue mentioned in the PR description about handling lookup and range requirements.
410-517: Updated expression parsing to use FieldIndexType.The update from expression method now uses the new
FieldIndexTypeenum, maintaining consistency with the refactored approach throughout the codebase.
520-533: Well-designed FieldIndexType enum.The new enum provides a focused representation of index capabilities, aligning with the PR's goal of explicitly capturing indexing requirements. This is a good step toward addressing the "missing abstraction" mentioned in the PR comments.
535-573: Key improvement: Properly handling parametrized integer indexes.The
schema_capacitiesfunction correctly maps fromPayloadFieldSchemato its supported capabilities, with special handling forIntegerIndexParamsthat checks both lookup and range flags. This directly addresses the core issue of handling lookup and range index requirements.
557-560: Good debug assertion for parameter validation.The debug assertion ensures that either lookup or range must be true for an Integer payload index, which helps catch potential configuration issues during development.
575-606: Well-implemented conversion from capability type to schema.The
From<FieldIndexType> for PayloadFieldSchemaimplementation correctly creates appropriate field schemas based on the required capability type, completing the two-way conversion between capabilities and schemas.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/openapi/test_strictmode.py(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: integration-tests
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: storage-compat-test
- GitHub Check: lint
🔇 Additional comments (3)
tests/openapi/test_strictmode.py (3)
165-165: Appropriate test renaming for clarity.The renaming of this test from
test_strict_mode_unindexed_filter_read_validationtotest_strict_mode_unindexed_filter_keyword_read_validationimproves clarity by specifying the field type being tested. This aligns well with the new test for integer fields.
220-303: Well-structured test for integer field indexing requirements.This new test properly validates strict mode behavior with integer fields, specifically testing the lookup capability requirements. The test covers all necessary scenarios:
- Verifying behavior without strict mode
- Testing with strict mode but allowing unindexed filtering
- Testing with strict mode and disabling unindexed filtering
- Creating an index with lookup capability and verifying success
- Removing lookup capability and verifying failure
The test appropriately verifies behavior after each index configuration change, addressing the previously raised concern.
1456-1475: Formula expression test extended to cover integer fields.The formula expression test has been appropriately extended to test integer fields with lookup capabilities. The test creates an index on
discount_pricewith the necessary lookup capability and verifies that the query succeeds after index creation. This aligns with the PR objective of handling lookup requirements for integer fields.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
608-623: Extend unit test coverage for integer index parametersThe current unit test only covers one case: when both lookup and range are true. Based on the past review comments, there are additional cases that should be tested:
- lookup = true, range = false => only IntMatch is inserted
- lookup = false, range = true => only IntRange is inserted
- lookup = false, range = false => triggers the debug assertion
#[cfg(test)] mod tests { use super::*; #[test] fn integer_index_capacities() { let params = PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(true), range: Some(true), ..Default::default() }); let schema = PayloadFieldSchema::FieldParams(params); let index_types = schema_capacities(&schema); assert!(index_types.contains(&FieldIndexType::IntMatch)); assert!(index_types.contains(&FieldIndexType::IntRange)); } + + #[test] + fn integer_index_lookup_only() { + let params = PayloadSchemaParams::Integer(IntegerIndexParams { + lookup: Some(true), + range: Some(false), + ..Default::default() + }); + let schema = PayloadFieldSchema::FieldParams(params); + let index_types = schema_capacities(&schema); + assert!(index_types.contains(&FieldIndexType::IntMatch)); + assert!(!index_types.contains(&FieldIndexType::IntRange)); + } + + #[test] + fn integer_index_range_only() { + let params = PayloadSchemaParams::Integer(IntegerIndexParams { + lookup: Some(false), + range: Some(true), + ..Default::default() + }); + let schema = PayloadFieldSchema::FieldParams(params); + let index_types = schema_capacities(&schema); + assert!(!index_types.contains(&FieldIndexType::IntMatch)); + assert!(index_types.contains(&FieldIndexType::IntRange)); + } + + #[test] + #[should_panic(expected = "lookup or range must be true")] + fn integer_index_no_capabilities() { + let params = PayloadSchemaParams::Integer(IntegerIndexParams { + lookup: Some(false), + range: Some(false), + ..Default::default() + }); + let schema = PayloadFieldSchema::FieldParams(params); + let _index_types = schema_capacities(&schema); + } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/collection/src/problems/unindexed_field.rs(11 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: lint
- GitHub Check: test-consensus-compose
- GitHub Check: storage-compat-test
🔇 Additional comments (8)
lib/collection/src/problems/unindexed_field.rs (8)
520-533: Well-designed abstraction withFieldIndexTypeenumThe introduction of the
FieldIndexTypeenum is an excellent improvement that explicitly represents the different index capabilities. This addresses the core issue by separating index functionality from schema representation, making it easier to track specific index requirements like lookup and range capabilities.This aligns well with the suggestion from coszio's comment about introducing a focused abstraction for index functionality.
535-573: Clear mapping between schema and capabilitiesThe
schema_capacitiesfunction provides a clean mapping fromPayloadFieldSchemato the specific index capabilities it supports. The special handling for parametrized integer indexes is particularly important, as it correctly extracts whether the index supports lookup, range, or both capabilities.The debug assertion at lines 557-560 is a good safety check to ensure that at least one of lookup or range is enabled for integer indexes.
549-563: Comprehensive handling of integer index parametersThe implementation of integer index parameters handling is well-designed, checking for both lookup and range capabilities explicitly. This is the core of the fix for detecting missing indexes based on actual requirements.
Note that the implementation properly handles the feedback mentioned in past reviews where the suggested approach couldn't be used due to other fields like
is_principal&on_diskpreventing a complete equality test.
575-606: Clean conversion from index type to schemaThe
From<FieldIndexType> for PayloadFieldSchemaimplementation provides a clean way to convert from internal index types back to the corresponding schema representation. The implementation for integer indexes correctly creates appropriateIntegerIndexParamswith the specific capability enabled.
395-408: Improved index capability matchingThe updated
needs_indexmethod now performs a more granular capability check rather than a simple kind equality check. This allows for proper detection of missing indexes based on specific capabilities (lookup vs range).
169-256: Consistent index type inferenceThe changes to functions like
infer_index_from_match_value,infer_index_from_any_variants, andinfer_index_from_field_conditionare consistent with the new design, correctly returningVec<FieldIndexType>that explicitly represent required index capabilities.
349-392: Clear separation of concerns in extractionThe
update_from_conditionmethod now has a cleaner separation of concerns: it collects the required index types first, and only converts them to schema representations when adding them to theunindexed_schema. This makes the code more maintainable and easier to understand.
510-517: Consistent index handling in expressionsThe changes to
update_from_expressionmaintain consistency with the approach used inupdate_from_condition, ensuring that the handling of index requirements is uniform across different contexts.
|
@coszio can you please have a look and tell me if the approach is what you had in mind 🙏 |
| FieldIndexType::IntMatch => { | ||
| let params = IntegerIndexParams { | ||
| lookup: Some(true), | ||
| ..Default::default() | ||
| }; | ||
| PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) | ||
| } | ||
| FieldIndexType::IntRange => { | ||
| let params = IntegerIndexParams { | ||
| range: Some(true), | ||
| ..Default::default() | ||
| }; | ||
| PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) | ||
| } |
There was a problem hiding this comment.
I think we should treat the parametrized versions as advanced usage. Since we don't issue a recommendation for on_disk/is_tenant/etc settings, I'd say we can just convert to unparametrized type index here.
Moreover, this change would make it more obvious to convert to PayloadSchemaType instead.
Wdyt?
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
591-604: Consider handling more parameter combinationsYour implementation only provides conversions for single-capability integer indexes (either lookup or range). Consider adding a combined option that sets both capabilities when needed.
match val { FieldIndexType::IntMatch => { let params = IntegerIndexParams { lookup: Some(true), ..Default::default() }; PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) } FieldIndexType::IntRange => { let params = IntegerIndexParams { range: Some(true), ..Default::default() }; PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) } + // Consider adding this combined option if needed in the future + // FieldIndexType::IntBoth => { + // let params = IntegerIndexParams { + // lookup: Some(true), + // range: Some(true), + // ..Default::default() + // }; + // PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) + // } FieldIndexType::KeywordMatch => {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/collection/src/problems/unindexed_field.rs(11 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-low-resources
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consistency
- GitHub Check: integration-tests
- GitHub Check: test-consensus-compose
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: storage-compat-test
- GitHub Check: lint
🔇 Additional comments (6)
lib/collection/src/problems/unindexed_field.rs (6)
524-537: Good abstraction with the newFieldIndexTypeenumCreating a dedicated enum for index capabilities aligns well with the PR objectives to explicitly capture indexing requirements. This makes the code more self-documenting and easier to maintain compared to the previous approach.
539-586: Robust schema capability mapping implementationThe
schema_capabilitiesfunction provides a clear mapping between schema types and their index capabilities. The special handling for parametrized integer indexes (lines 560-573) correctly implements the lookup and range requirements.Note that the debug assertion on line 567-570 ensures that integer indexes always have at least one capability enabled, which is a good safeguard.
588-619: Effective conversion fromFieldIndexTypetoPayloadFieldSchemaThe
Fromimplementation correctly maps each index type to the appropriate schema structure. For theIntMatchandIntRangecases, you've correctly set only the required parameter (lookup or range) to true and left other parameters at their defaults.
395-408: Improved index capability checkingThe updated
needs_indexmethod now correctly compares index capabilities more granularly instead of the previous simplistic kind equality check. This ensures that fields with specific index requirements (like integer fields with only lookup enabled) are properly detected when missing those capabilities.
591-604: Parametrized index handling aligns with PR objectivesThe explicit handling of parametrized integer indexes with separate lookups for
lookupandrangecapabilities directly addresses the PR objective of handling these requirements separately, as mentioned in the PR summary.
164-167: Improved iterator implementation with EnumIterUsing the
EnumItertrait fromstrumto iterate over all enum variants is a clean approach that reduces manual maintenance when adding new index types.
| PayloadSchemaParams::Integer(integer_index_params) => { | ||
| if integer_index_params.lookup == Some(true) { | ||
| index_types.insert(FieldIndexType::IntMatch); | ||
| } | ||
| if integer_index_params.range == Some(true) { | ||
| index_types.insert(FieldIndexType::IntRange); | ||
| } | ||
| debug_assert!( | ||
| !index_types.is_empty(), | ||
| "lookup or range must be true for Integer payload index" | ||
| ); | ||
| // unifying match arm types | ||
| true | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Consider handling the case when both lookup and range are false
Currently the debug assertion triggers when both lookup and range are false, but this could potentially happen in real-world scenarios. Consider how to handle this gracefully rather than relying on the debug assertion.
Run this script to check if there are any existing integer indexes with both lookup and range set to false:
🏁 Script executed:
#!/bin/bash
# Search for integer index parameters where both lookup and range might be false
# Look for places where IntegerIndexParams is created or used
rg "IntegerIndexParams" --type rust -A 5 -B 5Length of output: 24820
🏁 Script executed:
#!/bin/bash
# Extract the Validate impl for qdrant::IntegerIndexParams
rg -n "impl Validate for super::qdrant::IntegerIndexParams" -A 20 -B 2 lib/api/src/grpc/validate.rsLength of output: 753
``
🏁 Script executed:
#!/bin/bash
# Locate the definition of validate_integer_index_params
rg -n "validate_integer_index_params" -A 10 -B 5 lib/api/src/grpc/validate.rsLength of output: 1095
🏁 Script executed:
#!/bin/bash
# Locate the definition of validate_integer_index_params in segment data types
rg -n "validate_integer_index_params" -A 20 -B 5 lib/segment/src/data_types/index.rsLength of output: 975
Handle invalid IntegerIndexParams when neither lookup nor range is set
In release builds the debug_assert! is a no-op, and the code will silently proceed with an empty index_types if both lookup and range are unset (or None). We should explicitly catch “no capabilities enabled” and surface it as an error (or return false), rather than relying on debug_assert!.
Locations to update:
- In
lib/collection/src/problems/unindexed_field.rs(around lines 560–573), replace thedebug_assert!with explicit handling for the case where neitherlookup == Some(true)norrange == Some(true)—for example, returnfalseor emit anIssue. - (Optional) Align the shared validation in
segment::data_types::index::validate_integer_index_params(and the gRPCValidateimpl) to also treatNone/Noneas invalid, or document whyNonedefaults are acceptable.
🤖 Prompt for AI Agents
In lib/collection/src/problems/unindexed_field.rs around lines 560 to 573,
replace the debug_assert that checks if either lookup or range is true with
explicit handling for the case when both are false or None. Instead of
asserting, check if index_types is empty after evaluating lookup and range, and
if so, return false or emit an appropriate Issue to indicate invalid
IntegerIndexParams. This ensures the code gracefully handles invalid
configurations in release builds rather than silently proceeding.
afa5f13 to
092a3f9
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
539-586:⚠️ Potential issueReplace debug_assert with explicit error handling
The
debug_asserton lines 567-570 could be problematic in release builds, as assertions are stripped out during release compilation. For integer indexes, the case where neither lookup nor range is true should be handled explicitly.if integer_index_params.range == Some(true) { index_types.insert(FieldIndexType::IntRange); } -debug_assert!( - !index_types.is_empty(), - "lookup or range must be true for Integer payload index" -); +// Handle the case where neither lookup nor range is enabled +if index_types.is_empty() { + // Either return false or add a sensible default + log::warn!("Integer index with neither lookup nor range capability detected"); + // Default to providing at least one capability to avoid silent failure + index_types.insert(FieldIndexType::IntMatch); +} // unifying match arm types true
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
588-619: Consider simplifyingFrom<FieldIndexType>to use unparametrized typesAs mentioned in a previous comment by coszio, it might be better to convert to unparametrized types for simplicity, especially since we don't make recommendations for other parameters like
on_diskoris_tenant.For integer indexes, consider using
PayloadFieldSchema::FieldType(PayloadSchemaType::Integer)instead of creating parametrized versions, or provide a clearer explanation in a comment about why parametrized versions are needed here.impl From<FieldIndexType> for PayloadFieldSchema { fn from(val: FieldIndexType) -> Self { match val { - FieldIndexType::IntMatch => { - let params = IntegerIndexParams { - lookup: Some(true), - ..Default::default() - }; - PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) - } - FieldIndexType::IntRange => { - let params = IntegerIndexParams { - range: Some(true), - ..Default::default() - }; - PayloadFieldSchema::FieldParams(PayloadSchemaParams::Integer(params)) - } + FieldIndexType::IntMatch | FieldIndexType::IntRange => { + // Using unparametrized type as we're just recommending the basic index type + PayloadFieldSchema::FieldType(PayloadSchemaType::Integer) + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
lib/collection/src/problems/unindexed_field.rs(11 hunks)tests/openapi/test_strictmode.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/openapi/test_strictmode.py
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-consensus-compose
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consistency
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests
- GitHub Check: lint
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
🔇 Additional comments (4)
lib/collection/src/problems/unindexed_field.rs (4)
523-537: Good introduction ofFieldIndexTypeenumThe new
FieldIndexTypeenum clearly defines the distinct index capabilities, which is a good abstraction that separates type from functionality requirements. This aligns with the feedback received in the PR comments about creating a more structured representation of index functionality.
386-392: Good use of iterators for schema conversionThe process of converting from
FieldIndexTypetoPayloadFieldSchemaand extending the schema collection is well implemented using iterators, which is efficient and idiomatic Rust.
395-408: Improved index capability check logicThe
needs_indexmethod has been improved to check specific index capabilities rather than just the schema kind. This allows for more granular checking of whether an existing index supports the required operations.
169-256: Good refactoring of inference functionsThe index inference functions have been updated to return vectors of
FieldIndexTypewhich makes the code more explicit about the required index capabilities. This is a good improvement over the previous version which returnedPayloadFieldSchemadirectly.
| #[cfg(test)] | ||
| mod tests { | ||
| use super::*; | ||
| #[test] | ||
| fn integer_index_capacities() { | ||
| let params = PayloadSchemaParams::Integer(IntegerIndexParams { | ||
| lookup: Some(true), | ||
| range: Some(true), | ||
| ..Default::default() | ||
| }); | ||
| let schema = PayloadFieldSchema::FieldParams(params); | ||
| let index_types = schema_capabilities(&schema); | ||
| assert!(index_types.contains(&FieldIndexType::IntMatch)); | ||
| assert!(index_types.contains(&FieldIndexType::IntRange)); | ||
| } | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Expand test coverage for index capabilities
The current test only verifies that integer indexes with both lookup and range set to true map to both IntMatch and IntRange capabilities. Tests for other cases are missing, as mentioned in previous review comments.
Please add these additional test cases to thoroughly validate the integer index mapping logic:
#[test]
fn integer_index_lookup_only() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(true),
range: Some(false),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
assert!(index_types.contains(&FieldIndexType::IntMatch));
assert!(!index_types.contains(&FieldIndexType::IntRange));
}
#[test]
fn integer_index_range_only() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(false),
range: Some(true),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
assert!(!index_types.contains(&FieldIndexType::IntMatch));
assert!(index_types.contains(&FieldIndexType::IntRange));
}
#[test]
fn integer_index_none_enabled() {
// Test case for when neither lookup nor range is enabled
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(false),
range: Some(false),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
// This assertion will depend on how you handle the empty case
// after removing the debug_assert
}🏁 Script executed:
#!/bin/bash
# Let's check how comprehensive the current testing is for the functionality
# First, check if there are any other tests in the file
rg -l "fn.*test" lib/collection/src/problems/unindexed_field.rs
# Second, check if there are tests for other combinations of lookup/range flags
rg -A5 -B5 "lookup.*range|range.*lookup" --type rust lib/collection/src/problems/unindexed_field.rsLength of output: 610
Expand test coverage for integer index capabilities
The existing test only covers the case where both lookup and range are Some(true). We should add the following tests to ensure all flag combinations are validated:
• integer_index_lookup_only – only lookup = true
• integer_index_range_only – only range = true
• integer_index_none_enabled – both flags false (see handling of the empty‐set case)
Suggested additions in lib/collection/src/problems/unindexed_field.rs:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn integer_index_capacities() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(true),
range: Some(true),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
assert!(index_types.contains(&FieldIndexType::IntMatch));
assert!(index_types.contains(&FieldIndexType::IntRange));
}
#[test]
fn integer_index_lookup_only() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(true),
range: Some(false),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
assert!(index_types.contains(&FieldIndexType::IntMatch));
assert!(!index_types.contains(&FieldIndexType::IntRange));
}
#[test]
fn integer_index_range_only() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(false),
range: Some(true),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
let index_types = schema_capabilities(&schema);
assert!(!index_types.contains(&FieldIndexType::IntMatch));
assert!(index_types.contains(&FieldIndexType::IntRange));
}
#[test]
#[should_panic(expected = "lookup or range must be true for Integer payload index")]
fn integer_index_none_enabled() {
let params = PayloadSchemaParams::Integer(IntegerIndexParams {
lookup: Some(false),
range: Some(false),
..Default::default()
});
let schema = PayloadFieldSchema::FieldParams(params);
// debug_assert enforces at least one true; expecting a panic here
let _ = schema_capabilities(&schema);
}
}🤖 Prompt for AI Agents
In lib/collection/src/problems/unindexed_field.rs around lines 621 to 636, the
existing test only covers the case where both lookup and range are true for
integer index capabilities. Add three new tests: one where only lookup is true
and range is false, asserting IntMatch is present but IntRange is not; one where
only range is true and lookup is false, asserting IntRange is present but
IntMatch is not; and one where both lookup and range are false, which should
trigger a panic due to the debug_assert in schema_capabilities. Implement these
tests inside the #[cfg(test)] mod tests block to fully validate all combinations
of integer index flags.
092a3f9 to
9d3029c
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (3)
lib/collection/src/problems/unindexed_field.rs (3)
621-636: 🛠️ Refactor suggestionExpand test coverage for integer index parameters
The test for integer index capabilities is a good start, but it only covers the case where both
lookupandrangearetrue. As mentioned in previous review comments, you should add tests for other combinations: lookup-only, range-only, and neither enabled (which should trigger the debug assertion).#[test] fn integer_index_lookup_only() { let params = PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(true), range: Some(false), ..Default::default() }); let schema = PayloadFieldSchema::FieldParams(params); let index_types = schema_capabilities(&schema); assert!(index_types.contains(&FieldIndexType::IntMatch)); assert!(!index_types.contains(&FieldIndexType::IntRange)); } #[test] fn integer_index_range_only() { let params = PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(false), range: Some(true), ..Default::default() }); let schema = PayloadFieldSchema::FieldParams(params); let index_types = schema_capabilities(&schema); assert!(!index_types.contains(&FieldIndexType::IntMatch)); assert!(index_types.contains(&FieldIndexType::IntRange)); } #[test] #[should_panic(expected = "lookup or range must be true for Integer payload index")] fn integer_index_none_enabled() { let params = PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(false), range: Some(false), ..Default::default() }); let schema = PayloadFieldSchema::FieldParams(params); let _ = schema_capabilities(&schema); }
560-573:⚠️ Potential issueHandle case when both lookup and range are false
The current implementation uses a debug assertion that is only enforced in debug builds. In release builds, this assertion is ignored, potentially allowing an invalid state where both lookup and range are false.
As noted in the previous review comment, you should replace this debug assertion with explicit error handling to ensure the code behaves correctly in all build configurations.
if integer_index_params.lookup == Some(true) { index_types.insert(FieldIndexType::IntMatch); } if integer_index_params.range == Some(true) { index_types.insert(FieldIndexType::IntRange); } - debug_assert!( - !index_types.is_empty(), - "lookup or range must be true for Integer payload index" - ); - // unifying match arm types - true + if index_types.is_empty() { + // Both lookup and range are false or None, which is an invalid configuration + log::warn!("Invalid integer index configuration: both lookup and range are disabled"); + false + } else { + true + }
539-586: 🛠️ Refactor suggestionStrong implementation of schema capability mapping
The
schema_capabilitiesfunction properly mapsPayloadFieldSchemato index capabilities, with good handling of parametrized integer indexes that separately track lookup and range capabilities.However, there's a potential issue with the debug assertion at lines 567-570. In release builds, this assertion would be compiled out, potentially allowing both
lookupandrangeto be false without any error handling.- debug_assert!( - !index_types.is_empty(), - "lookup or range must be true for Integer payload index" - ); - // unifying match arm types - true + if index_types.is_empty() { + // Handle the case where both lookup and range are false + log::warn!("Integer payload index with neither lookup nor range capability"); + // Return false to indicate no capabilities, or consider adding a default capability + false + } else { + true + }
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
588-619: Implementation creates appropriate schema recommendationsThe
From<FieldIndexType>implementation properly creates schema representations for issue recommendations. For parameterizable types likeIntMatchandIntRange, it creates field parameters with appropriate options, and for basic types it uses field type schemas.A minor observation: both
UuidMatchandUuidRangeconvert to the samePayloadSchemaType::Uuid(lines 611-612). This appears to match the system design where UUID fields inherently support both matching and range operations, but it might be worth adding a comment to clarify this intentional behavior.FieldIndexType::KeywordMatch => { PayloadFieldSchema::FieldType(PayloadSchemaType::Keyword) } FieldIndexType::FloatRange => PayloadFieldSchema::FieldType(PayloadSchemaType::Float), FieldIndexType::Text => PayloadFieldSchema::FieldType(PayloadSchemaType::Text), FieldIndexType::BoolMatch => PayloadFieldSchema::FieldType(PayloadSchemaType::Bool), FieldIndexType::UuidMatch => PayloadFieldSchema::FieldType(PayloadSchemaType::Uuid), - FieldIndexType::UuidRange => PayloadFieldSchema::FieldType(PayloadSchemaType::Uuid), + FieldIndexType::UuidRange => { + // UUID fields inherently support both match and range operations + PayloadFieldSchema::FieldType(PayloadSchemaType::Uuid) + }, FieldIndexType::DatetimeRange => { PayloadFieldSchema::FieldType(PayloadSchemaType::Datetime) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
lib/collection/src/problems/unindexed_field.rs(11 hunks)tests/openapi/test_strictmode.py(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/openapi/test_strictmode.py
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: test-low-resources
- GitHub Check: test-consensus-compose
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consistency
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: lint
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (3)
lib/collection/src/problems/unindexed_field.rs (3)
524-537: Well-designed enum for explicit capability representationThe introduction of
FieldIndexTypeis an excellent design choice that explicitly models index capabilities instead of relying on schema representations. This aligns well with coszio's suggestion to introduce an abstraction focused on the functionality of indexes, making the code more maintainable and easier to understand.
386-392: Simplified index generation and schema update logicThe updated approach cleanly separates index capability detection from schema generation. The code now first determines required capabilities (as
FieldIndexType), then converts them to appropriate schema recommendations only when updating the unindexed schema map. This is much clearer than the previous approach.
395-407: Improved granularity in index capability checksThe revised
needs_indexmethod now checks if the existing payload schema supports any of the requiredFieldIndexTypes by mapping the schema to its capabilities. This provides more granular control over index requirements matching and aligns with coszio's suggestion to focus on the functionality aspects of indexes.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (3)
lib/collection/src/problems/unindexed_field.rs (3)
610-628: 🛠️ Refactor suggestionExpand test coverage for integer index parameters
The current test only verifies that integer indexes with both lookup and range set to true map to both
IntMatchandIntRangecapabilities. Tests for other combinations are missing.Add tests for the following cases:
- lookup = true, range = false → only
FieldIndexType::IntMatchis inserted- lookup = false, range = true → only
FieldIndexType::IntRangeis inserted- lookup = false, range = false → should trigger the assertion or handle the empty case
#[test] fn integer_index_capacities() { let params = PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(true), range: Some(true), ..Default::default() }); let schema = PayloadFieldSchema::FieldParams(params); let index_types = schema_capabilities(&schema); assert!(index_types.contains(&FieldIndexType::IntMatch)); assert!(index_types.contains(&FieldIndexType::IntRange)); } + +#[test] +fn integer_index_lookup_only() { + let params = PayloadSchemaParams::Integer(IntegerIndexParams { + lookup: Some(true), + range: Some(false), + ..Default::default() + }); + let schema = PayloadFieldSchema::FieldParams(params); + let index_types = schema_capabilities(&schema); + assert!(index_types.contains(&FieldIndexType::IntMatch)); + assert!(!index_types.contains(&FieldIndexType::IntRange)); +} + +#[test] +fn integer_index_range_only() { + let params = PayloadSchemaParams::Integer(IntegerIndexParams { + lookup: Some(false), + range: Some(true), + ..Default::default() + }); + let schema = PayloadFieldSchema::FieldParams(params); + let index_types = schema_capabilities(&schema); + assert!(!index_types.contains(&FieldIndexType::IntMatch)); + assert!(index_types.contains(&FieldIndexType::IntRange)); +}
564-578:⚠️ Potential issueHandle the case when both lookup and range are false in IntegerIndexParams
This is a response to the previous review comment about handling invalid IntegerIndexParams. As suggested, we should explicitly handle the case when both lookup and range are false (or None) rather than relying on the debug assertion.
Replace the debug assertion with explicit handling:
if integer_index_params.lookup == Some(true) { index_types.insert(FieldIndexType::IntMatch); } if integer_index_params.range == Some(true) { index_types.insert(FieldIndexType::IntRange); } -debug_assert!( - !index_types.is_empty(), - "lookup or range must be true for Integer payload index" -); +if index_types.is_empty() { + // Either return false to indicate this does not satisfy indexing requirements + // or insert a default capability + return false; +} // unifying match arm types true
544-591: 🛠️ Refactor suggestionAdd explicit error handling for invalid integer index parameters
Currently, the
debug_assert!on lines 572-575 checks that at least one oflookuporrangeis true for integer indexes. However, this assertion is skipped in release builds, potentially allowing invalid configurations through.Consider replacing the debug assertion with explicit error handling:
if integer_index_params.lookup == Some(true) { index_types.insert(FieldIndexType::IntMatch); } if integer_index_params.range == Some(true) { index_types.insert(FieldIndexType::IntRange); } -debug_assert!( - !index_types.is_empty(), - "lookup or range must be true for Integer payload index" -); +if index_types.is_empty() { + log::warn!("Neither lookup nor range is enabled for Integer payload index"); + // Provide a sensible default or return false to indicate invalid configuration + index_types.insert(FieldIndexType::IntRange); // Default to range capability +} // unifying match arm types true
🧹 Nitpick comments (1)
lib/collection/src/problems/unindexed_field.rs (1)
593-608: Consider preserving index capability information in the schema conversionThe current conversion from
FieldIndexTypetoPayloadSchemaTypeloses information about whether an integer index needs lookup or range capability. This might lead to generating suboptimal schema recommendations.You could consider extending the conversion to preserve capability information:
impl From<FieldIndexType> for PayloadSchemaType { fn from(val: FieldIndexType) -> Self { match val { - FieldIndexType::IntMatch => PayloadSchemaType::Integer, - FieldIndexType::IntRange => PayloadSchemaType::Integer, + FieldIndexType::IntMatch | FieldIndexType::IntRange => PayloadSchemaType::Integer, FieldIndexType::KeywordMatch => PayloadSchemaType::Keyword, FieldIndexType::FloatRange => PayloadSchemaType::Float, FieldIndexType::Text => PayloadSchemaType::Text, FieldIndexType::BoolMatch => PayloadSchemaType::Bool, - FieldIndexType::UuidMatch => PayloadSchemaType::Uuid, - FieldIndexType::UuidRange => PayloadSchemaType::Uuid, + FieldIndexType::UuidMatch | FieldIndexType::UuidRange => PayloadSchemaType::Uuid, FieldIndexType::DatetimeRange => PayloadSchemaType::Datetime, FieldIndexType::Geo => PayloadSchemaType::Geo, } } }Then, you could also consider implementing
From<FieldIndexType> for PayloadSchemaParamsto generate more specific schema recommendations:impl From<FieldIndexType> for PayloadSchemaParams { fn from(val: FieldIndexType) -> Self { match val { FieldIndexType::IntMatch => PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(true), range: Some(false), ..Default::default() }), FieldIndexType::IntRange => PayloadSchemaParams::Integer(IntegerIndexParams { lookup: Some(false), range: Some(true), ..Default::default() }), // ... other conversions ... } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/collection/src/problems/unindexed_field.rs(11 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consensus-compose
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-consistency
- GitHub Check: test-low-resources
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: integration-tests
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: lint
- GitHub Check: storage-compat-test
🔇 Additional comments (3)
lib/collection/src/problems/unindexed_field.rs (3)
529-542: Good introduction of theFieldIndexTypeenum!This enum clearly represents specific index capabilities, making it easier to explicitly track lookup and range requirements. This approach is more maintainable than using
PayloadFieldSchemavectors directly.
163-166: Good refactoring of theall_indexesfunction using enum iterationUsing
FieldIndexType::iter()makes the code more maintainable, as adding new index types to the enum will automatically include them in the returned iterator.
397-410:Details
❌ Incorrect review comment
Consider checking all required index types instead of any
The
needs_indexmethod checks if any of the required index types are supported by the schema. In cases where multiple capabilities are needed, it might be more accurate to check if all required index types are supported.
🏁 Script executed:
#!/bin/bash # Let's check if there are cases where multiple index types are required rg -A5 -B5 "required_index.push\(.*\).*required_index.push\(" --type rust lib/collection/src/problems/unindexed_field.rsLength of output: 122
I’d like to inspect how
required_indexesis built and whereneeds_indexis called. Let’s pull in those contexts:
🏁 Script executed:
#!/bin/bash # 1) Find all invocations of .needs_index() rg -n -C3 "\.needs_index" --type rust lib/collection # 2) Locate where `required_indexes` is initialized and populated rg -n -C5 "let required_indexes" --type rust lib/collection/src/problemsLength of output: 1284
I want to see how
required_indexis constructed to know if it can ever contain multiple entries:
🏁 Script executed:
#!/bin/bash # Find where `required_index` is defined and populated rg -n 'required_index' -C5 --type rust lib/collection/src/problems/unindexed_field.rsLength of output: 6069
.any()is correct for OR-based index requirementsThe
required_indexesslice isn’t a list of all capabilities you must have, but rather a set of acceptable index types (i.e. any one of them will satisfy the query). Throughoutinfer_index_from_*andall_indexes(), you’re assembling a list of alternatives, not conjunctive requirements. Using.any()here correctly skips adding an index as soon as you find one supported type.No changes needed.
Likely an incorrect or invalid review comment.
…6496) * Fix UnindexedField infra to handle lookup&range index requirements * more generic approach * better test * restore * small unit test for sanity * review nits * Do not recommend parametrized index --------- Co-authored-by: Luis Cossío <luis.cossio@outlook.com>
…6496) * Fix UnindexedField infra to handle lookup&range index requirements * more generic approach * better test * restore * small unit test for sanity * review nits * Do not recommend parametrized index --------- Co-authored-by: Luis Cossío <luis.cossio@outlook.com>
Follow up to #6363 (comment)
This PR fills a hole in the missing index detection where the
lookup&rangecapabilities were not taken into account.Also no fix required for
Formulabecause of #6494