[phrase match] Generics improvements in PostingList<V>#6597
Conversation
5c07860 to
c6595e2
Compare
a2be814 to
87a6706
Compare
87a6706 to
6f1afb1
Compare
330820e to
b6b495f
Compare
📝 WalkthroughWalkthroughThis change refactors the posting list module by unifying value handling under a new Possibly related PRs
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
lib/posting_list/src/value_handler.rs (1)
8-13: Well-designed PostingValue trait.The trait elegantly associates value types with their handlers, enabling the simplified API. The
Cloneconstraint is reasonable for value types used in posting lists.Consider adding a documentation example showing how users would implement this trait for custom types like
String:/// Example implementation for String: /// ```rust /// impl PostingValue for String { /// type Handler = UnsizedHandler<String>; /// } /// ```lib/posting_list/src/view.rs (1)
151-191: Method signatures correctly updated, but verbose type paths impact readability.All the size calculations and return types are correctly updated to use the associated types.
Consider adding type aliases at the module level to improve readability:
type SizedValue<V> = <V::Handler as ValueHandler>::Sized; type VarSizeData<V> = <V::Handler as ValueHandler>::VarSizeData;This would simplify signatures like:
-pub(crate) fn get_chunk_unchecked( - &self, - chunk_idx: usize, -) -> &PostingChunk<<V::Handler as ValueHandler>::Sized> { +pub(crate) fn get_chunk_unchecked( + &self, + chunk_idx: usize, +) -> &PostingChunk<SizedValue<V>> {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
lib/posting_list/src/builder.rs(4 hunks)lib/posting_list/src/iterator.rs(4 hunks)lib/posting_list/src/lib.rs(2 hunks)lib/posting_list/src/posting_list.rs(5 hunks)lib/posting_list/src/tests.rs(5 hunks)lib/posting_list/src/value_handler.rs(4 hunks)lib/posting_list/src/view.rs(7 hunks)lib/posting_list/src/visitor.rs(6 hunks)
🧰 Additional context used
🧠 Learnings (1)
lib/posting_list/src/iterator.rs (1)
Learnt from: coszio
PR: qdrant/qdrant#6565
File: lib/posting_list/src/iterator.rs:0-0
Timestamp: 2025-05-26T14:50:13.183Z
Learning: The `advance_until_greater_or_equal` method in `PostingIterator` in `lib/posting_list/src/iterator.rs` is intentionally designed with different semantics than `next()` and can output the same value multiple times. This is intended behavior for a search method that finds the first element with ID >= target_id.
🧬 Code Graph Analysis (1)
lib/posting_list/src/posting_list.rs (5)
lib/posting_list/src/view.rs (1)
visitor(64-66)lib/common/common/src/counter/conditioned_counter.rs (2)
never(22-24)new(13-19)lib/posting_list/src/iterator.rs (1)
new(16-22)lib/posting_list/src/builder.rs (1)
new(24-26)lib/posting_list/src/visitor.rs (1)
new(21-27)
⏰ Context from checks skipped due to timeout of 90000ms (13)
- GitHub Check: lint
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: storage-compat-test
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests
- GitHub Check: integration-tests-consensus
🔇 Additional comments (35)
lib/posting_list/src/iterator.rs (6)
6-6: LGTM: Clean import addition for PostingValue trait.The import aligns with the refactoring objective of using the PostingValue trait instead of ValueHandler.
9-13: LGTM: Proper generic parameter refactoring.The change from
H: ValueHandlertoV: PostingValuecorrectly implements the new trait abstraction. The struct fields are updated consistently with the new generic parameter.
15-22: LGTM: Implementation block correctly updated.The implementation signature and constructor method are properly updated to use the new
V: PostingValueconstraint.
30-33: LGTM: Return type correctly updated.The method signature properly uses
PostingElement<V>instead of the previous handler-based type, maintaining the same functionality while simplifying the type signature.
63-64: LGTM: Iterator trait implementation updated correctly.The trait implementation properly uses
PostingElement<V>as the associated type, maintaining consistency with the new PostingValue abstraction.
86-92: LGTM: Additional trait implementations updated consistently.Both
ExactSizeIteratorandFusedIteratorimplementations are properly updated to use the new generic parameter constraint.lib/posting_list/src/visitor.rs (5)
4-4: LGTM: Import updated for new trait abstraction.The import correctly includes PostingValue alongside ValueHandler, supporting the trait migration.
9-10: LGTM: Struct definition properly refactored.The generic parameter change from
H: ValueHandlertoV: PostingValueand corresponding field type update are correct and consistent.
20-21: LGTM: Implementation block updated correctly.The constructor and implementation signature properly use the new PostingValue constraint.
164-169: Excellent: Core architectural improvement implemented.The change from
H::get_valuetoV::Handler::get_valuedemonstrates the key benefit of this refactoring. Instead of explicitly passing handler types as generics, the PostingValue trait now associates value types with their appropriate handlers internally. This achieves the PR objective of simplifying type signatures while maintaining functionality.Also applies to: 178-183
193-195: LGTM: IntoIterator implementation updated consistently.The trait implementation properly uses the new PostingValue-based types for Item and IntoIter associated types.
lib/posting_list/src/tests.rs (6)
8-8: LGTM: Import updated for new trait abstraction.The imports correctly include PostingValue and related traits needed for the refactored API.
14-16: LGTM: TestString correctly implements PostingValue.The implementation properly associates TestString with UnsizedHandler as its handler type, demonstrating the new trait pattern where value types are directly associated with their handlers.
35-35: Excellent: API simplification demonstrated.The function call is now much cleaner, using just the value type
()instead of requiring explicit handler type parameters. This shows the user-facing benefit of the refactoring.
53-61: LGTM: Consistent API simplification across tests.All test calls are updated to use the simplified API without explicit handler types, demonstrating the improved usability mentioned in the PR objectives.
Also applies to: 67-67
108-111: Excellent: Function signature greatly simplified.The change from dual generic parameters
<H, V>to single<V: PostingValue>parameter dramatically simplifies the function signature while maintaining the same functionality. This is exactly the type of improvement highlighted in the PR objectives.
127-127: LGTM: Builder API call updated correctly.The change from
build_generictobuildreflects the unified API where the PostingValue trait eliminates the need for separate build methods for different handler types.lib/posting_list/src/lib.rs (3)
18-18: LGTM: Debug trait requirement is reasonable.Adding
std::fmt::Debugto theSizedValuetrait bound is a sensible addition that will improve debugging capabilities without breaking existing implementations.
33-33: Excellent: Type aliases dramatically simplified.The change from
PostingList<SizedHandler<()>>toPostingList<()>demonstrates the core benefit of this refactoring. The type signatures are much cleaner and more intuitive for users, achieving the PR objective of removing unnecessary wrapper types.Also applies to: 36-36
40-40: LGTM: Re-exports updated for new trait abstraction.The addition of
PostingValueto the re-exports properly exposes the new trait while maintaining access to the handler types for advanced use cases.lib/posting_list/src/posting_list.rs (4)
1-1: Good addition of the Borrow trait.The import of
std::borrow::Borrowis necessary for the new ownership model ofvar_size_data, allowing the owned type to be borrowed as a reference.
17-26: Clear and helpful documentation update.The updated documentation clearly explains the new simplified API and how to use
PostingList<V>with thePostingValuetrait. The references to the value handlers are particularly helpful for users.
28-35: Elegant refactoring using associated types.The change from generic parameter
H: ValueHandlertoV: PostingValuewith associated types successfully simplifies the API while maintaining type safety. The internal fields correctly useV::Handlerto access the appropriate handler types.
102-102: Correct use of borrow() for the new ownership model.The call to
borrow()onvar_size_datais necessary to convert from the owned type to the borrowed form expected byPostingListView::from_components.lib/posting_list/src/builder.rs (2)
36-39: Excellent unification of the build method.The refactoring successfully consolidates multiple build methods into a single
build()method using thePostingValuetrait constraint. This significantly simplifies the API and improves usability.
49-49: Clean abstraction using the handler's process_values method.The use of
V::Handler::process_valueselegantly delegates value processing to the appropriate handler determined by thePostingValueimplementation.lib/posting_list/src/value_handler.rs (2)
37-40: Clever use of ToOwned for VarSizeData type.The
VarSizeDataassociated type withToOwnedconstraint is an elegant solution that allows:
()for sized values (no variable data needed)[u8]for unsized values (owned asVec<u8>, borrowed as&[u8])This design provides excellent type safety and zero-cost abstraction for sized values.
16-24: Good default implementations for common types.Providing
PostingValueimplementations for(),u32, andu64covers the common use cases withSizedHandler.lib/posting_list/src/view.rs (7)
11-13: LGTM! Import changes align with the refactoring.The addition of
PostingValueimport is necessary for the new trait-based design.
16-24: Struct refactoring correctly implements the new trait-based design.The generic parameter change from handler to value type and the addition of
PhantomDataare well-implemented. The verbose type paths<V::Handler as ValueHandler>::Sizedare necessary given Rust's type system constraints when accessing associated types through traits.
26-32: Good generalization ofPostingListComponentsto support different data types.The addition of the second generic parameter
Dforvar_size_dataenables the memory optimization mentioned in the PR objectives, where unit types don't need a vector or slice.
34-52: Efficient handling of unit type with zero-sizevar_size_data.Setting
var_size_datato&()forIdsPostingListViewachieves the memory optimization goal mentioned in the PR objectives.
81-104: Return type correctly reflects the new generic structure.The explicit generic parameters in the return type properly expose the internal structure while maintaining type safety.
232-235: Binary search complexity measurement correctly updated.The size calculation properly uses the new type path while maintaining the logarithmic complexity measurement.
70-79:Details
✅ Verification successful
Correct adaptation of
to_ownedmethod for the new generic design.The change from
to_vec()toto_owned()on line 74 is appropriate sincevar_size_datais now a generic type.Ensure that the
ToOwnedtrait bound is properly enforced on<V::Handler as ValueHandler>::VarSizeData. Run the following script to verify:
🏁 Script executed:
#!/bin/bash # Description: Verify that VarSizeData associated type has ToOwned bound in the trait definition # Search for the ValueHandler trait definition and check for ToOwned bound ast-grep --pattern 'trait ValueHandler { $$$ type VarSizeData$_; $$$ }' # Also check if there's a separate bound defined elsewhere rg -A 5 "type VarSizeData.*ToOwned"Length of output: 691
Verified
ToOwnedbound onVarSizeData—no changes needed.The
ValueHandlertrait inlib/posting_list/src/value_handler.rsalready declares:type VarSizeData: ToOwned<Owned: Clone + Debug> + Debug + ?Sized;so the use of
to_owned()in theto_ownedmethod is fully supported and correctly constrained.
PostingList<V>PostingList<V>
* make generic posting list lighter when there is no var_size_data needed * simplify PostingList generics * fix test * self review
Builds on top of #6583
Initially, the usage of PostingList, needed to wrap the value over one generic, but this becomes annoying from the user side, and hurts readability. It is possible to avoid this at the cost of implementing two traits for the value type.
For example, for a
Stringit would be:And this way the type signature will be
PostingList<String>instead ofPostingList<UnsizedHandler<String>>Additionally, now PostingListView<()> and PostingList<()> are 16 and 24 bytes smaller by removing the need for a vec/slice for var-size data if not needed