Conversation
| match self { | ||
| Expression::Constant(_) => Ok(()), | ||
| Expression::Variable(_) => Ok(()), | ||
| Expression::Condition(condition) => condition.validate(), |
There was a problem hiding this comment.
make sure the Expression use a valid Condition
| Query::Discover(discover) => discover.discover.validate(), | ||
| Query::Context(context) => context.context.validate(), | ||
| Query::Fusion(fusion) => fusion.fusion.validate(), | ||
| Query::Nearest(vector) => vector.validate(), |
There was a problem hiding this comment.
those lines make sure to use the validation at the struct level to make it fool proof when adding new fields
| | grpc::vector_input::Variant::Document(_) | ||
| | grpc::vector_input::Variant::Image(_) | ||
| | grpc::vector_input::Variant::Object(_) => Ok(()), | ||
| grpc::vector_input::Variant::Sparse(sparse_vector) => sparse_vector.validate(), |
There was a problem hiding this comment.
leverage existing sparse and multivector validation
📝 WalkthroughWalkthroughThis change introduces comprehensive validation support for various data structures used in both gRPC and REST APIs. It adds Suggested reviewers
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
lib/api/build.rs (1)
273-301: Consider deduplicating and auto-generating massive “empty-constraint” listsThe block adds ~30 more
("", "")entries whose only purpose is to attach#[validate(nested)].
While functionally correct, hand-maintaining hundreds of string paths is error-prone (easy to mistype or forget on proto evolution) and bloats the build script.Suggested direction (no immediate change required, but worth a backlog ticket):
- .validates(&[ - ("VectorInput.variant", ""), - ("RecommendInput.positive", ""), - … - ], &[]) + // Auto-collect every `(Message, field)` where `field` is a oneof or a nested message + // and apply `#[validate(nested)]` in a loop. + .validates(generate_nested_field_list()?, &[])where
generate_nested_field_list()uses a small introspection table (or a JSON file produced byprost-build) instead of hard-coded strings.This keeps the script readable, guarantees coverage, and eliminates future maintenance headaches.
lib/api/src/rest/validate.rs (1)
250-273: Expression validation duplicates gRPC logic – consider a shared helperThe implementation is correct but duplicates almost identical code in
grpc/validate.rs.
A small helper likefn validate_expression<E: Into<ValidationErrors>>(expr: &Expression)…(or a macro) could remove duplication and keep REST & gRPC layers in sync automatically.lib/api/src/rest/schema.rs (2)
517-523: Consider addingValidatederive to FormulaQuery.Unlike other query types,
FormulaQueryis missing theValidatederive macro despite having a custom validation implementation inlib/api/src/rest/validate.rs. Adding this derive would maintain consistency with how other query types are handled.-#[derive(Debug, Serialize, Deserialize, JsonSchema)] +#[derive(Debug, Serialize, Deserialize, JsonSchema, Validate)] pub struct FormulaQuery { pub formula: Expression, #[serde(default)] pub defaults: HashMap<String, Value>, }
789-801: Consider adding validation constraints for numeric parameters.The
scaleandmidpointfields inDecayParamsExpressionmight benefit from validation constraints, especially since the comment mentions thatscale"Must be a non-zero positive number."/// The scale factor of the decay, in terms of `x`. Defaults to 1.0. Must be a non-zero positive number. +#[validate(range(min = "f32::EPSILON", message = "scale must be positive and non-zero"))] pub scale: Option<f32>, /// The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`. +#[validate(range(min = 0.0, max = 1.0, message = "midpoint must be between 0.0 and 1.0"))] pub midpoint: Option<f32>,
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
lib/api/build.rs(2 hunks)lib/api/src/grpc/qdrant.rs(9 hunks)lib/api/src/grpc/validate.rs(1 hunks)lib/api/src/rest/schema.rs(3 hunks)lib/api/src/rest/validate.rs(4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
lib/api/src/grpc/validate.rs (1)
lib/api/src/rest/validate.rs (13)
validate(15-21)validate(25-30)validate(34-45)validate(49-59)validate(63-83)validate(87-93)validate(97-101)validate(105-133)validate(137-142)validate(146-150)validate(154-170)validate(174-232)validate(236-247)
lib/api/src/rest/schema.rs (1)
lib/api/src/rest/validate.rs (13)
validate(15-21)validate(25-30)validate(34-45)validate(49-59)validate(63-83)validate(87-93)validate(97-101)validate(105-133)validate(137-142)validate(146-150)validate(154-170)validate(174-232)validate(236-247)
🔇 Additional comments (14)
lib/api/src/grpc/validate.rs (2)
337-349:query::Variantvalidation looks complete – just ensure new variants fail to compileGood pattern: by enumerating every variant, the compiler enforces updates when proto changes. 👍
368-398: Expression variant validation mirrors REST – looks goodThe exhaustive match properly cascades validation into nested expressions and will fail to compile on new variants. Nicely done.
lib/api/src/rest/validate.rs (1)
106-110: Great catch: formula expression now validated firstAdding
formula.validate()?ensures invalid expressions are rejected before inspecting defaults. ✔️lib/api/src/grpc/qdrant.rs (7)
4126-4133: LGTM: Adding validation for VectorInputThe addition of
validator::Validatederive andvalidate(nested)attribute is a good approach for ensuring the validation of vector inputs in queries, which is a core part of this PR's objective.
5128-5193: Well-structured validation for query input typesThe systematic addition of validation to the recommendation, context, and discovery input types ensures proper validation across all the query input components. This is especially important for complex nested structures like these, where validation errors could otherwise be missed.
5203-5204: Good catch: Adding nested validation for Expression variantsAdding validation for expression variants ensures that all types of expressions in queries will be properly validated, which is crucial for the correctness of complex filter conditions.
5279-5342: Comprehensive validation for expression operatorsThe validation has been thoroughly applied to all mathematical expression types (MultExpression, SumExpression, DivExpression, PowExpression, DecayParamsExpression). This systematic approach ensures that complex expressions used in queries will be properly validated.
5343-5350: Core improvement: Query validationAdding validation to the top-level Query struct with nested validation for its variants is a central component of this PR. This ensures that all query types will be properly validated regardless of which variant is used.
5395-5396: Consistent validation across different request typesAdding nested validation to query fields in different request structures ensures consistent validation behavior across the API, which aligns perfectly with the PR objective.
Also applies to: 5434-5435, 5511-5512
9786-9789: Extended validation to QueryShardPointsAdding validation to the nested message types within QueryShardPoints completes the comprehensive validation coverage across all parts of the Query API.
lib/api/src/rest/schema.rs (4)
475-530: Well-structured validation added to query types.The addition of
Validatederive macros and#[validate(nested)]annotations to the query wrapper structs ensures proper validation of nested components in complex queries. This is a good approach to ensure data integrity throughout the validation chain.
695-769: Comprehensive validation for expression types.The addition of validation to all expression types ensures consistent validation across the entire expression tree. This is particularly important for complex mathematical expressions where nested validation helps catch errors at all levels of the expression hierarchy.
771-787: Good use of nested validation for decay expressions.Adding validation to the different decay expression types provides consistent validation for these specialized functions. The nested validation will ensure that all components of these expressions are properly validated.
475-787:Details
✅ Verification successful
Verify comprehensive validation for all query paths.
This change adds validation to numerous structs, but let's ensure all code paths in the
validatemethod forQueryenum are covered by these changes.
🏁 Script executed:
#!/bin/bash # Verify all query variants are properly validated # Check if all validation functions are implemented for query types echo "Checking Query validation implementation in validate.rs:" rg -A 10 "impl Validate for Query" lib/api/src/rest/validate.rs # Check which query types have the Validate derive echo -e "\nChecking Query types with Validate derive in schema.rs:" rg "#\[derive.*Validate.*\]" lib/api/src/rest/schema.rs | grep -E "struct (.*Query|.*Expression)"Length of output: 1276
🏁 Script executed:
#!/bin/bash # List all variants of the Query enum in schema.rs echo "=== schema.rs: enum Query variants ===" rg -A 20 "enum Query" lib/api/src/rest/schema.rs # Show the full impl Validate for Query in validate.rs echo -e "\n=== validate.rs: impl Validate for Query ===" rg -A 20 "impl Validate for Query" lib/api/src/rest/validate.rsLength of output: 2233
🏁 Script executed:
#!/bin/bash # Print the full Query enum definition to verify all variants rg -n "pub enum Query" -A50 lib/api/src/rest/schema.rsLength of output: 1845
All
Queryvariants are properly validated
TheValidatederive on each*Querystruct and theimpl Validate for Queryinvalidate.rscover every enum variant (Nearest,Recommend,Discover,Context,OrderBy,Fusion,Formula, andSample). No validation paths are missing.
| impl Validate for super::qdrant::vector_input::Variant { | ||
| fn validate(&self) -> Result<(), ValidationErrors> { | ||
| match self { | ||
| grpc::vector_input::Variant::Id(_) | ||
| | grpc::vector_input::Variant::Dense(_) | ||
| | grpc::vector_input::Variant::Document(_) | ||
| | grpc::vector_input::Variant::Image(_) | ||
| | grpc::vector_input::Variant::Object(_) => Ok(()), | ||
| grpc::vector_input::Variant::Sparse(sparse_vector) => sparse_vector.validate(), | ||
| grpc::vector_input::Variant::MultiDense(multi_dense_vector) => { | ||
| multi_dense_vector.validate() | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Skip-validation for several variants may hide future bugs
Dense, Document, Image, and Object variants currently bypass validation even though their inner types may implement Validate (REST side already calls it).
For consistency and to avoid silent acceptance of invalid payloads, delegate validation to the nested messages just like you do for Sparse and MultiDense.
- grpc::vector_input::Variant::Id(_)
- | grpc::vector_input::Variant::Dense(_)
- | grpc::vector_input::Variant::Document(_)
- | grpc::vector_input::Variant::Image(_)
- | grpc::vector_input::Variant::Object(_) => Ok(()),
+ grpc::vector_input::Variant::Id(_) => Ok(()),
+ grpc::vector_input::Variant::Dense(dense) => dense.validate(),
+ grpc::vector_input::Variant::Document(doc) => doc.validate(),
+ grpc::vector_input::Variant::Image(image) => image.validate(),
+ grpc::vector_input::Variant::Object(obj) => obj.validate(),This mirrors the REST implementation and guarantees that any future validation added to those structs is actually executed.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| impl Validate for super::qdrant::vector_input::Variant { | |
| fn validate(&self) -> Result<(), ValidationErrors> { | |
| match self { | |
| grpc::vector_input::Variant::Id(_) | |
| | grpc::vector_input::Variant::Dense(_) | |
| | grpc::vector_input::Variant::Document(_) | |
| | grpc::vector_input::Variant::Image(_) | |
| | grpc::vector_input::Variant::Object(_) => Ok(()), | |
| grpc::vector_input::Variant::Sparse(sparse_vector) => sparse_vector.validate(), | |
| grpc::vector_input::Variant::MultiDense(multi_dense_vector) => { | |
| multi_dense_vector.validate() | |
| } | |
| } | |
| } | |
| } | |
| impl Validate for super::qdrant::vector_input::Variant { | |
| fn validate(&self) -> Result<(), ValidationErrors> { | |
| match self { | |
| grpc::vector_input::Variant::Id(_) => Ok(()), | |
| grpc::vector_input::Variant::Dense(dense) => dense.validate(), | |
| grpc::vector_input::Variant::Document(doc) => doc.validate(), | |
| grpc::vector_input::Variant::Image(image) => image.validate(), | |
| grpc::vector_input::Variant::Object(obj) => obj.validate(), | |
| grpc::vector_input::Variant::Sparse(sparse_vector) => sparse_vector.validate(), | |
| grpc::vector_input::Variant::MultiDense(multi_dense_vector) => { | |
| multi_dense_vector.validate() | |
| } | |
| } | |
| } | |
| } |
This PR adds missing Query API validation.
The following existing validation are reused for all types of queries.