Conversation
```rust
let request = if let Some(timeout) = timeout {
    request.timeout(timeout)
} else {
    request
};
```
We must also set a default timeout here.
This request uses the pre-configured service; it already has a timeout by default.
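The point of the discussion above can be sketched in isolation. This is a hypothetical stand-in (`RequestBuilder` and `apply_timeout` are not the real client API): the per-request timeout is applied only when the caller supplied one, and the client's own default timeout stays in effect otherwise.

```rust
use std::time::Duration;

// Minimal stand-in for an HTTP request builder; the real client (e.g. reqwest)
// carries its own default timeout configured at build time.
struct RequestBuilder {
    timeout: Option<Duration>,
}

impl RequestBuilder {
    fn timeout(mut self, t: Duration) -> Self {
        self.timeout = Some(t);
        self
    }
}

// Mirrors the reviewed snippet: override only when a value is present,
// otherwise leave the builder untouched so the client default governs.
fn apply_timeout(request: RequestBuilder, timeout: Option<Duration>) -> RequestBuilder {
    if let Some(timeout) = timeout {
        request.timeout(timeout)
    } else {
        request
    }
}

fn main() {
    let overridden = apply_timeout(RequestBuilder { timeout: None }, Some(Duration::from_secs(5)));
    assert_eq!(overridden.timeout, Some(Duration::from_secs(5)));

    let untouched = apply_timeout(RequestBuilder { timeout: None }, None);
    assert_eq!(untouched.timeout, None); // client default still applies
}
```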
* fix: inference timeout error
* chore: cleanup timeout from inference config
* cleanup timeout from InferenceConfig
* chore: revert config.toml changes
* inference_token -> inference_params for timeout propagation
* always use at least some timeout

Co-authored-by: generall <andrey@vasnetsov.com>
📝 Walkthrough

This pull request refactors the inference subsystem to introduce an intermediate `InferenceParams` abstraction.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~35 minutes
Pre-merge checks: ❌ 1 warning, ✅ 2 passed.
Actionable comments posted: 0
🧹 Nitpick comments (9)
src/tonic/auth.rs (1)

27-27: Consider adding an import for consistency. Using the full path `crate::common::inference::token::InferenceToken(None)` inline works, but adding an import at the top of the file would improve readability and align with typical Rust conventions.

```diff
 use crate::common::auth::{AuthError, AuthKeys};
+use crate::common::inference::token::InferenceToken;

 type Request = tonic::codegen::http::Request<tonic::transport::Body>;
```

Then simplify the inline usage:

```diff
-    let inference_token = crate::common::inference::token::InferenceToken(None);
+    let inference_token = InferenceToken(None);
```

src/common/inference/params.rs (1)
3-7: Consider adding a `use` statement for `Duration`. Using the fully qualified `std::time::Duration` inline is valid but verbose. A `use` statement would improve readability.

```diff
 use crate::common::inference::token::InferenceToken;
+use std::time::Duration;

 #[derive(Debug, Clone, PartialEq, Default)]
 pub struct InferenceParams {
     pub token: InferenceToken,
-    pub timeout: Option<std::time::Duration>,
+    pub timeout: Option<Duration>,
 }
```

src/actix/api/update_api.rs (1)
344-346: Unnecessary `.clone()` on `inference_token`. The `inference_token` is not used after this point, so it can be moved directly into `InferenceParams::new` without cloning.

```diff
     // Update operation doesn't have timeout yet
-    let inference_params = InferenceParams::new(inference_token.clone(), None);
+    let inference_params = InferenceParams::new(inference_token, None);
```

src/actix/api/query_api.rs (1)
192-226: Slight asymmetry between groups and non-groups `inference_params` usage. `query_points_groups` takes `InferenceParams` by value into `convert_query_groups_request_from_rest`, whereas the non-groups path passes a shared `&InferenceParams`. This is correct but causes an extra clone inside `from_batch_accum`.

If you want symmetry and slightly less cloning, you could change `convert_query_groups_request_from_rest` to accept `&InferenceParams` (mirroring `convert_query_request_from_rest`) and pass `&inference_params` here.

src/common/inference/infer_processing.rs (1)
11-70: `InferenceParams` ownership in `BatchAccumInferred` is correct; clone overhead is minor. Passing `InferenceParams` by value into `from_objects` and cloning it in `from_batch_accum` is functionally fine and keeps the `infer` API simple. Given `InferenceParams` is small (token + optional timeout), clone cost is negligible.

If you want to tighten the API, you could also make `from_batch_accum` take `InferenceParams` by value and avoid the internal `clone()`, updating call sites accordingly.

src/tonic/api/update_common.rs (1)
806-854: `sync` reuses `convert_point_struct` with `InferenceParams`; assumptions about no inference should be validated. `sync` now passes `InferenceParams` into `convert_point_struct` with `InferenceType::Update`. As the comment notes, this path assumes that syncing uses already-persisted points (no Document/Image/Object fields), so no actual inference calls should be triggered even though the parameters are available.

It's worth double-checking that current `SyncPoints` producers indeed only send persisted vector forms, so that `convert_point_struct`'s inference branch is never taken here; otherwise, inference timeouts and external calls could unexpectedly affect sync behavior.

src/common/inference/token.rs (1)
1-43: `InferenceToken` abstraction and extraction helpers look correct and consistent. The `InferenceToken` newtype, the Actix `FromRequest` impl (using `req.extensions().get::<InferenceToken>().cloned().unwrap_or_default()`), and the Tonic `extract_token` helper all default to `InferenceToken(None)` when no token is present, giving a uniform, failure-free extraction story across transports. This matches how the gRPC and Actix handlers now consume tokens.

As a tiny polish, you might rename the local `api_key` variable in `from_request` to `inference_token` to better reflect its type.

src/tonic/api/points_internal_api.rs (2)
459-464: Unnecessary clone of `inference_token`. The `inference_token.clone()` is unnecessary since `InferenceParams::new` takes `impl Into<Option<String>>`, which works with the token directly.

```diff
     let inference_token = extract_token(&request);
-    let inference_params = InferenceParams::new(inference_token.clone(), None);
+    let inference_params = InferenceParams::new(inference_token, None);
```
481-486: Unnecessary clone of `inference_token`. Same as above; the clone is not needed here.

```diff
     let inference_token = extract_token(&request);
-    let inference_params = InferenceParams::new(inference_token.clone(), None);
+    let inference_params = InferenceParams::new(inference_token, None);
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (18)
- src/actix/api/query_api.rs (5 hunks)
- src/actix/api/update_api.rs (5 hunks)
- src/common/auth/mod.rs (1 hunks)
- src/common/inference/config.rs (1 hunks)
- src/common/inference/infer_processing.rs (4 hunks)
- src/common/inference/mod.rs (1 hunks)
- src/common/inference/params.rs (1 hunks)
- src/common/inference/query_requests_grpc.rs (4 hunks)
- src/common/inference/query_requests_rest.rs (4 hunks)
- src/common/inference/service.rs (10 hunks)
- src/common/inference/token.rs (1 hunks)
- src/common/inference/update_requests.rs (8 hunks)
- src/common/update.rs (9 hunks)
- src/tonic/api/points_api.rs (13 hunks)
- src/tonic/api/points_internal_api.rs (12 hunks)
- src/tonic/api/query_common.rs (7 hunks)
- src/tonic/api/update_common.rs (10 hunks)
- src/tonic/auth.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/review-rules.md)
`**/*.rs`:
- Prefer explicit `SomeType::from(x)` over implicit `x.into()` in Rust
- In new code, don't use `transmute_from_u8`, `transmute_to_u8`, `transmute_from_u8_to_slice`, `transmute_from_u8_to_mut_slice`, `transmute_to_u8_slice`; use the `bytemuck` or `zerocopy` crates instead
- Prefer explicit exhaustive matches over a catch-all `_` arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
- Prefer explicit field ignoring using `: _` over using `..` in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)
Files:
- src/common/inference/query_requests_rest.rs
- src/common/inference/params.rs
- src/tonic/api/update_common.rs
- src/tonic/auth.rs
- src/common/inference/query_requests_grpc.rs
- src/tonic/api/points_api.rs
- src/common/inference/update_requests.rs
- src/common/inference/token.rs
- src/tonic/api/query_common.rs
- src/actix/api/query_api.rs
- src/common/auth/mod.rs
- src/actix/api/update_api.rs
- src/common/update.rs
- src/tonic/api/points_internal_api.rs
- src/common/inference/mod.rs
- src/common/inference/config.rs
- src/common/inference/service.rs
- src/common/inference/infer_processing.rs
🧠 Learnings (4)
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.
Applied to files:
src/tonic/api/update_common.rs, src/common/update.rs
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.
Applied to files:
src/common/inference/query_requests_grpc.rs, src/tonic/api/points_api.rs
📚 Learning: 2025-08-10T18:26:12.443Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: docs/redoc/master/openapi.json:13645-13652
Timestamp: 2025-08-10T18:26:12.443Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated from the REST schemas. To change field docs, edit lib/api/src/rest/schema.rs (e.g., add doc comments or #[schemars(description = ...)]). Specifically, UpdateVectors.update_filter lacked a description and should state: "If specified, only update vectors for points that match this filter; points not matching the filter are left unchanged."
Applied to files:
src/actix/api/update_api.rs
📚 Learning: 2025-08-10T18:31:56.855Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/point_ops.rs:501-528
Timestamp: 2025-08-10T18:31:56.855Z
Learning: In Qdrant, batch operations validate that `ids`, `vectors`, and `payloads` (if present) have matching lengths at the REST API level in `lib/api/src/rest/validate.rs` through the `Validate` trait implementation for `Batch`. This validation happens before data is converted to internal structures like `PointInsertOperationsInternal`, so methods operating on these internal structures can safely assume the lengths match.
Applied to files:
src/common/update.rs
🧬 Code graph analysis (12)
src/common/inference/query_requests_rest.rs (2)
- src/common/inference/infer_processing.rs (1): `from_batch_accum` (63-70)
- src/common/inference/batch_processing.rs (1): `collect_query_request` (168-196)

src/common/inference/params.rs (3)
- src/common/inference/service.rs (1): `new` (85-102)
- src/common/inference/config.rs (1): `new` (11-17)
- src/common/inference/token.rs (1): `new` (10-12)

src/tonic/api/update_common.rs (1)
- src/common/inference/update_requests.rs (1): `convert_point_struct` (17-156)

src/common/inference/query_requests_grpc.rs (1)
- src/common/inference/infer_processing.rs (1): `from_objects` (24-61)

src/tonic/api/points_api.rs (1)
- src/common/inference/token.rs (1): `extract_token` (38-43)

src/common/inference/update_requests.rs (1)
- src/common/inference/infer_processing.rs (1): `from_batch_accum` (63-70)

src/common/inference/token.rs (2)
- src/common/inference/config.rs (1): `new` (11-17)
- src/common/inference/params.rs (1): `new` (10-15)

src/tonic/api/query_common.rs (2)
- src/common/inference/query_requests_grpc.rs (2): `convert_query_points_from_grpc` (108-181), `convert_query_point_groups_from_grpc` (27-105)
- src/actix/api/query_api.rs (1): `query_points` (28-103)

src/actix/api/query_api.rs (1)
- src/common/inference/query_requests_rest.rs (2): `convert_query_groups_request_from_rest` (33-92), `convert_query_request_from_rest` (94-148)

src/common/update.rs (1)
- src/common/inference/update_requests.rs (3): `convert_batch` (158-207), `convert_point_struct` (17-156), `convert_point_vectors` (209-294)

src/common/inference/config.rs (2)
- src/common/inference/service.rs (1): `new` (85-102)
- src/common/inference/params.rs (1): `new` (10-15)

src/common/inference/service.rs (1)
- src/common/inference/config.rs (1): `new` (11-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: lint
- GitHub Check: e2e-tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consistency
- GitHub Check: Build Qdrant Edge Python bindings
- GitHub Check: integration-tests
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (40)
src/common/inference/query_requests_grpc.rs (2)
23-23: LGTM! The import and parameter changes correctly adopt `InferenceParams` for propagating inference configuration. Usage aligns with the `BatchAccumInferred::from_objects` signature shown in the relevant snippets.

Also applies to: 29-30, 63-66

108-111: LGTM! Consistent adoption of `InferenceParams` in `convert_query_points_from_grpc`, matching the pattern established in the grouped queries function.

Also applies to: 142-145
src/common/auth/mod.rs (1)
14-14: LGTM! Import path correctly updated to reference `InferenceToken` from its new dedicated module location.

src/actix/api/update_api.rs (3)
20-21: LGTM! Clean separation of imports for `InferenceParams` and `InferenceToken` from their respective modules.

52-54: LGTM! Good use of the `InferenceParams` wrapper with a clarifying comment explaining why timeout is `None`. The pattern correctly passes inference configuration downstream.

Also applies to: 62-62

133-134: LGTM! Consistent application of the `InferenceParams` pattern in `update_vectors`.

Also applies to: 142-142
src/common/inference/params.rs (1)

9-15: The original review comment is incorrect and should be disregarded. The coding guideline to prefer `SomeType::from(x)` over `x.into()` applies to concrete types, not opaque trait parameters. The function parameter `token: impl Into<InferenceToken>` has an unknown concrete type at compile time. The `.into()` method is the only correct pattern here; it uses return-type context to infer the target type. Using `InferenceToken::from(token)` would fail to compile because the compiler cannot resolve the concrete input type. The current code is idiomatic Rust and correct as-is.

Likely an incorrect or invalid review comment.
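The shape under discussion can be reproduced in a self-contained sketch (names mirror the PR; field types and the `From` impl are assumed for illustration). Inside `new`, `token.into()` compiles for any caller-supplied type implementing `Into<InferenceToken>`, which is exactly why the generic `.into()` form is used:

```rust
use std::time::Duration;

// Sketch of the reviewed types; details are assumed, not the exact PR code.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct InferenceToken(pub Option<String>);

impl From<Option<String>> for InferenceToken {
    fn from(token: Option<String>) -> Self {
        InferenceToken(token)
    }
}

#[derive(Debug, Clone, PartialEq, Default)]
pub struct InferenceParams {
    pub token: InferenceToken,
    pub timeout: Option<Duration>,
}

impl InferenceParams {
    // `token.into()` is required here: the concrete type behind
    // `impl Into<InferenceToken>` is unknown, so `InferenceToken::from(token)`
    // cannot be written in the general case.
    pub fn new(token: impl Into<InferenceToken>, timeout: Option<Duration>) -> Self {
        InferenceParams {
            token: token.into(),
            timeout,
        }
    }
}

fn main() {
    let params = InferenceParams::new(Some("key".to_string()), None);
    assert_eq!(params.token, InferenceToken(Some("key".to_string())));
    assert_eq!(params.timeout, None);
}
```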
src/actix/api/query_api.rs (2)
35-62: `InferenceParams` construction and reuse in `query_points` looks consistent. `inference_token` is injected via Actix `FromRequest`, wrapped into `InferenceParams` using `params.timeout()`, and the same params are reused for inference conversion and dispatcher calls. Ownership/borrows are sound and avoid repeated construction per query.

105-141: Batch handler reuses a single `InferenceParams` instance across all searches. In `query_points_batch`, constructing `inference_params` once and passing `&inference_params` into `convert_query_request_from_rest` for each item is efficient and keeps token/timeout consistent for the whole batch. No lifetime or aliasing issues here.

src/common/inference/config.rs (1)
3-16: Timeout as `Option<u64>` aligns with per-request overrides and keeps defaults intact. Switching `timeout` to `Option<u64>`, plus deriving `Default` and setting `timeout: None` in `new`, works cleanly with `InferenceService::new`, which already does `timeout.unwrap_or(DEFAULT_INFERENCE_TIMEOUT_SECS)`. This preserves existing defaults while allowing explicit config-level overrides and coexisting with per-request timeouts in `InferenceParams`.

src/common/inference/mod.rs (1)
3-15: New `params` and `token` modules are wired correctly into the inference namespace. Exporting `params` and `token` here aligns with the new `crate::common::inference::params::InferenceParams` and `crate::common::inference::token::InferenceToken` usages across Actix and Tonic paths. No issues seen.

src/common/inference/query_requests_rest.rs (2)
33-53: Groups REST conversion now uses `InferenceParams`; behavior remains consistent. `convert_query_groups_request_from_rest` collecting a batch, then calling `BatchAccumInferred::from_batch_accum(batch, InferenceType::Search, &inference_params)`, correctly threads token + optional timeout into inference. The rest of the function remains a pure mapping from inferred vectors to `CollectionQueryGroupsRequest`, so behavior is unchanged aside from the new timeout-aware inference context.

94-101: Regular REST query conversion correctly delegates to `BatchAccumInferred` with `InferenceParams`. `convert_query_request_from_rest` now accepts `&InferenceParams` and forwards it directly into `BatchAccumInferred::from_batch_accum`. This matches the updated signature in `infer_processing.rs` and keeps the function free of token/timeout details beyond passing the struct through.

src/tonic/api/points_api.rs (2)
64-88: Update operations now consistently pass `InferenceParams` into `update_common`. For `upsert`, `update_vectors`, and `update_batch`, extracting the token via `extract_token(&request)` and wrapping it as `InferenceParams::new(inference_token, None)` cleanly passes inference context into `update_common`. Sharing a single `InferenceParams` per RPC and cloning it in `update_batch` is reasonable given the struct size and keeps per-operation behavior consistent.

Also applies to: 133-159, 277-300

591-612: gRPC query methods now propagate per-request timeout into inference. In `query`, `query_batch`, and `query_groups`, `InferenceParams::new` is built from the extracted token plus `request.get_ref().timeout.map(Duration::from_secs)`, so the same timeout value drives both core query execution and inference. Access extraction, token extraction, and `into_inner()` ordering are all safe.

One nuance: `query_batch` still separately maps the destructured `timeout` into `Duration` for core search while also embedding the original `timeout` into `InferenceParams`. That duplication is intentional but worth keeping in mind if you later change the units or semantics of the RPC `timeout` field.

Also applies to: 618-649, 655-676
src/tonic/api/update_common.rs (1)
40-89: All call sites properly updated; no verification issues found. Verification confirms that the refactor from `InferenceToken` to `InferenceParams` is complete across all call sites:

- tonic upsert/update_vectors: extract token → construct `InferenceParams` → pass to functions ✓
- tonic update_batch: passes `InferenceParams` correctly ✓
- actix update_batch: receives `InferenceToken`, immediately converts to `InferenceParams` (line 345), passes the converted value ✓
- sync function: accepts and uses `inference_params` consistently with its comment that "no actual inference" occurs in sync itself (consistency only) ✓

The concern about outdated call sites is not applicable; the migration has been completed successfully.
src/tonic/api/query_common.rs (4)
36-36: LGTM! Import updated correctly to use the new `InferenceParams` type from the params module.

776-830: LGTM! The `query` function signature and implementation are correctly updated to use `InferenceParams`. The parameter is properly passed to `convert_query_points_from_grpc`.

832-894: LGTM! The `query_batch` function correctly uses `InferenceParams` and appropriately clones it for each iteration in the loop, which is necessary since it's consumed by each `convert_query_points_from_grpc` call.

896-950: LGTM! The `query_groups` function is correctly updated to use `InferenceParams` and passes it to `convert_query_point_groups_from_grpc`.

src/tonic/api/points_internal_api.rs (6)
35-36: LGTM! Imports correctly updated to reference the new module structure, with `InferenceParams` from `params` and `extract_token` from `token`.

57-82: LGTM! The `sync_internal` method is correctly updated to accept `InferenceParams` and pass it to the `sync` function.

84-110: LGTM! The `upsert_internal` method correctly uses `InferenceParams`. The clone on line 106 is appropriate since the params may be reused.

138-164: LGTM! The `update_vectors_internal` method is correctly updated to use `InferenceParams`.

557-580: LGTM! The `update_batch` implementation correctly creates `InferenceParams` once and clones it for each operation that requires inference. The comment explains the design choice.

797-804: LGTM! The `sync` method correctly creates `InferenceParams` with `None` timeout for internal operations and passes it through. No unnecessary clone here.

src/common/inference/update_requests.rs (5)
14-14: LGTM! Import correctly updated to use the new `InferenceParams` type.

17-52: LGTM! The `convert_point_struct` function is correctly updated to use `InferenceParams`. The reference passed to `from_batch_accum` aligns with its signature, which takes `&InferenceParams`.

158-207: LGTM! The `convert_batch` function correctly uses `InferenceParams` and clones it appropriately when iterating over named vectors, since each `convert_vectors` call consumes the params.

209-294: LGTM! The `convert_point_vectors` function is correctly updated to use `InferenceParams` with proper reference passing to `from_batch_accum`.

373-418: LGTM! The `convert_vectors` function is correctly updated to use `InferenceParams` with proper reference passing to `from_batch_accum`.

src/common/inference/service.rs (5)
19-19: LGTM! Import correctly added for the new `InferenceParams` type.

79-102: Default timeout of 10 minutes correctly implemented. This addresses the concern from previous review discussions about always having a timeout. The client is now built with a base timeout (from config or the 10-minute default), ensuring requests never wait indefinitely.
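The config-to-client step described here reduces to a small resolution function. This is a sketch under assumptions: the constant name mirrors the review, and 600 seconds (10 minutes) is the default value stated above, though the codebase value should be checked.

```rust
use std::time::Duration;

// Assumed to match DEFAULT_INFERENCE_TIMEOUT_SECS from the PR discussion
// (the review describes a 10-minute default).
const DEFAULT_INFERENCE_TIMEOUT_SECS: u64 = 600;

/// Sketch: turn the optional config value (`Option<u64>` seconds) into the
/// timeout the HTTP client is built with, so every request has an upper bound.
fn client_timeout(config_timeout_secs: Option<u64>) -> Duration {
    Duration::from_secs(config_timeout_secs.unwrap_or(DEFAULT_INFERENCE_TIMEOUT_SECS))
}

fn main() {
    // No config value: fall back to the default safety net.
    assert_eq!(client_timeout(None), Duration::from_secs(600));
    // Explicit config value wins.
    assert_eq!(client_timeout(Some(30)), Duration::from_secs(30));
}
```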
132-176: LGTM! The `infer` method is correctly updated to accept `InferenceParams` and propagate it to `infer_remote`.

178-242: Per-request timeout override logic looks correct. The implementation properly destructures `InferenceParams` and applies the per-request timeout on top of the base client timeout when provided. This allows callers to specify shorter timeouts for specific requests while still having the default 10-minute safety net.
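The destructure-then-override step can be illustrated with a hypothetical mirror of the logic (the struct is a simplified stand-in, not the PR's exact type): `base` is the timeout the HTTP client was built with, and a per-request value, when present, replaces it for that single call only.

```rust
use std::time::Duration;

// Simplified stand-in for the reviewed struct (token type assumed).
#[derive(Debug, Clone, Default)]
struct InferenceParams {
    token: Option<String>,
    timeout: Option<Duration>,
}

/// Resolve the timeout for one inference call: per-request value if given,
/// otherwise the client's base timeout.
fn per_request_timeout(base: Duration, params: &InferenceParams) -> Duration {
    // Explicit exhaustive destructuring (per the repo guideline): adding a
    // field to InferenceParams later forces a review of this function.
    let InferenceParams { token: _, timeout } = params;
    timeout.unwrap_or(base)
}

fn main() {
    let base = Duration::from_secs(600); // assumed 10-minute client default
    let defaults = InferenceParams::default();
    assert_eq!(per_request_timeout(base, &defaults), base);

    let fast = InferenceParams {
        token: None,
        timeout: Some(Duration::from_secs(3)),
    };
    assert_eq!(per_request_timeout(base, &fast), Duration::from_secs(3));
}
```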
545-564: LGTM! Test correctly updated to use the new `InferenceParams::new` constructor with `None` timeout.

src/common/update.rs (4)
29-29: LGTM! Import correctly updated to use the new `InferenceParams` type from the params module.

277-340: LGTM! The `do_upsert_points` function is correctly updated to use `InferenceParams`. The parameter is properly passed to both `convert_batch` and `convert_point_struct`, depending on the operation type.

379-423: LGTM! The `do_update_vectors` function is correctly updated to use `InferenceParams` and passes it to `convert_point_vectors`.

656-800: LGTM! The `do_batch_update_points` function correctly uses `InferenceParams` and appropriately clones it for each `Upsert` and `UpdateVectors` operation in the batch loop.
All Submissions:
- Did you create your branch from `dev`?

New Feature Submissions:
- Have you formatted your code with the `cargo +nightly fmt --all` command prior to submission?
- Have you checked your code using the `cargo clippy --all --all-features` command?

Changes to Core Features: