fix: improve datetime_range RFC3339 parse error#7919
Conversation
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/segment/src/types.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/review-rules.md)
**/*.rs: Prefer explicitSomeType::from(x)over implicitx.into()in Rust
In new code, don't usetransmute_from_u8,transmute_to_u8,transmute_from_u8_to_slice,transmute_from_u8_to_mut_slice,transmute_to_u8_slice- usebytemuckorzerocopycrates instead
Prefer explicit exhaustive matches over catch-all_arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using: _over using..in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)
Files:
lib/segment/src/types.rs
🧠 Learnings (2)
📚 Learning: 2025-08-23T22:24:44.276Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7100
File: tests/openapi/test_fts.py:255-263
Timestamp: 2025-08-23T22:24:44.276Z
Learning: In Qdrant's OpenAPI schema, the `filter.must` field accepts either a single Condition object or an array of Condition objects, not just arrays. The schema uses `anyOf` to allow both formats.
Applied to files:
lib/segment/src/types.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.
Applied to files:
lib/segment/src/types.rs
🧬 Code graph analysis (1)
lib/segment/src/types.rs (1)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)
value(158-162)
🔇 Additional comments (4)
lib/segment/src/types.rs (4)
84-102: LGTM - clear RFC3339 error messaging with minor consideration.The use of
Cow<'de, str>and the improved error message format meet the PR objectives. The error now clearly shows the invalid value and provides an example.One minor observation: the error states "RFC3339" format, but
from_str(lines 107-136) actually accepts several formats beyond strict RFC3339 (e.g.,YYYY-MM-DD HH:MM:SSwithout timezone). This is fine since RFC3339 is the canonical/recommended format, but worth noting for documentation purposes.
3303-3316: LGTM - exhaustive match handling.The
From<ConditionUntagged> for Conditionimplementation uses explicit exhaustive matching for all variants, which aligns with the coding guidelines to prefer explicit matches over catch-all_arms. This ensures compile-time safety when new variants are added.
4077-4093: LGTM - good test coverage for the PR objective.The test effectively validates that invalid datetime values in range filters produce user-friendly error messages containing:
- The "RFC3339" format identifier
- The actual invalid input value
- An example of the correct format
This directly addresses the issue
#3531requirement for descriptive error messages.
3370-3374: Remove the "filter" key check or fix the deserialization logic at line 3370.The code checks
if obj.contains_key("filter")and attempts to deserialize the entirevalueasFilter. However, this will always fail becauseFilterhas#[serde(deny_unknown_fields)]and does not have a "filter" field—it expectsmust,should,must_not,min_shouldat the top level. In all test cases,Filterconditions are deserialized with fields directly present (e.g.,{"must": [...], ...}), not wrapped under a "filter" key. The untagged fallback correctly handles this. Either remove this dead code or clarify the intended JSON structure with a test case.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
timvisee
left a comment
There was a problem hiding this comment.
Thank you for the PR.
I have one review remark, please see below:
|
I pushed a small follow-up commit to reduce the boilerplate in the custom Condition deserializer. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@lib/segment/src/types.rs`:
- Around line 84-99: The Deserialize impl for DateTimePayloadType always treats
input as a string which breaks binary formats; update the impl for
DateTimePayloadType::deserialize to check deserializer.is_human_readable() and
only parse strings via DateTimePayloadType::from_str when human-readable,
otherwise delegate to chrono::DateTime::<Utc>::deserialize(deserializer) (or
DateTime::deserialize) and convert that to DateTimePayloadType so
CBOR/MessagePack numeric timestamps remain compatible; keep the existing error
message/path for the human-readable branch and ensure you use the same
DateTimePayloadType::from_str conversion logic.
|
@krapcys1-maker could you please resolve the outstanding linter issues? See: cargo clippy --workspace --all-targets -- -D warnings |
📝 WalkthroughWalkthroughThe changes rework deserialization in lib/segment/src/types.rs: DateTimePayloadType gains explicit human-readable and binary deserialization paths (RFC3339-aware, clearer errors); RangeInterface and Condition remove derived Deserialize and receive custom Deserialize implementations that use private untagged enums (RangeInterfaceUntagged, ConditionUntagged) to distinguish float vs datetime ranges and to surface RFC3339 parse errors for FieldCondition. Several regression and binary roundtrip tests were added. Other files include large formatting-only edits and minor import/lint adjustments. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
80d4467 to
6fb5b12
Compare
|
Hey! I fixed the clippy uninlined_format_args warnings in lib/segment/src/types.rs and added regression tests around datetime deserialization (binary + JSON fallback). |
|
Thank you. The approach is still more complex than I'd like for this purpose, but I guess we cannot easily simplify it because of how serde works. So I'm fine with keeping it like this. |
|
Thanks a lot! Merging now. 😄 |
|
Thank you :) |
* fix: improve REST datetime_range RFC3339 error message * docs: add docstrings for datetime_range parsing * fix: avoid serde_json intermediary for non-human-readable deserialization * fix: avoid serde_json intermediary for non-human-readable range deserialization * refactor(segment): reduce Condition JSON routing boilerplate * fix(segment): preserve binary datetime deserialization compatibility * fix(segment): preserve binary datetime deserialization * test(segment): add datetime deserialization regression tests * chore: rerun CI * chore: rerun CI (2) * refactor(segment): simplify datetime deserialization
Fixes #3531
Problem
When
datetime_rangein REST/JSON filter contains an invalid datetime (e.g. missing timezone offset / malformed),the API returned a generic serde error like:
data did not match any variant ... Condition.Solution
Example error:
'2014-01-01T00:00:00BAD' does not match accepted datetime format (RFC3339). Example: 2014-01-01T00:00:00ZTests
cargo test -p segmentcargo test -p segment)./claim #3531