Skip to content

fix: improve datetime_range RFC3339 parse error#7919

Merged
timvisee merged 11 commits intoqdrant:devfrom
krapcys1-maker:fix-better-datetime-error-3531-dev
Jan 22, 2026
Merged

fix: improve datetime_range RFC3339 parse error#7919
timvisee merged 11 commits intoqdrant:devfrom
krapcys1-maker:fix-better-datetime-error-3531-dev

Conversation

@krapcys1-maker
Copy link
Contributor

@krapcys1-maker krapcys1-maker commented Jan 15, 2026

Fixes #3531

Problem

When datetime_range in REST/JSON filter contains an invalid datetime (e.g. missing timezone offset / malformed),
the API returned a generic serde error like:
data did not match any variant ... Condition.

Solution

  • Make datetime parsing errors explicit and user-friendly.
  • The error now includes the invalid value, mentions RFC3339, and provides an example format.
  • Prevent untagged enum deserialization from swallowing datetime parse errors.

Example error:
'2014-01-01T00:00:00BAD' does not match accepted datetime format (RFC3339). Example: 2014-01-01T00:00:00Z

Tests

  • cargo test -p segment
  • I checked there are no duplicate PRs for this issue.
  • I ran tests locally (cargo test -p segment).

/claim #3531

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f36467b and c16c0e6.

📒 Files selected for processing (1)
  • lib/segment/src/types.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/segment/src/types.rs
🧠 Learnings (2)
📚 Learning: 2025-08-23T22:24:44.276Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7100
File: tests/openapi/test_fts.py:255-263
Timestamp: 2025-08-23T22:24:44.276Z
Learning: In Qdrant's OpenAPI schema, the `filter.must` field accepts either a single Condition object or an array of Condition objects, not just arrays. The schema uses `anyOf` to allow both formats.

Applied to files:

  • lib/segment/src/types.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/segment/src/types.rs
🧬 Code graph analysis (1)
lib/segment/src/types.rs (1)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)
  • value (158-162)
🔇 Additional comments (4)
lib/segment/src/types.rs (4)

84-102: LGTM - clear RFC3339 error messaging with minor consideration.

The use of Cow<'de, str> and the improved error message format meet the PR objectives. The error now clearly shows the invalid value and provides an example.

One minor observation: the error states "RFC3339" format, but from_str (lines 107-136) actually accepts several formats beyond strict RFC3339 (e.g., YYYY-MM-DD HH:MM:SS without timezone). This is fine since RFC3339 is the canonical/recommended format, but worth noting for documentation purposes.


3303-3316: LGTM - exhaustive match handling.

The From<ConditionUntagged> for Condition implementation uses explicit exhaustive matching for all variants, which aligns with the coding guidelines to prefer explicit matches over catch-all _ arms. This ensures compile-time safety when new variants are added.


4077-4093: LGTM - good test coverage for the PR objective.

The test effectively validates that invalid datetime values in range filters produce user-friendly error messages containing:

  1. The "RFC3339" format identifier
  2. The actual invalid input value
  3. An example of the correct format

This directly addresses the issue #3531 requirement for descriptive error messages.


3370-3374: Remove the "filter" key check or fix the deserialization logic at line 3370.

The code checks if obj.contains_key("filter") and attempts to deserialize the entire value as Filter. However, this will always fail because Filter has #[serde(deny_unknown_fields)] and does not have a "filter" field—it expects must, should, must_not, min_should at the top level. In all test cases, Filter conditions are deserialized with fields directly present (e.g., {"must": [...], ...}), not wrapped under a "filter" key. The untagged fallback correctly handles this. Either remove this dead code or clarify the intended JSON structure with a test case.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR.

I have one review remark, please see below:

@krapcys1-maker
Copy link
Contributor Author

I pushed a small follow-up commit to reduce the boilerplate in the custom Condition deserializer.
The explicit contains_key routing is now wrapped in a tiny macro, keeping the same ordering (FieldCondition first) so datetime parse errors remain visible, but making it easier to maintain.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@lib/segment/src/types.rs`:
- Around line 84-99: The Deserialize impl for DateTimePayloadType always treats
input as a string which breaks binary formats; update the impl for
DateTimePayloadType::deserialize to check deserializer.is_human_readable() and
only parse strings via DateTimePayloadType::from_str when human-readable,
otherwise delegate to chrono::DateTime::<Utc>::deserialize(deserializer) (or
DateTime::deserialize) and convert that to DateTimePayloadType so
CBOR/MessagePack numeric timestamps remain compatible; keep the existing error
message/path for the human-readable branch and ensure you use the same
DateTimePayloadType::from_str conversion logic.

@qdrant qdrant deleted a comment from coderabbitai bot Jan 20, 2026
@timvisee
Copy link
Member

@krapcys1-maker could you please resolve the outstanding linter issues?

See:

cargo clippy --workspace --all-targets -- -D warnings

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 20, 2026

📝 Walkthrough

Walkthrough

The changes rework deserialization in lib/segment/src/types.rs: DateTimePayloadType gains explicit human-readable and binary deserialization paths (RFC3339-aware, clearer errors); RangeInterface and Condition remove derived Deserialize and receive custom Deserialize implementations that use private untagged enums (RangeInterfaceUntagged, ConditionUntagged) to distinguish float vs datetime ranges and to surface RFC3339 parse errors for FieldCondition. Several regression and binary roundtrip tests were added. Other files include large formatting-only edits and minor import/lint adjustments.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • generall
  • ffuugoo
  • timvisee
🚥 Pre-merge checks | ✅ 3 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR contains minor formatting changes in checkfs.rs and text_index.rs that appear to be outside the scope of improving datetime error messaging, though these changes are non-functional. Consider removing unrelated formatting changes in lib/common/memory/src/checkfs.rs and lib/segment/src/index/field_index/full_text_index/text_index.rs to keep the PR focused on the datetime error improvement.
Docstring Coverage ⚠️ Warning Docstring coverage is 73.08% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately summarizes the main change: improving the error message for invalid datetime_range values by making RFC3339 parse errors explicit and user-friendly.
Description check ✅ Passed The description is well-related to the changeset, clearly explaining the problem, solution, and the specific error improvement made in the PR.
Linked Issues check ✅ Passed The PR successfully implements the core requirement from #3531: transforming generic serde errors into clear, user-friendly datetime format errors with examples and RFC3339 references.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@krapcys1-maker krapcys1-maker force-pushed the fix-better-datetime-error-3531-dev branch from 80d4467 to 6fb5b12 Compare January 20, 2026 17:25
@krapcys1-maker
Copy link
Contributor Author

Hey! I fixed the clippy uninlined_format_args warnings in lib/segment/src/types.rs and added regression tests around datetime deserialization (binary + JSON fallback).
CI should be clean now.
I also reverted the unrelated formatting-only changes so the PR stays focused.

@timvisee
Copy link
Member

Thank you.

The approach is still more complex than I'd like for this purpose, but I guess we cannot easily simplify it because of how serde works. So I'm fine with keeping it like this.

@timvisee
Copy link
Member

Thanks a lot! Merging now. 😄

@timvisee timvisee merged commit 55352a5 into qdrant:dev Jan 22, 2026
15 checks passed
@krapcys1-maker
Copy link
Contributor Author

Thank you :)

generall pushed a commit that referenced this pull request Feb 9, 2026
* fix: improve REST datetime_range RFC3339 error message

* docs: add docstrings for datetime_range parsing

* fix: avoid serde_json intermediary for non-human-readable deserialization

* fix: avoid serde_json intermediary for non-human-readable range deserialization

* refactor(segment): reduce Condition JSON routing boilerplate

* fix(segment): preserve binary datetime deserialization compatibility

* fix(segment): preserve binary datetime deserialization

* test(segment): add datetime deserialization regression tests

* chore: rerun CI

* chore: rerun CI (2)

* refactor(segment): simplify datetime deserialization
@timvisee timvisee mentioned this pull request Feb 17, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants