feat(clp-s::timestamp_parser): Add support for parsing date-time and epoch timestamps without timezones.#1564
Conversation
… expected content.
…crosecond parsing.
…on numeric patterns.
WalkthroughAdds bounded and variable-length numeric-prefix parsers, new time-component constants and parse state, AM/PM and 12/24‑hour parsing, subsecond and epoch token handling with scaling and validation, plus expanded tests covering many parsing patterns and expected epoch outputs. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Suite
participant Parser as TimestampParser
participant Utils as NumericUtils
participant State as ParseState
participant Epoch as EpochAssembler
Test->>Parser: parse(input, pattern)
Parser->>State: initialize parse state (parsed_*, flags)
alt pattern contains date/time tokens
Parser->>Utils: parse numeric prefix (bounded/unbounded)
Utils-->>Parser: value + consumed_length or error
Parser->>State: assign parsed_hour/minute/second/subsecond
alt AM/PM token present
Parser->>State: set optional_part_of_day_idx and uses_12_hour_clock
end
end
alt pattern contains epoch token
Parser->>Utils: parse epoch numeric token
Utils-->>Parser: epoch_value
Parser->>Epoch: scale to nanoseconds -> parsed_epoch_nanoseconds
end
Parser->>State: validate ranges and mode consistency
Parser->>Epoch: combine date/time + subsecond -> final_epoch_ns
Epoch-->>Parser: final_epoch_ns or error
Parser-->>Test: return (pattern, final_epoch_ns) or error
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp(8 hunks)components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp(6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/TimestampParser.cpp
🧠 Learnings (4)
📚 Learning: 2025-01-30T19:26:33.869Z
Learnt from: davemarco
Repo: y-scope/clp PR: 700
File: components/core/src/clp/streaming_archive/ArchiveMetadata.hpp:153-155
Timestamp: 2025-01-30T19:26:33.869Z
Learning: When working with constexpr strings (string literals with static storage duration), std::string_view is the preferred choice for member variables as it's more efficient and safe, avoiding unnecessary memory allocations.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2024-11-26T19:12:43.982Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2024-11-01T03:26:26.386Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
📚 Learning: 2025-02-10T22:36:08.496Z
Learnt from: davemarco
Repo: y-scope/clp PR: 646
File: components/core/src/clp/streaming_archive/single_file_archive/writer.cpp:161-171
Timestamp: 2025-02-10T22:36:08.496Z
Learning: When using std::array, size checking between arrays is enforced at compile-time through the type system, making additional static_assert size checks redundant.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/TimestampParser.cpp
🧬 Code graph analysis (1)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (2)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (2)
parse_timestamp(248-756)parse_timestamp(248-252)components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1)
parse_timestamp(85-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: package-image
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: lint-check (macos-15)
LinZhihao-723
left a comment
There was a problem hiding this comment.
The implementation lgtm in general. Had a few style-wise comments.
Will review test cases shortly.
| case 'L': | ||
| case 'C': | ||
| case 'N': | ||
| case 'p': { |
There was a problem hiding this comment.
Can we add an inline comment to explain which case it is like we did in the previous PR?
| // Do not allow trailing unmatched content. | ||
| if (pattern_idx != pattern.size() || timestamp_idx != timestamp.size()) { | ||
| return ErrorCode{ErrorCodeEnum::IncompatibleTimestampPattern}; | ||
| } |
There was a problem hiding this comment.
Shall we move this before any format changes?
| number_type_representation = true; | ||
| break; | ||
| } | ||
| case 'L': { |
There was a problem hiding this comment.
Just to confirm: For L, C and N, we will allow negative numbers to indicate timestamps before Unix epoch, right?
There was a problem hiding this comment.
Yes, that's correct.
| std::string_view str, | ||
| size_t max_num_digits | ||
| ) -> ystdlib::error_handling::Result<std::pair<int, size_t>> { | ||
| constexpr int cTen{10}; |
There was a problem hiding this comment.
Can we return int64_t since it will be converted to nanosec through multiplication eventually? Making two new helpers return the same type would also lead to less confusion...
| constexpr std::array cPowersOfTen | ||
| = {1, 10, 100, 1000, 10'000, 100'000, 1'000'000, 10'000'000, 100'000'000, 1'000'000'000}; |
There was a problem hiding this comment.
Shall we explicitly mark the type as int64_t?
| parsed_epoch_nanoseconds = number | ||
| * cPowersOfTen.at( | ||
| cNumNanosecondPrecisionSubsecondDigits | ||
| - cNumMicrosecondPrecisionSubsecondDigits | ||
| ); |
There was a problem hiding this comment.
btw this factor can be made as a constexpr:
| parsed_epoch_nanoseconds = number | |
| * cPowersOfTen.at( | |
| cNumNanosecondPrecisionSubsecondDigits | |
| - cNumMicrosecondPrecisionSubsecondDigits | |
| ); | |
| constexpr auto cFactor{cPowersOfTen | |
| [cNumNanosecondPrecisionSubsecondDigits | |
| - cNumMicrosecondPrecisionSubsecondDigits]}; | |
| parsed_epoch_nanoseconds = number * cFactor; |
LinZhihao-723
left a comment
There was a problem hiding this comment.
The overall test cases also lgtm.
Will approve once all comments are resolved.
| assert_specifier_accepts_valid_content('N', negative_epoch_timestamps); | ||
| } | ||
|
|
||
| std::vector<ExpectedParsingResult> const expected_parsing_results{ |
There was a problem hiding this comment.
Can we move this vector inside the section? Or do u plan to reuse it in other test sections?
There was a problem hiding this comment.
I was undecided -- depends on whether we also want to test round-tripping and accuracy for deducing timestamp patterns based on a list of predefined patterns in the section below. I'm now leading towards testing those all at the same time though, because then we can also test that parsing using the list of predefined patterns (which may contain CAT sequences) yields the same result as parsing using the specified pattern (which does not contain CAT sequences).
I'll move it inside.
There was a problem hiding this comment.
kk sg. We can move it out again in later PRs if needed.
| auto const timestamp{fmt::format("{}a", day_in_week_timestamp)}; | ||
| auto const result{parse_timestamp(timestamp, "\\d \\aa", generated_pattern)}; | ||
| auto const result{parse_timestamp(timestamp, cPattern, generated_pattern)}; | ||
| REQUIRE(false == result.has_error()); |
There was a problem hiding this comment.
| REQUIRE(false == result.has_error()); | |
| REQUIRE_FALSE(result.has_error()); |
|
|
||
| /** | ||
| * @param num_digits | ||
| * @return All of the padded numbers with `num_digits` digits having a single unique digit. |
There was a problem hiding this comment.
| * @return All of the padded numbers with `num_digits` digits having a single unique digit. | |
| * @return A vector containing all digit numbers (1 to 9) padded to the specified `num_digits` | |
| * length with the digit. |
Nit: u can apply it if u think this description is better
There was a problem hiding this comment.
Probably won't use this exact phrasing, but I'll incorporate the "A vector of" part.
Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (2)
171-200: CRITICAL: Guard against overflow in convert_positive_bounded_variable_length_string_prefix_to_number.This function performs
converted_value *= cTenandconverted_value += ...without checking for overflow. Large inputs can overflow int64_t before validation, causing undefined behaviour. This issue was flagged in a previous review with detailed fix suggestions.Please accumulate the magnitude in an unsigned type with explicit
std::numeric_limitsguards and convert back to int64_t only after verifying the value is representable.
202-244: CRITICAL: Guard against overflow in convert_variable_length_string_prefix_to_number.This function performs
converted_value *= cTenandconverted_value += ...without checking for overflow. Inputs such as"99999999999999999999"or"-9223372036854775808"can overflow before rejection, yielding undefined behaviour. This issue was flagged in a previous review with detailed fix suggestions.Please use unsigned arithmetic with explicit magnitude checks for both positive and negative values, handling the special case of
INT64_MINseparately.components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (1)
182-184: Consider using REQUIRE_FALSE for consistency.Similar to earlier test assertions, consider using
REQUIRE_FALSE(result.has_error())instead of the double-negative for better readability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp(8 hunks)components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp(6 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
🧠 Learnings (7)
📚 Learning: 2024-10-13T09:29:56.553Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1283-1286
Timestamp: 2024-10-13T09:29:56.553Z
Learning: `clp::ffi::SchemaTree::Node::id_t` is of type `uint32_t`, so adding `1` to `INT32_MAX` is safe and does not cause integer overflow.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
Repo: y-scope/clp PR: 1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2024-11-26T19:12:43.982Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
📚 Learning: 2025-01-30T19:26:33.869Z
Learnt from: davemarco
Repo: y-scope/clp PR: 700
File: components/core/src/clp/streaming_archive/ArchiveMetadata.hpp:153-155
Timestamp: 2025-01-30T19:26:33.869Z
Learning: When working with constexpr strings (string literals with static storage duration), std::string_view is the preferred choice for member variables as it's more efficient and safe, avoiding unnecessary memory allocations.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
📚 Learning: 2025-02-10T22:36:08.496Z
Learnt from: davemarco
Repo: y-scope/clp PR: 646
File: components/core/src/clp/streaming_archive/single_file_archive/writer.cpp:161-171
Timestamp: 2025-02-10T22:36:08.496Z
Learning: When using std::array, size checking between arrays is enforced at compile-time through the type system, making additional static_assert size checks redundant.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cppcomponents/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
📚 Learning: 2024-10-14T03:45:21.323Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 558
File: components/core/tests/test-ffi_KeyValuePairLogEvent.cpp:331-335
Timestamp: 2024-10-14T03:45:21.323Z
Learning: When reviewing code, ensure that suggestions about unnecessary use of `std::move` with `std::shared_ptr` are only made if `std::move` is actually used in the code.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
📚 Learning: 2024-11-01T03:26:26.386Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
Applied to files:
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp
🧬 Code graph analysis (1)
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (2)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (2)
parse_timestamp(248-753)parse_timestamp(248-252)components/core/src/clp_s/timestamp_parser/TimestampParser.hpp (1)
parse_timestamp(85-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: build (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-15, true)
🔇 Additional comments (17)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (12)
33-46: LGTM - Time component constants are well-defined.The constants properly define bounds for hour, minute, second, and subsecond components. The inclusion of 60 as the maximum second value correctly accommodates leap seconds.
89-92: LGTM - Constants for AM/PM and powers of ten are properly defined.The arrays are well-structured and use appropriate types (string_view for AM/PM, int64_t for powers of ten).
259-271: LGTM - Parsing state variables are appropriately declared and initialized.The new variables properly track time components, AM/PM indicator, epoch representation, and clock format, supporting the extended parsing capabilities.
415-423: LGTM - AM/PM parsing is correctly implemented.The case properly matches and stores the AM/PM indicator for use with 12-hour clock formats.
424-509: LGTM - Hour parsing for 24-hour and 12-hour formats is correctly implemented.All four hour format specifiers (H, k, I, l) properly validate field length, parse with appropriate padding, enforce correct ranges, and set the necessary flags for clock format tracking.
510-547: LGTM - Minute and second parsing is correctly implemented.Both format specifiers properly validate field length and ranges, with the second parser correctly accommodating leap seconds (maximum value 60).
548-620: LGTM - Subsecond parsing for all precision levels is correctly implemented.The fixed-length cases (3, 6, 9-digit) properly scale values to nanoseconds using the powers-of-ten array. The variable-length case 'T' correctly rejects trailing zeros and applies appropriate scaling based on the number of digits consumed.
621-673: LGTM - Epoch format parsing is correctly implemented.All four epoch format specifiers (E, L, C, N) properly parse variable-length signed integers, apply correct scaling factors to convert to nanoseconds, and support negative values for pre-epoch timestamps.
706-710: LGTM - Validation ensures 12-hour clock and AM/PM consistency.This check correctly enforces that 12-hour clock formats must have an AM/PM indicator and 24-hour formats must not, preventing malformed timestamp patterns.
712-720: LGTM - Epoch timestamp handling with subseconds is correctly implemented.The logic properly combines epoch values with subsecond components, correctly subtracting subseconds for negative epochs and adding them for positive epochs.
722-725: LGTM - 12-hour to 24-hour clock conversion is correct.The formula properly converts 12-hour format with AM/PM to 24-hour format, correctly handling edge cases like 12 AM (midnight) and 12 PM (noon).
732-735: LGTM - Time point construction properly combines date and time components.The implementation correctly constructs the final time point by adding all parsed time components (hours, minutes, seconds, subsecond nanoseconds) to the date, using appropriate chrono duration types.
components/core/src/clp_s/timestamp_parser/test/test_TimestampParser.cpp (5)
18-31: LGTM - Test data structure is appropriately defined.The ExpectedParsingResult struct provides a clear structure for organizing test cases with timestamp, pattern, and expected epoch value.
53-70: LGTM - Test helper function declarations are clear.The function declarations properly use [[nodiscard]] and have clear documentation explaining their purpose and return values.
98-114: LGTM - Test helper implementations correctly generate test data.Both functions efficiently generate the expected test data patterns: triangles of repeated digits for variable-length testing and padded single-digit sequences for fixed-length testing.
186-250: LGTM - Comprehensive test coverage for new format specifiers.The tests thoroughly exercise all new format specifiers including:
- 24-hour and 12-hour clock formats with various padding
- AM/PM indicators with both zero-padded and space-padded hours
- Minutes and seconds (including leap seconds)
- All subsecond precisions and variable-length fractional seconds
- All epoch formats with both positive and negative values
Consider using
REQUIRE_FALSEat lines 207-208 for consistency with test idioms, as mentioned in past review comments.
252-311: LGTM - Excellent accuracy validation with comprehensive real-world test cases.The test section provides thorough validation of parsing accuracy across diverse timestamp formats including:
- Multiple date and time separators
- Various subsecond precisions (milliseconds, microseconds, nanoseconds)
- Month name formats
- Day of week validation
- Epoch formats with subseconds
- Variable-length fractional seconds
Each test case properly validates both the parsed epoch value and the returned pattern.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (1)
202-244: Address the overflow concern from the previous review.This function has the same overflow vulnerability as noted in the past review comment. Please apply the detailed mitigation suggested there, using unsigned accumulation with
std::numeric_limitsguards for both positive and negative values.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp(8 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit configuration file
- Prefer
false == <expression>rather than!<expression>.
Files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
🧠 Learnings (5)
📚 Learning: 2024-10-13T09:29:56.553Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 557
File: components/core/tests/test-ir_encoding_methods.cpp:1283-1286
Timestamp: 2024-10-13T09:29:56.553Z
Learning: `clp::ffi::SchemaTree::Node::id_t` is of type `uint32_t`, so adding `1` to `INT32_MAX` is safe and does not cause integer overflow.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2025-09-11T13:15:39.930Z
Learnt from: anlowee
Repo: y-scope/clp PR: 1176
File: components/core/src/clp_s/ColumnWriter.cpp:67-72
Timestamp: 2025-09-11T13:15:39.930Z
Learning: In components/core/src/clp_s/ColumnWriter.cpp, when using sizeof with types like variable_dictionary_id_t, always use the fully qualified namespace (clp::variable_dictionary_id_t) to match the variable declaration and maintain consistency with other column writers in the codebase.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2024-11-26T19:12:43.982Z
Learnt from: LinZhihao-723
Repo: y-scope/clp PR: 486
File: components/core/tests/test-error_handling.cpp:37-38
Timestamp: 2024-11-26T19:12:43.982Z
Learning: In the CLP project, C++20 is used, allowing the utilization of C++20 features such as class template argument deduction (CTAD) with `std::array`.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2025-01-30T19:26:33.869Z
Learnt from: davemarco
Repo: y-scope/clp PR: 700
File: components/core/src/clp/streaming_archive/ArchiveMetadata.hpp:153-155
Timestamp: 2025-01-30T19:26:33.869Z
Learning: When working with constexpr strings (string literals with static storage duration), std::string_view is the preferred choice for member variables as it's more efficient and safe, avoiding unnecessary memory allocations.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
📚 Learning: 2025-02-10T22:36:08.496Z
Learnt from: davemarco
Repo: y-scope/clp PR: 646
File: components/core/src/clp/streaming_archive/single_file_archive/writer.cpp:161-171
Timestamp: 2025-02-10T22:36:08.496Z
Learning: When using std::array, size checking between arrays is enforced at compile-time through the type system, making additional static_assert size checks redundant.
Applied to files:
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: manylinux_2_28-x86_64-dynamic-linked-bins
- GitHub Check: musllinux_1_2-x86_64-static-linked-bins
- GitHub Check: manylinux_2_28-x86_64-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: musllinux_1_2-x86_64-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: package-image
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: antlr-code-committed (macos-15)
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (6)
components/core/src/clp_s/timestamp_parser/TimestampParser.cpp (6)
33-92: LGTM!The time component bounds and subsecond precision constants are well-defined. The inclusion of 60 as the maximum second value correctly accommodates leap seconds, and the powers of ten array provides the necessary scaling factors for subsecond conversions.
117-141: LGTM!The function declarations for variable-length numeric prefix parsing are well-documented with clear parameter descriptions and error code specifications.
415-547: LGTM!The time component parsing (AM/PM, hours, minutes, seconds) follows a consistent and robust pattern: bounds checking before substring extraction, range validation after parsing, and proper index advancement. The 12-hour clock flag is correctly set for cases 'I' and 'l'.
604-620: LGTM with a note on safety.The variable-length fractional seconds parsing correctly uses the bounded helper function and validates that the last digit is not '0' to enforce the no-trailing-zeroes requirement. The cast to
intat line 618 should be safe given the maximum value constraints, but as noted in past comments, usingint64_tforparsed_subsecond_nanosecondsthroughout would eliminate the cast and improve consistency.Based on past review comments.
706-725: LGTM!The validation logic correctly enforces that 12-hour clock patterns must include AM/PM and vice versa. The epoch time calculation properly handles negative timestamps by subtracting the subsecond component, and the 12-hour to 24-hour conversion using modulo arithmetic is accurate.
727-752: LGTM!The final time-point construction correctly combines the parsed date and time components, validates the day of week when present, and converts to epoch nanoseconds. The use of the
datelibrary's type-safe operations ensures proper calendar arithmetic.
LinZhihao-723
left a comment
There was a problem hiding this comment.
For the PR title, how about the following to be more explicit about what's been implemented?
feat(clp-s::timestamp_parser): Add implementation for day time parsing.
How about |
sgtm |
|
Accidentally hit close while checking on the CI status in the mobile client. |
…epoch timestamps without timezones. (y-scope#1564) Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Description
This PR follows up on #1385 by implementing the remainder of the initial set of format specifier excepting timezones, and CAT sequences.
Since the set of supported format specifiers is nearly complete, we also add tests for parsing all of the unique timestamps from
tests/test-TimestampPattern.cpp, with some additional testcases focused on numeric timestamps.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Bug Fixes
Tests