feat(kv-ir): Add utilities for kv-pair IR stream search.#856
Conversation
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com>
WalkthroughThis change introduces new utility functions and comprehensive unit tests for evaluating filter expressions against various literal types within the IR stream search module. It adds implementation and declarations for these utilities, expands error code handling with new error cases, and integrates the relevant source and test files into the build system. The new tests validate filter evaluation for integers, floats, booleans, variable-length strings, and encoded text ASTs, ensuring correctness across supported filter operations and literal types. Changes
Sequence Diagram(s)sequenceDiagram
participant TestCase
participant FilterExpr
participant Utils
participant Value
participant ErrorCode
TestCase->>Utils: evaluate_filter_against_literal_type_value_pair(filter, literal_type, value, case_sensitive)
Utils->>Value: Extract operand(s) based on literal_type
Utils->>Utils: Dispatch to type-specific evaluation (int, float, bool, string, encoded text)
Utils->>Utils: Perform filter operation comparison
alt Error encountered
Utils->>ErrorCode: Return error code (e.g., DecodingFailure, Unexpected, Unsupported)
else Success
Utils-->>TestCase: Return boolean result
end
Suggested reviewers
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
components/core/CMakeLists.txt (1)
610-610: Nit: keep test-file list alphabetically sortedFor long source lists a consistent ordering eases grepping and reduces merge conflicts. Consider inserting
test-ffi_ir_stream_search_utils.cppin lexicographical position.components/core/src/clp/ffi/ir_stream/search/utils.cpp (1)
281-297:Objnodes mapped toNullTmay break downstream logicIn
schema_tree_node_type_to_literal_types, an object node currently returnsLiteralType::NullT.
Down-stream code could interpret this as “the only legal literal isnull”, blocking legitimate object literals once they are supported.If object support is still TODO, returning
UnknownT(or at least addingObjectTbehind a feature flag) would avoid silently mis-categorising objects as null.components/core/src/clp/ffi/ir_stream/search/utils.hpp (1)
37-43: Visibility note: function returns depend on schema-node semantics
evaluate_filter_against_literal_type_value_pairoperates on a literal type determined elsewhere.
For callers to use this API safely, please document that:
- The caller must supply a literal type consistent with
valueand the underlying schema node.- The function does not attempt any implicit conversions (e.g., string-to-number).
This clarification will prevent accidental misuse and future bugs.
components/core/tests/test-ffi_ir_stream_search_utils.cpp (1)
146-161: Avoid re-encoding the same string for every test – extract and cache the AST once
get_encoded_text_astperforms a full CLP string encoding every time a value is constructed (lines 314-316).
Given that the reference strings are static test fixtures, the repeated work is unnecessary and slows the test suite.Consider caching the encoded ASTs once per literal type and re-using them:
- std::vector<ValueToMatchedFilterOpsPair> const value_to_matched_filter_ops_pairs{ - {Value{get_encoded_text_ast<encoded_variable_t>(matched)}, {FilterOperation::EQ}}, - {Value{get_encoded_text_ast<encoded_variable_t>(unmatched)}, {FilterOperation::NEQ}}, - }; + static auto const matched_ast = get_encoded_text_ast<encoded_variable_t>(matched); + static auto const unmatched_ast = get_encoded_text_ast<encoded_variable_t>(unmatched); + + std::vector<ValueToMatchedFilterOpsPair> const value_to_matched_filter_ops_pairs{ + {Value{matched_ast}, {FilterOperation::EQ}}, + {Value{unmatched_ast}, {FilterOperation::NEQ}}, + };This keeps the tests deterministic while shaving noticeable time off larger test suites.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
components/core/CMakeLists.txt(2 hunks)components/core/src/clp/ffi/ir_stream/search/utils.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/utils.hpp(1 hunks)components/core/tests/test-ffi_ir_stream_search_utils.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/utils.hppcomponents/core/src/clp/ffi/ir_stream/search/utils.cppcomponents/core/tests/test-ffi_ir_stream_search_utils.cpp
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: lint-check (macos-latest)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: lint-check (ubuntu-latest)
- GitHub Check: build (macos-latest)
🔇 Additional comments (1)
components/core/CMakeLists.txt (1)
436-438:Details
❌ Incorrect review comment
Confirm compilation/visibility of the new IR-stream search utilities
utils.cpp/utils.hppare only appended toSOURCE_FILES_unitTest, which means they will be compiled into the test executable but not into any production (installable) target.
If other components (e.g., future FFI bindings) are expected to link against these helpers, please move the two files into the appropriate library target (or create a new one) and link that target both tounitTestand to the consuming binaries.
🏁 Script executed:
#!/bin/bash set -e # Show occurrences of SOURCE_FILES_unitTest in the CMakeLists to verify where utils.cpp/hpp are added rg -n "SOURCE_FILES_unitTest" -C5 components/core/CMakeLists.txtLength of output: 816
🏁 Script executed:
#!/bin/bash set -e # Locate where the IR-stream files are referenced rg -n "ffi/ir_stream" -C3 components/core/CMakeLists.txt # Show context around lines 430-460 to inspect where these files are added sed -n '430,460p' components/core/CMakeLists.txtLength of output: 2847
Ignore suggestion: IR-stream utilities are already in the production library
The files
src/clp/ffi/ir_stream/search/utils.cpp/.hppandsrc/clp/ffi/ir_stream/utils.cpp/.hppappear in the main FFI source list in components/core/CMakeLists.txt (lines 417–442), not just underSOURCE_FILES_unitTest. They are compiled into the production library and available to consumers. No further changes needed.Likely an incorrect or invalid review comment.
gibber9809
left a comment
There was a problem hiding this comment.
Still need to read through tests before finishing this review, but so far don't have any real concerns. Just left one small comment in utils.cpp.
| case LiteralType::ArrayT: | ||
| case LiteralType::NullT: | ||
| case LiteralType::EpochDateT: | ||
| case LiteralType::UnknownT: | ||
| default: | ||
| return false; |
There was a problem hiding this comment.
Do you think its worth adding a comment to differentiate these cases?
E.g. we'll implement search for LiteralType::ArrayT at some point (maybe in this function, maybe somewhere else), but we won't for the other three cases for various reasons.
NullT because this branch should be unreachable by construction (should always be an EXISTS/NEXISTS filter), EpochDateT because schema_tree_node_type_value_pair_to_literal_type only returns (and only has to return) VarStringT and ClpStringT for string types, and UnknownT because it should again be unreachable because column resolution should have failed.
There was a problem hiding this comment.
How about let's merge #863 first, and we can then change this function to return a result instead. For these currently not supported types, we return ErrorCodeEnum::NotImplemented
There was a problem hiding this comment.
Change evaluate_filter_against_literal_type_value_pair to return a result instead. The details are added to the docstring.
- For internal errors, we return
LiteralTypeUnexpected. - For currently unsupported types, we return
LiteralTypeUnsupported.- I understand the concern for returning an error for array type; I think we can add a special handling to ignore this error on the caller end.
gibber9809
left a comment
There was a problem hiding this comment.
A few suggestions to make the docstrings in the tests more clear.
I'd consider expanding a bit more on the check_filter_evaluation_against_X docstrings as well to explain a bit about how they work, but it's not completely necessary and I don't have a great way of writing it up so I'm not going to block approval based on that.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/core/src/clp/ffi/ir_stream/search/utils.cpp (1)
342-348: Implementation aligns with project's EXISTS/NEXISTS semantics.As noted in the retrieved learnings, EXISTS always returns true and NEXISTS always returns false in this codebase. This implementation correctly follows that convention rather than checking value.has_value().
🧹 Nitpick comments (1)
components/core/src/clp/ffi/ir_stream/search/utils.cpp (1)
374-379: Consider adding a comment about literal type support.Based on the past review comments, it would be helpful to add a comment explaining which literal types are planned for future implementation versus those that are intentionally unsupported.
case LiteralType::EpochDateT: case LiteralType::ArrayT: + // ArrayT will be implemented in a future update, while EpochDateT is handled differently + // as schema_tree_node_type_value_pair_to_literal_type returns specific string types return ErrorCode{ErrorCodeEnum::LiteralTypeUnsupported}; case LiteralType::NullT: case LiteralType::UnknownT: default: + // NullT should be unreachable (EXISTS/NEXISTS filters only) + // UnknownT indicates column resolution failure return ErrorCode{ErrorCodeEnum::LiteralTypeUnexpected};
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
components/core/CMakeLists.txt(2 hunks)components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/ErrorCode.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/utils.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/utils.hpp(1 hunks)components/core/tests/test-ffi_ir_stream_search_utils.cpp(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- components/core/src/clp/ffi/ir_stream/search/ErrorCode.hpp
🚧 Files skipped from review as they are similar to previous changes (3)
- components/core/CMakeLists.txt
- components/core/src/clp/ffi/ir_stream/search/utils.hpp
- components/core/tests/test-ffi_ir_stream_search_utils.cpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/ErrorCode.cppcomponents/core/src/clp/ffi/ir_stream/search/utils.cpp
🧠 Learnings (1)
components/core/src/clp/ffi/ir_stream/search/utils.cpp (2)
Learnt from: LinZhihao-723
PR: y-scope/clp#856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:340-347
Timestamp: 2025-04-26T02:19:33.275Z
Learning: In the CLP codebase, the FilterOperation::EXISTS and FilterOperation::NEXISTS operations have specific semantics that differ from simply checking if a value exists. EXISTS always returns true and NEXISTS always returns false in the evaluate_filter_against_literal_type_value_pair function. The optional::has_value() is used specifically to differentiate empty objects from concrete values, not for existence checks in filter operations.
Learnt from: LinZhihao-723
PR: y-scope/clp#856
File: components/core/src/clp/ffi/ir_stream/search/utils.cpp:258-266
Timestamp: 2025-04-26T02:21:22.021Z
Learning: In the clp::ffi::ir_stream::search namespace, the design principle is that callers are responsible for type checking, not the called functions. If control flow reaches a function, input types should already be validated by the caller.
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: lint-check (ubuntu-latest)
- GitHub Check: lint-check (macos-latest)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-14, false)
🔇 Additional comments (5)
components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp (1)
20-23: Clear error messages for the new error codes.The messages for the two new error codes are clear and descriptive, appropriately explaining when these conditions would occur during IR stream search filtering.
components/core/src/clp/ffi/ir_stream/search/utils.cpp (4)
37-38: Good use of C++ concepts to constrain template parameters.Using
std::same_asconcept ensures the template is only instantiated for the intended numerical types, preventing accidental misuse with incompatible types.
170-175: Adheres to project coding guideline for negation.Using
false == wildcard_match_unsafe(...)rather than negating the result follows the project's coding guideline that prefersfalse == <expression>over!<expression>.
187-188: Consistent use of false comparison style.The comparison
false == operand->as_int(...)follows the project's coding guideline that prefersfalse == <expression>over!<expression>.
291-292: Correctly includes multiple string literal types.Properly handles the mapping from the Str schema tree node type to multiple supported literal types using bitwise OR, which allows for flexible matching of string representations.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (5)
components/core/tests/test-ffi_ir_stream_search_utils.cpp (5)
109-109: Fix the typo in the function comment.The function comment contains a grammatical error - it should be "Checks filter evaluation against variable strings" instead of "Checks filter evaluation against and variable strings".
- * Checks filter evaluation against and variable strings. + * Checks filter evaluation against variable strings.
120-120: Fix the typo in the function comment.Similar to the previous comment, the function comment contains a grammatical error - it should be "Checks filter evaluation against encoded text ASTs" instead of "Checks filter evaluation against and encoded text ASTs".
- * Checks filter evaluation against and encoded text ASTs (Clp strings). + * Checks filter evaluation against encoded text ASTs (Clp strings).
289-291: Consider optimizing string concatenation.Creating two temporary strings and then concatenating them could be optimized by creating a single string with the combined content.
- {Value{std::string{cRefTestString}}, {FilterOperation::EQ}}, - {Value{std::string{cRefTestString} + std::string{cRefTestString}}, - {FilterOperation::EQ}}, + {Value{std::string{cRefTestString}}, {FilterOperation::EQ}}, + {Value{std::string{cRefTestString.data(), cRefTestString.size() * 2}}, + {FilterOperation::EQ}},
310-321: Consider reducing code duplication.This pattern of checking for mismatched literal types and returning a predefined result is duplicated across multiple check functions. Consider extracting it to a helper function.
+ /** + * Helper function to check if the literal type matches and return false if not. + */ + [[nodiscard]] auto check_literal_type_mismatch( + FilterExpr const* filter_to_test, + LiteralType expected_type, + LiteralType actual_type + ) -> bool { + if (expected_type != actual_type) { + std::optional<Value> const empty_value_to_evaluate; + auto const actual_match_result{evaluate_filter_against_literal_type_value_pair( + filter_to_test, + actual_type, + empty_value_to_evaluate, + false + )}; + REQUIRE_FALSE(actual_match_result.has_error()); + return false == actual_match_result.value(); + } + return true; + }Then use it in each check function:
template <typename encoded_variable_t> requires std::is_same_v<encoded_variable_t, eight_byte_encoded_variable_t> || std::is_same_v<encoded_variable_t, four_byte_encoded_variable_t> auto check_filter_evaluation_against_encoded_text_asts( FilterExpr const* filter_to_test, LiteralType filter_operand_literal_type ) -> bool { - if (LiteralType::ClpStringT != filter_operand_literal_type) { - std::optional<Value> const empty_value_to_evaluate; - auto const actual_match_result{evaluate_filter_against_literal_type_value_pair( - filter_to_test, - filter_operand_literal_type, - empty_value_to_evaluate, - false - )}; - REQUIRE_FALSE(actual_match_result.has_error()); - return false - == actual_match_result.value(); + if (!check_literal_type_mismatch(filter_to_test, LiteralType::ClpStringT, filter_operand_literal_type)) { + return false; }
367-419: Consider simplifying validation logic pattern.The repeated pattern of checking return values against false followed by a WARN message and return statement could be simplified using a helper function or macro to reduce repetition and improve readability.
+ /** + * Helper macro to check function result and issue warning on failure. + */ + #define CHECK_EVAL_OR_WARN(evalFunc, failMsg) \ + if (false == (evalFunc)) { \ + WARN(failMsg); \ + return false; \ + } auto check_filter_evaluation(FilterExpr const* filter_to_test, LiteralType filter_operand_literal_type) -> bool { // ... existing code ... - if (false - == check_filter_evaluation_against_integral_values<value_int_t>( - filter_to_test, - filter_operand_literal_type - )) - { - WARN("Failed to evaluate filter against integers."); - return false; - } + CHECK_EVAL_OR_WARN( + check_filter_evaluation_against_integral_values<value_int_t>( + filter_to_test, + filter_operand_literal_type + ), + "Failed to evaluate filter against integers." + ); // ... apply to other similar blocks ...
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/core/tests/test-ffi_ir_stream_search_utils.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/tests/test-ffi_ir_stream_search_utils.cpp
⏰ Context from checks skipped due to timeout of 90000ms (7)
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (2)
components/core/tests/test-ffi_ir_stream_search_utils.cpp (2)
261-261: The code follows the required coding guideline.The expression
false == filter_operandfollows the project's coding guideline to preferfalse == <expression>rather than!<expression>. This is consistent across the codebase.
425-505: The test coverage is comprehensive.The test case thoroughly covers all filter operation types (EXISTS, NEXISTS, EQ, NEQ, LT, LTE, GT, GTE) across all supported literal types (Integer, Float, Boolean, VarString, ClpString). The approach of using GENERATE_COPY to create test cases for each filter expression and type combination ensures good coverage while keeping the test code concise.
gibber9809
left a comment
There was a problem hiding this comment.
LGTM. PR title is fine as well.
kirkrodrigues
left a comment
There was a problem hiding this comment.
Some minor suggestions.
I think I'm missing some context about the TODOs for Literal::ObjectT. Could you explain briefly?
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
It's in this issue: #588 |
Generally both here and in clp-s search for empty objects E.g. for the object This github issue tracks the problem. |
kirkrodrigues
left a comment
There was a problem hiding this comment.
Reviewed basics. For the PR title, instead of "utility code", how about "utilities"?
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp (1)
1-1: Typo in file name reduces discoverabilityThe file is named
test_utilis.cpp; consider renaming totest_utils.cpp(ortest_utilities.cpp) to match the common spelling of “utilities”. This improves searchability and consistency with the rest of the codebase.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
components/core/CMakeLists.txt(1 hunks)components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/utils.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/utils.hpp(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp
- components/core/CMakeLists.txt
- components/core/src/clp/ffi/ir_stream/search/utils.hpp
- components/core/src/clp/ffi/ir_stream/search/utils.cpp
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp
🧬 Code Graph Analysis (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp (1)
components/core/src/clp_s/search/ast/ColumnDescriptor.hpp (1)
static_cast(241-241)
🪛 Cppcheck (2.10-2)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp
[error] 419-419: syntax error
(syntaxError)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: lint-check (ubuntu-latest)
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp (4)
429-432: Prefer direct boolean assertions for pointer checksThe current code uses
REQUIRE_FALSE((nullptr == ref_ptr))pattern for multiple pointer validity checks. While this works, consider using a more direct approach:-REQUIRE_FALSE((nullptr == ref_str_literal)); -REQUIRE_FALSE((nullptr == ref_bool_literal)); -REQUIRE_FALSE((nullptr == ref_int_literal)); -REQUIRE_FALSE((nullptr == ref_float_literal)); +REQUIRE(nullptr != ref_str_literal); +REQUIRE(nullptr != ref_bool_literal); +REQUIRE(nullptr != ref_int_literal); +REQUIRE(nullptr != ref_float_literal);This makes the assertion's intent clearer and eliminates unnecessary parentheses.
437-438: Consistent style with other pointer validity checksApply the same suggestion as with the other pointer checks:
-REQUIRE_FALSE((nullptr == column_descriptor)); +REQUIRE(nullptr != column_descriptor);
496-497: Consistent style with other pointer validity checksApply the same suggestion as with the other pointer checks:
-REQUIRE_FALSE((nullptr == filter_to_test)); +REQUIRE(nullptr != filter_to_test);
152-159: Consider adding additional bounds validationThe code uses
dict_var_boundsindices directly for string slicing. While there's a check that ensures there's an even number of bounds, consider adding a range check to ensure the bounds are within the text's limits before accessing them:REQUIRE(((dict_var_bounds.size() % 2) == 0)); std::vector<std::string> dict_vars; for (size_t i{0}; i < dict_var_bounds.size(); i += 2) { auto const begin_pos{static_cast<size_t>(dict_var_bounds[i])}; auto const end_pos{static_cast<size_t>(dict_var_bounds[i + 1])}; + REQUIRE(begin_pos <= end_pos); + REQUIRE(end_pos <= text.size()); dict_vars.emplace_back(text.cbegin() + begin_pos, text.cbegin() + end_pos); }This would catch potential issues with invalid boundaries.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp
🧬 Code Graph Analysis (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp (1)
components/core/src/clp_s/search/ast/ColumnDescriptor.hpp (1)
static_cast(241-241)
🪛 Cppcheck (2.10-2)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp
[error] 419-419: syntax error
(syntaxError)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: lint-check (ubuntu-latest)
- GitHub Check: lint-check (macos-latest)
🔇 Additional comments (4)
components/core/src/clp/ffi/ir_stream/search/test/test_utilis.cpp (4)
493-494: Code looks good, variable name fixedThe structured binding declaration correctly uses assignment syntax with
=and the previously identified typo in the variable name has been fixed correctly.
419-498: Well-structured test case with good coverageThe test case provides comprehensive coverage of filter evaluation against different literal types and operations. The test data setup is clear and the systematic generation of test cases ensures all combinations are tested. The code is well-structured and follows good testing practices.
🧰 Tools
🪛 Cppcheck (2.10-2)
[error] 419-419: syntax error
(syntaxError)
1-29: Comprehensive header inclusionAll necessary headers are included with a clean organization. The code follows good practices by including standard library headers first, followed by third-party and project-specific headers.
361-413: Good error reporting with descriptive warningsThe check_filter_evaluation function provides clear warning messages when filter evaluation fails, making test failures easier to understand and debug.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp (2)
61-64: Introduce a reusable concept to avoid duplicatedrequiresclausesThe same
std::is_same_v<encoded_variable_t, …>constraint appears twice (lines 61-64 and 145-146). Extracting it into a named concept simplifies the declaration and keeps the intent DRY.+#include <concepts> // already present + +template<typename T> +concept EncodedVariable = + std::same_as<T, eight_byte_encoded_variable_t> + || std::same_as<T, four_byte_encoded_variable_t>; -template <typename encoded_variable_t> -requires std::is_same_v<encoded_variable_t, eight_byte_encoded_variable_t> - || std::is_same_v<encoded_variable_t, four_byte_encoded_variable_t> +template <EncodedVariable encoded_variable_t>Apply the same
EncodedVariableconcept where the second duplicate constraint occurs.
This removes boiler-plate and makes future maintenance easier.Also applies to: 145-146
283-289: Consider widening negative-test coverage for var-stringsAt the moment only a prefix substring (
"te") is evaluated against theNEQoperation. Adding a couple of additional non-matching scenarios (e.g., an empty string and a completely unrelated token) would strengthen the assertion that wildcard handling behaves as expected for all corner cases.This is strictly optional, but helps future contributors understand the intended semantics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
components/core/CMakeLists.txt(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- components/core/CMakeLists.txt
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp
🧬 Code Graph Analysis (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp (1)
components/core/src/clp_s/search/ast/ColumnDescriptor.hpp (1)
static_cast(241-241)
🪛 Cppcheck (2.10-2)
components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp
[error] 419-419: syntax error
(syntaxError)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: lint-check (macos-latest)
- GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (1)
components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp (1)
419-498: Great test coverage!The generated test matrix systematically walks through all filter operations and literal types, providing solid confidence in the new evaluation utilities. Nice use of
GENERATE_COPYto keep the code concise.No issues found here.
🧰 Tools
🪛 Cppcheck (2.10-2)
[error] 419-419: syntax error
(syntaxError)
Co-authored-by: Devin Gibson <gibber9809@users.noreply.github.com> Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Description
This is one of PR in series #840.
We will reuse the search AST in clp-s to implement kv-pair IR stream search. However, kv-pair IR stream has a different schema tree implementation than clp-s, meaning we will need some special handling for kv-pair IR values and schema tree nodes. This PR introduces the relevant utilities for:
Checklist
breaking change.
Validation performed
Summary by CodeRabbit