feat(kv-ir): Integrate QueryHandler into the IR stream Deserializer.#890
Conversation
|
Note Reviews pausedUse the following commands to manage reviews:
WalkthroughThis change introduces a comprehensive query evaluation mechanism for IR stream log events using key-value pairs. It adds a new query handler requirement abstraction, implements AST-based query evaluation with bitmask result semantics, and updates the deserializer to optionally use a query handler for filtering and projection. Error codes and test utilities are updated to support the new logic. Changes
Sequence Diagram(s)sequenceDiagram
participant Reader
participant Deserializer
participant QueryHandler (optional)
participant UserHandler
Reader->>Deserializer: Provide IR stream
Deserializer->>Reader: Read next IR unit
alt LogEvent IR unit
Deserializer->>QueryHandler: Evaluate log event (if QueryHandler is not Empty)
alt Query matches (True)
Deserializer->>UserHandler: Invoke handler with log event
else Query does not match (False/Pruned)
Deserializer-->>UserHandler: Skip handler invocation
end
else SchemaTreeNodeInsertion IR unit
Deserializer->>QueryHandler: update_partially_resolved_columns (if QueryHandler is not Empty)
Deserializer->>UserHandler: Invoke handler with schema node
else Other IR unit
Deserializer->>UserHandler: Invoke handler as usual
end
sequenceDiagram
participant QueryHandlerImpl
participant AST
participant LogEvent
QueryHandlerImpl->>AST: Start DFS traversal
loop For each AST node
alt FilterExpr
QueryHandlerImpl->>LogEvent: Evaluate filter against node-id-value pairs
QueryHandlerImpl-->>AST: Return evaluation result
else AndExpr/OrExpr
QueryHandlerImpl->>AST: Traverse child expressions (stack-based)
QueryHandlerImpl-->>AST: Combine results (bitmask logic)
end
end
QueryHandlerImpl-->>Caller: Return final AstEvaluationResult
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
@coderabbitai pause |
|
Tip For best results, initiate chat on the files or code changes. @LinZhihao-723 I noticed you're trying to use the "stop" command, but it's not recognized. If you want to pause the review process, please use Here are some other helpful commands:
Is there something specific you'd like me to help with regarding this PR? |
There was a problem hiding this comment.
@LinZhihao-723 I've reviewed the code in QueryHandlerReq.hpp and it looks good! The code is:
- Well-structured with clear namespace usage
- Properly documented with informative comments
- Using modern C++ features like concepts and type traits appropriately
- Following the coding guidelines (no instances of
!<expression>wherefalse == <expression>would be preferred)
The template metaprogramming approach with IsQueryHandler trait and QueryHandlerReq concept is a clean way to enforce type constraints. No issues to report!
|
@coderabbitai pause |
There was a problem hiding this comment.
No description provided.
There was a problem hiding this comment.
Actionable comments posted: 13
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
components/core/CMakeLists.txt(1 hunks)components/core/src/clp/ffi/ir_stream/Deserializer.hpp(8 hunks)components/core/src/clp/ffi/ir_stream/search/AstEvaluationResult.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp(2 hunks)components/core/src/clp/ffi/ir_stream/search/ErrorCode.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/QueryHandler.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp(5 hunks)components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp(7 hunks)components/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/test_QueryHandlerImpl.cpp(8 hunks)components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/utils.hpp(3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp/ffi/ir_stream/search/ErrorCode.cppcomponents/core/src/clp/ffi/ir_stream/search/ErrorCode.hppcomponents/core/src/clp/ffi/ir_stream/search/QueryHandler.hppcomponents/core/src/clp/ffi/ir_stream/search/test/test_utils.cppcomponents/core/src/clp/ffi/ir_stream/search/test/utils.hppcomponents/core/src/clp/ffi/ir_stream/search/AstEvaluationResult.hppcomponents/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hppcomponents/core/src/clp/ffi/ir_stream/Deserializer.hppcomponents/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hppcomponents/core/src/clp/ffi/ir_stream/search/test/test_QueryHandlerImpl.cppcomponents/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp
🪛 Cppcheck (2.10-2)
components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp
[style] 428-428: The function 'evaluate_kv_pair_log_event' is never used.
(unusedFunction)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: lint-check (macos-latest)
- GitHub Check: lint-check (ubuntu-latest)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-14, true)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, false)
🔇 Additional comments (14)
components/core/CMakeLists.txt (1)
464-464: Addition of QueryHandlerReq.hpp looks goodThe inclusion of the new header file aligns with the PR objectives to support query handling capabilities in the IR stream deserializer.
components/core/src/clp/ffi/ir_stream/search/ErrorCode.hpp (1)
14-15: New error codes for AST evaluation look goodThe addition of these three error codes properly supports the enhanced AST evaluation framework mentioned in the PR objectives.
Also applies to: 20-20
components/core/src/clp/ffi/ir_stream/search/ErrorCode.cpp (1)
35-36: Error message for ExpressionTypeUnexpected looks goodThe message is concise and clear.
components/core/src/clp/ffi/ir_stream/search/AstEvaluationResult.hpp (2)
11-18: Good design for bitmask operations with enum valuesThe change from enum class to plain enum with explicit bitmask values enables efficient bitwise operations without explicit casting. This aligns well with the PR objective to implement AST-based query evaluation with bitmask result semantics.
20-20: Helpful type alias for bitmask operationsThe introduction of
AstEvaluationResultBitmaskprovides a clear type for working with the bitmask operations throughout the codebase.components/core/src/clp/ffi/ir_stream/search/test/test_utils.cpp (1)
25-25: Refactoring of utility function looks good.You've relocated the
get_encoded_text_astfunction to a header file, which will make it available inline where needed. This is a good approach for templated utility functions that might be used across multiple test files.components/core/src/clp/ffi/ir_stream/search/QueryHandler.hpp (1)
92-102: API update improves interface cohesion.The change from
evaluate_node_id_value_pairstoevaluate_kv_pair_log_eventprovides a cleaner interface by accepting a complete log event instead of separate pairs. The method now forwards to the corresponding implementation inQueryHandlerImpl.components/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hpp (1)
1-44: Design looks good with appropriate type traits and concepts.This new file provides a clean abstraction for query handler requirements:
EmptyQueryHandlerserves as a no-op handler for when query evaluation isn't neededIsQueryHandlertrait properly identifies valid handler typesQueryHandlerReqconcept correctly constrains the allowed typesThe implementation follows modern C++ practices with type traits and concepts.
components/core/src/clp/ffi/ir_stream/search/test/utils.hpp (2)
96-126: Implementation of templated utility function looks good.The templated
get_encoded_text_astfunction provides a clean way to generate test data for encoded text ASTs with appropriate template constraints. The code correctly parses the input text, extracts dictionary variables based on bounds, and constructs theEncodedTextAstwith the resulting data.
116-116: 🧹 Nitpick (assertive)Consider adapting comparison style to match coding guidelines.
According to the coding guidelines, the preferred style is
false == <expression>rather than!<expression>. By extension, it might be preferable to use the following style for modulo checks:- REQUIRE(((dict_var_bounds.size() % 2) == 0)); + REQUIRE((0 == (dict_var_bounds.size() % 2)));Likely an incorrect or invalid review comment.
components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.cpp (4)
157-158: LGTM: Good fix to handle null query.The change correctly returns the query directly instead of a null return, which completes the early return path in the function.
345-376: Good implementation of filter evaluation against node-value pairs.The function correctly handles different evaluation scenarios including type matching, error handling, and conversion between different evaluation result enums.
452-484: Well-implemented AST iterator creation.The method correctly handles different expression types (AndExpr, OrExpr, FilterExpr) and properly initializes the iterator state.
574-590: Good handling of expression inversion in evaluation results.The code correctly applies inverted evaluation logic when an expression is marked as inverted, ensuring proper propagation of evaluation results.
| case ErrorCodeEnum::AstEvaluationInvariantViolation: | ||
| return "Internal invariant violated during AST evaluation. This indicates a serious " | ||
| "bug in the evaluation logic."; | ||
| case ErrorCodeEnum::AttemptToIterateAstLeafExpr: | ||
| return "Attempted to iterate an leaf expression of an AST."; |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Fixed grammar in error message
There's a minor grammatical error in the message for AttemptToIterateAstLeafExpr.
case ErrorCodeEnum::AttemptToIterateAstLeafExpr:
- return "Attempted to iterate an leaf expression of an AST.";
+ return "Attempted to iterate a leaf expression of an AST.";| struct IsQueryHandler<QueryHandler<NewProjectedSchemaTreeNodeCallbackType>> : std::true_type {}; | ||
|
|
||
| /** | ||
| * Requirements for a query handler that can be used with`clp::ffi::ir_stream::Deserializer`. |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Minor typo in documentation.
There's a missing space between "with" and "clp::ffi::ir_stream::Deserializer".
-/**
- * Requirements for a query handler that can be used with`clp::ffi::ir_stream::Deserializer`.
+/**
+ * Requirements for a query handler that can be used with `clp::ffi::ir_stream::Deserializer`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * Requirements for a query handler that can be used with`clp::ffi::ir_stream::Deserializer`. | |
| /** | |
| * Requirements for a query handler that can be used with `clp::ffi::ir_stream::Deserializer`. |
| auto push_ast_dfs_stack(AstExprIterator ast_expr_it) -> void { | ||
| m_ast_dfs_stack.emplace_back(ast_expr_it, AstEvaluationResultBitmask{}); | ||
| } | ||
|
|
||
| /** | ||
| * Pops the AST DFS stack and update the evaluation accordingly: | ||
| * - If the stack if not empty, update the parent's evaluation results. | ||
| * - Otherwise, update `query_evaluation_results`. | ||
| * @param evaluation_result | ||
| * @param query_evaluation_result Returns the query evaluation result. | ||
| */ | ||
| auto pop_ast_dfs_stack_and_update_evaluation_results( | ||
| AstEvaluationResult evaluation_result, | ||
| std::optional<AstEvaluationResult>& query_evaluation_result | ||
| ) -> void; |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Potential stack growth without upper bound – consider reserving or recycling
push_ast_dfs_stack() blindly emplace_back()s a pair for every DFS step.
For deeply nested queries this can lead to repeated reallocations or even uncontrolled growth if cycles were ever introduced by malformed ASTs.
m_ast_dfs_stack.reserve(<estimated depth>)in the constructor (e.g., number of nodes in the AST) amortises reallocations.- Alternatively recycle the last vector slot after
pop_ast_dfs_stack_and_update_evaluation_results()to keep memory footprint constant.
Not mission-critical, but a worthwhile optimisation given query depth can be user-controlled.
| /** | ||
| * Evaluates the filter expression against the given kv-pair log event. | ||
| * @param filter_expr | ||
| * @param log_event | ||
| * @return A result containing the evaluation result on success, or an error code indicating the | ||
| * failure: | ||
| * - ErrorCodeEnum::AstEvaluationInvariantViolation if the underlying column of the filter is | ||
| * neither user-generated nor auto-generated. | ||
| * - Forwards `evaluate_wildcard_filter`'s return values. | ||
| * - Forwards `evaluate_filter_against_node_id_value_pair`'s return values. | ||
| */ | ||
| [[nodiscard]] auto evaluate_filter_expr( | ||
| clp_s::search::ast::FilterExpr* filter_expr, | ||
| KeyValuePairLogEvent const& log_event | ||
| ) -> outcome_v2::std_result<AstEvaluationResult>; | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Mixed negation styles – follow project guideline for readability
Inside this file both forms of negation appear:
false == token_it.is_last()
!token_it.is_last()
The repository guideline (coding_guidelines tag) asks to prefer false == <expression> over !<expression>.
For consistency (and smoother future grep-based audits) please replace the !expr occurrences in this block – and elsewhere in the header – with the preferred style.
| /** | ||
| * @return The underlying expression as `clp_s::search::ast::FilterExpr` if the underlying | ||
| * type matches, or nullptr otherwise. | ||
| */ | ||
| [[nodiscard]] auto as_filter_expr() const -> clp_s::search::ast::FilterExpr* { | ||
| if (std::holds_alternative<clp_s::search::ast::FilterExpr*>(m_expr)) { | ||
| return std::get<clp_s::search::ast::FilterExpr*>(m_expr); | ||
| } | ||
| return nullptr; | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Ensure consistent const-correctness across as_*_expr() helpers
as_and_expr() and as_or_expr() both return const*, preserving immutability of the underlying AST nodes.
as_filter_expr() on the other hand returns a non-const pointer, which is inconsistent and allows callers to mutate the AST while iterating – something that is almost never intended during query evaluation.
-[[nodiscard]] auto as_filter_expr() const -> clp_s::search::ast::FilterExpr* {
+[[nodiscard]] auto as_filter_expr() const -> clp_s::search::ast::FilterExpr const* {If write-access is genuinely required, provide a dedicated non-const accessor; otherwise make it const for safety and API symmetry.
| if (false == optional_evaluation_result.has_value()) { | ||
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Add comment explaining the invariant violation.
The code checks if evaluation_result has a value but doesn't explain why it might be missing, making it difficult to understand the error condition.
if (false == optional_evaluation_result.has_value()) {
+ // This should never happen if the AST evaluation logic is correct
+ // It would indicate a bug in the evaluation code
return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation};
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (false == optional_evaluation_result.has_value()) { | |
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | |
| } | |
| if (false == optional_evaluation_result.has_value()) { | |
| // This should never happen if the AST evaluation logic is correct | |
| // It would indicate a bug in the evaluation code | |
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | |
| } |
| auto const* or_expr{expr_it.as_or_expr()}; | ||
| if (nullptr == or_expr) { | ||
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Improve error message for the invariant violation.
The error code doesn't specify which invariant was violated. A more specific error would aid in debugging.
auto const* or_expr{expr_it.as_or_expr()};
if (nullptr == or_expr) {
- return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation};
+ return ErrorCode{ErrorCodeEnum::ExpressionTypeUnexpected};
}| if (false == m_resolved_column_to_schema_tree_node_ids.contains(col)) { | ||
| return AstEvaluationResult::Pruned; | ||
| } | ||
|
|
||
| auto const optional_is_auto_gen{is_auto_generated(col->get_namespace())}; | ||
| if (false == optional_is_auto_gen.has_value()) { | ||
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Add comment explaining the unrecognized namespace case.
This section checks if the namespace is recognized and returns an error if not. Adding a comment about why this case should never occur would help maintainability.
if (false == optional_is_auto_gen.has_value()) {
+ // This should never happen - we should only evaluate columns with recognized namespaces
+ // If this occurs, it indicates a bug in the column resolution logic
return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation};
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (false == m_resolved_column_to_schema_tree_node_ids.contains(col)) { | |
| return AstEvaluationResult::Pruned; | |
| } | |
| auto const optional_is_auto_gen{is_auto_generated(col->get_namespace())}; | |
| if (false == optional_is_auto_gen.has_value()) { | |
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | |
| } | |
| auto const optional_is_auto_gen{is_auto_generated(col->get_namespace())}; | |
| if (false == optional_is_auto_gen.has_value()) { | |
| // This should never happen - we should only evaluate columns with recognized namespaces | |
| // If this occurs, it indicates a bug in the column resolution logic | |
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | |
| } |
| AstEvaluationResultBitmask evaluation_results{}; | ||
| for (auto const matchable_node_id : matchable_node_ids) { | ||
| if (false == node_id_value_pairs.contains(matchable_node_id)) { | ||
| continue; | ||
| } | ||
| auto const evaluation_result{OUTCOME_TRYX(evaluate_filter_against_node_id_value_pair( | ||
| filter_expr, | ||
| matchable_node_id, | ||
| node_id_value_pairs.at(matchable_node_id), | ||
| schema_tree, | ||
| m_case_sensitive_match | ||
| ))}; | ||
| if (AstEvaluationResult::True == evaluation_result) { | ||
| return AstEvaluationResult::True; | ||
| } | ||
| evaluation_results |= evaluation_result; | ||
| } | ||
|
|
||
| if ((evaluation_results & AstEvaluationResult::False) != 0) { | ||
| return AstEvaluationResult::False; | ||
| } | ||
| return AstEvaluationResult::Pruned; |
There was a problem hiding this comment.
🧹 Nitpick (assertive)
Consider refactoring common evaluation pattern.
This evaluation pattern appears multiple times in the code (also in evaluate_wildcard_filter). Consider extracting it to a helper method to avoid duplication.
+ [[nodiscard]] auto evaluate_node_ids_against_filter(
+ clp_s::search::ast::FilterExpr* filter_expr,
+ std::vector<SchemaTree::Node::id_t> const& matchable_node_ids,
+ KeyValuePairLogEvent::NodeIdValuePairs const& node_id_value_pairs,
+ SchemaTree const& schema_tree,
+ bool case_sensitive_match
+ ) -> outcome_v2::std_result<AstEvaluationResult> {
+ AstEvaluationResultBitmask evaluation_results{};
+ for (auto const matchable_node_id : matchable_node_ids) {
+ if (false == node_id_value_pairs.contains(matchable_node_id)) {
+ continue;
+ }
+ auto const evaluation_result{OUTCOME_TRYX(evaluate_filter_against_node_id_value_pair(
+ filter_expr,
+ matchable_node_id,
+ node_id_value_pairs.at(matchable_node_id),
+ schema_tree,
+ case_sensitive_match
+ ))};
+ if (AstEvaluationResult::True == evaluation_result) {
+ return AstEvaluationResult::True;
+ }
+ evaluation_results |= evaluation_result;
+ }
+
+ if ((evaluation_results & AstEvaluationResult::False) != 0) {
+ return AstEvaluationResult::False;
+ }
+ return AstEvaluationResult::Pruned;
+ }Then use it in evaluate_filter_expr:
- AstEvaluationResultBitmask evaluation_results{};
- for (auto const matchable_node_id : matchable_node_ids) {
- if (false == node_id_value_pairs.contains(matchable_node_id)) {
- continue;
- }
- auto const evaluation_result{OUTCOME_TRYX(evaluate_filter_against_node_id_value_pair(
- filter_expr,
- matchable_node_id,
- node_id_value_pairs.at(matchable_node_id),
- schema_tree,
- m_case_sensitive_match
- ))};
- if (AstEvaluationResult::True == evaluation_result) {
- return AstEvaluationResult::True;
- }
- evaluation_results |= evaluation_result;
- }
-
- if ((evaluation_results & AstEvaluationResult::False) != 0) {
- return AstEvaluationResult::False;
- }
- return AstEvaluationResult::Pruned;
+ return evaluate_node_ids_against_filter(
+ filter_expr,
+ matchable_node_ids,
+ node_id_value_pairs,
+ schema_tree,
+ m_case_sensitive_match
+ );📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| AstEvaluationResultBitmask evaluation_results{}; | |
| for (auto const matchable_node_id : matchable_node_ids) { | |
| if (false == node_id_value_pairs.contains(matchable_node_id)) { | |
| continue; | |
| } | |
| auto const evaluation_result{OUTCOME_TRYX(evaluate_filter_against_node_id_value_pair( | |
| filter_expr, | |
| matchable_node_id, | |
| node_id_value_pairs.at(matchable_node_id), | |
| schema_tree, | |
| m_case_sensitive_match | |
| ))}; | |
| if (AstEvaluationResult::True == evaluation_result) { | |
| return AstEvaluationResult::True; | |
| } | |
| evaluation_results |= evaluation_result; | |
| } | |
| if ((evaluation_results & AstEvaluationResult::False) != 0) { | |
| return AstEvaluationResult::False; | |
| } | |
| return AstEvaluationResult::Pruned; | |
| return evaluate_node_ids_against_filter( | |
| filter_expr, | |
| matchable_node_ids, | |
| node_id_value_pairs, | |
| schema_tree, | |
| m_case_sensitive_match | |
| ); |
| auto QueryHandlerImpl::evaluate_kv_pair_log_event(KeyValuePairLogEvent const& log_event) | ||
| -> outcome_v2::std_result<AstEvaluationResult> { | ||
| if (nullptr == m_query) { | ||
| return AstEvaluationResult::True; | ||
| } | ||
|
|
||
| if (m_is_empty_query) { | ||
| return AstEvaluationResult::False; | ||
| } | ||
|
|
||
| std::optional<AstEvaluationResult> optional_evaluation_result; | ||
| m_ast_dfs_stack.clear(); | ||
| push_ast_dfs_stack(OUTCOME_TRYX(AstExprIterator::create(m_query.get()))); | ||
| while (false == m_ast_dfs_stack.empty()) { | ||
| OUTCOME_TRYV(advance_ast_dfs_evaluation(log_event, optional_evaluation_result)); | ||
| } | ||
|
|
||
| if (false == optional_evaluation_result.has_value()) { | ||
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | ||
| } | ||
|
|
||
| return optional_evaluation_result.value(); | ||
| } |
There was a problem hiding this comment.
💡 Verification agent
🧩 Analysis chain
Verify usage of evaluate_kv_pair_log_event.
The static analysis hints indicate this function is never used. However, based on the PR context, this function should be used by the deserializer to evaluate log events against a query. Please verify that this function is correctly integrated with the IR stream deserializer.
🏁 Script executed:
#!/bin/bash
# Search for references to evaluate_kv_pair_log_event in the codebase
rg "evaluate_kv_pair_log_event" -A 2 -B 2Length of output: 3702
🏁 Script executed:
#!/bin/bash
# Inspect the implementation of handle_log_event in the IR stream deserializer
rg -C 5 "handle_log_event" components/core/src/clp/ffi/ir_stream/Deserializer.cppLength of output: 176
🏁 Script executed:
#!/bin/bash
# Locate all Deserializer source files in the IR stream directory
fd -HI 'Deserializer.*' -t f components/core/src/clp/ffi/ir_stream
# Search for handle_log_event in any extension under that directory
rg -C 3 "handle_log_event" components/core/src/clp/ffi/ir_streamLength of output: 3123
🏁 Script executed:
#!/bin/bash
# Show context around query handler usage in the deserializer
rg -C 10 "m_query_handler" components/core/src/clp/ffi/ir_stream/Deserializer.hpp
# Show deserialization and evaluation steps for KV pair log events
rg -C 10 "deserialize_ir_unit_kv_pair_log_event" components/core/src/clp/ffi/ir_stream/Deserializer.hppLength of output: 7201
🏁 Script executed:
#!/bin/bash
# Find all evaluate-related methods in QueryHandler.hpp
rg -n "evaluate" components/core/src/clp/ffi/ir_stream/search/QueryHandler.hpp
# Show the QueryHandler class definition to inspect evaluate() overloads
sed -n '1,200p' components/core/src/clp/ffi/ir_stream/search/QueryHandler.hppLength of output: 4958
🏁 Script executed:
#!/bin/bash
# Search for EmptyQueryHandler implementation
rg -R "struct EmptyQueryHandler" -A 5 -B 5 components/core/src/clp/ffi/ir_stream/searchLength of output: 494
🏁 Script executed:
#!/bin/bash
# Inspect QueryHandlerImpl for evaluate() overloads
rg -n "auto evaluate" -C 5 components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp
# Search for any evaluate(KeyValuePairLogEvent) definitions
rg -n "evaluate(" -C 5 components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp
# Locate the definition of the QueryHandlerReq concept
rg -n "QueryHandlerReq" -C 5 components/core/src/clp/ffi/ir_stream/searchLength of output: 2908
🏁 Script executed:
#!/bin/bash
# Search QueryHandlerImpl.hpp for any evaluate() methods (literal 'evaluate(')
rg -F "evaluate(" -C 5 components/core/src/clp/ffi/ir_stream/search/QueryHandlerImpl.hpp
# Search QueryHandler wrapper for evaluate() shorthand
rg -F "evaluate(" -C 5 components/core/src/clp/ffi/ir_stream/search/QueryHandler.hpp
# Locate the IsQueryHandler trait definition to see expected interface
rg -R "IsQueryHandler" -C 5 components/core/src/clp/ffi/ir_stream/searchLength of output: 655
Ensure evaluate_kv_pair_log_event is actually invoked by the deserializer
The evaluate_kv_pair_log_event method in QueryHandlerImpl is never called: in the deserializer you’re invoking a non-existent m_query_handler.evaluate(…) instead of the intended evaluate_kv_pair_log_event. Please update the call site so that log events are filtered by the query.
• File: components/core/src/clp/ffi/ir_stream/Deserializer.hpp
- Around the
IrUnitType::LogEventcase, change:- if (search::AstEvaluationResult::True - != OUTCOME_TRYX(m_query_handler.evaluate(result.value()))) + if (search::AstEvaluationResult::True + != OUTCOME_TRYX(m_query_handler.evaluate_kv_pair_log_event(result.value())))
• Confirm that EmptyQueryHandler still compiles (it need only satisfy the QueryHandlerReq concept).
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| auto QueryHandlerImpl::evaluate_kv_pair_log_event(KeyValuePairLogEvent const& log_event) | |
| -> outcome_v2::std_result<AstEvaluationResult> { | |
| if (nullptr == m_query) { | |
| return AstEvaluationResult::True; | |
| } | |
| if (m_is_empty_query) { | |
| return AstEvaluationResult::False; | |
| } | |
| std::optional<AstEvaluationResult> optional_evaluation_result; | |
| m_ast_dfs_stack.clear(); | |
| push_ast_dfs_stack(OUTCOME_TRYX(AstExprIterator::create(m_query.get()))); | |
| while (false == m_ast_dfs_stack.empty()) { | |
| OUTCOME_TRYV(advance_ast_dfs_evaluation(log_event, optional_evaluation_result)); | |
| } | |
| if (false == optional_evaluation_result.has_value()) { | |
| return ErrorCode{ErrorCodeEnum::AstEvaluationInvariantViolation}; | |
| } | |
| return optional_evaluation_result.value(); | |
| } | |
| --- a/components/core/src/clp/ffi/ir_stream/Deserializer.hpp | |
| +++ b/components/core/src/clp/ffi/ir_stream/Deserializer.hpp | |
| @@ /* around the IrUnitType::LogEvent case */ | |
| - if (search::AstEvaluationResult::True | |
| - != OUTCOME_TRYX(m_query_handler.evaluate(result.value()))) | |
| + if (search::AstEvaluationResult::True | |
| + != OUTCOME_TRYX(m_query_handler.evaluate_kv_pair_log_event(result.value()))) | |
| { | |
| // existing handling for non-matching events | |
| } |
🧰 Tools
🪛 Cppcheck (2.10-2)
[style] 428-428: The function 'evaluate_kv_pair_log_event' is never used.
(unusedFunction)
✅ Actions performedReviews paused. |
1 similar comment
✅ Actions performedReviews paused. |
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
There was a problem hiding this comment.
Also added test cases for deserializer integration with queries.
Can u help review this? @kirkrodrigues
| auto deserializer_result{ | ||
| make_deserializer(buf_reader, IrUnitHandler{}, std::move(query_handler_result.value())) | ||
| }; |
There was a problem hiding this comment.
fyi, this is what I mean by #890 (comment).
The type is auto-deduced: no need to explicitly state the type of unit handler and query handler.
@kirkrodrigues
| constexpr std::string_view cObjKey{"obj_key"}; | ||
| constexpr std::string_view cUnresolvableKey{"unresolvable_key"}; | ||
| constexpr int cIntBase{INT16_MAX}; | ||
| constexpr std::string_view cTestSubStr{"test"}; |
There was a problem hiding this comment.
- Why substring instead of string?
- Substring is usually one word, right? So then it should be
cTestSubstr.
There was a problem hiding this comment.
- It's a substring since:
- The actual string is not exactly
"test"but a string that contains"test" - We wrap it with two wildcards to generate the string query for testing.
- The actual string is not exactly
- True, true. I thought it's
sub-stringorsub string. mb
| CAPTURE(ir_stream_bytes.size()); | ||
|
|
||
| auto const wildcard_str_query{fmt::format("*{}*", cTestSubStr)}; | ||
| auto const [kql_query_str, expected_json_pairs] = GENERATE_COPY( |
There was a problem hiding this comment.
Did you follow some process to come up with these test cases or were they picked at random?
There was a problem hiding this comment.
Not randomly picked. Based on the JSON pairs input, I created the following:
- String search on non-wildcard columns only matches the flattened ones.
- String search on wildcard columns matches all.
- Integer comparison on non-wildcard columns only matches the flattened ones.
- Integer comparison on wildcard columns also matches inner keys.
- A combination of
ANDon auto-gen and user-gen columns (results should be equivalent to 3). - A combination of
ORon auto-gen and user-gen columns (matches all) - A combination of
ANDwithNOTon auto-gen and user-gen columns (results should be equivalent to 5) - Top-level
NOTwithANDon auto-gen and user-gen columns that matches all. - Unresolvable column that matches nothing.
Search features covered:
- Column resolutions on wildcards
- Column resolutions on auto-gen/user-gen namespaces
- Basic
AND/ORlogic - Inverter logic (
NOT) on filter - Inverter logic (
NOT) on the top - Unresolvable columns
In the next push I will also include a simple case to ensure unresolvable columns are not invertible.
I understand that not all supported types (e.g., boolean and float) and filter operations are covered, but they're tested in query hander's unit tests.
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
feat(kv-ir): Integrate `QueryHandler` into the IR stream `Deserializer`.
QueryHandler into IR stream deserializer.QueryHandler into the IR stream Deserializer.
…r`. (y-scope#890) Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Description
References
QueryHandler'sevaluate_kv_pair_log_event. #882.Overview
This PR integrates
QueryHandlerintoDeserializer.QueryHandleris added as a templated parameter toDeserializer, where it can be an instance ofclp::ffi::ir_stream::search::QueryHandleras introduced in the previous PRs in this series, orclp::ffi::ir_stream::search::EmptyQueryHandler(an empty class) to perform no query handling. This design ensures the type/functionality of a deserializer is determined statically: it's either designed to read the entire stream, or is configured to read the stream with KQL query filtering or column projections.The factory function of the deserializer is templated to have two different versions:
The C++ typing system will treat the deserializers above as different types. However, if users use the
autokeyword for creation, the type can be automatically deduced according to the chosen factory function.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Tests
Chores