feat(clp-s): Add support for searching KV-pair IR streams through the clp-s cli.#909
Conversation
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
WalkthroughThis change introduces a key-value IR stream search feature to the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant clp-s (main)
participant search_kv_ir_stream
participant StreamReader
participant ZstdDecompressor
participant Deserializer
participant IrUnitHandler
User->>clp-s (main): Provide input file(s)
clp-s (main)->>clp-s (main): Detect IR file extension
alt Is IR file
clp-s (main)->>search_kv_ir_stream: Attempt IR stream search
search_kv_ir_stream->>StreamReader: Open input path
search_kv_ir_stream->>ZstdDecompressor: Wrap stream reader
search_kv_ir_stream->>Deserializer: Create and start deserialization
Deserializer->>IrUnitHandler: Process IR units (log events, etc.)
IrUnitHandler->>clp-s (main): Output results to stdout
Deserializer-->>search_kv_ir_stream: Return status
search_kv_ir_stream-->>clp-s (main): Return outcome
alt Success
clp-s (main)->>clp-s (main): Continue to next input
else Truncated or unsupported
clp-s (main)->>clp-s (main): Log warning, fallback to archive search
else Fatal error
clp-s (main)->>clp-s (main): Log error, exit
end
else Not IR file
clp-s (main)->>clp-s (main): Proceed with archive search
end
Possibly related PRs
Suggested reviewers
Note ⚡️ AI Code Reviews for VS Code, Cursor, WindsurfCodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback. Note ⚡️ Faster reviews with cachingCodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 30th. To opt out, configure 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🧰 Additional context used📓 Path-based instructions (1)`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
⏰ Context from checks skipped due to timeout of 90000ms (10)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 14
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (11)
components/core/CMakeLists.txt(1 hunks)components/core/src/clp/ffi/ir_stream/Deserializer.hpp(9 hunks)components/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/test_deserializer_integration.cpp(1 hunks)components/core/src/clp/ffi/ir_stream/search/test/utils.hpp(3 hunks)components/core/src/clp_s/CMakeLists.txt(4 hunks)components/core/src/clp_s/clp-s.cpp(4 hunks)components/core/src/clp_s/kv_ir_search.cpp(1 hunks)components/core/src/clp_s/kv_ir_search.hpp(1 hunks)components/core/tests/test-ir_encoding_methods.cpp(2 hunks)taskfiles/lint.yaml(2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/tests/test-ir_encoding_methods.cppcomponents/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hppcomponents/core/src/clp_s/kv_ir_search.hppcomponents/core/src/clp_s/clp-s.cppcomponents/core/src/clp_s/kv_ir_search.cppcomponents/core/src/clp/ffi/ir_stream/search/test/test_deserializer_integration.cppcomponents/core/src/clp/ffi/ir_stream/search/test/utils.hppcomponents/core/src/clp/ffi/ir_stream/Deserializer.hpp
🧠 Learnings (1)
components/core/tests/test-ir_encoding_methods.cpp (1)
Learnt from: LinZhihao-723
PR: y-scope/clp#570
File: components/core/tests/test-ir_encoding_methods.cpp:376-399
Timestamp: 2024-11-01T03:26:26.386Z
Learning: In the test code (`components/core/tests/test-ir_encoding_methods.cpp`), exception handling for `msgpack::unpack` can be omitted because the Catch2 testing framework captures exceptions if they occur.
🧬 Code Graph Analysis (1)
components/core/tests/test-ir_encoding_methods.cpp (1)
components/core/src/clp/ffi/ir_stream/search/test/utils.hpp (3)
unpack_and_serialize_msgpack_bytes(122-126)unpack_and_serialize_msgpack_bytes(149-186)unpack_and_serialize_msgpack_bytes(149-153)
🪛 Cppcheck (2.10-2)
components/core/src/clp_s/kv_ir_search.cpp
[error] 245-245: Syntax Error
(internalAstError)
components/core/src/clp/ffi/ir_stream/search/test/test_deserializer_integration.cpp
[error] 245-245: Syntax Error
(internalAstError)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-15, false)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-14, true)
🔇 Additional comments (8)
taskfiles/lint.yaml (2)
168-170: Appropriate suppression of known msgpack-c clang-tidy warningThe
-line-filterentry excludescpp11_zone.hpp:197, working around msgpack-c issue #1098. This prevents spurious warnings during the diff-based analysis.
756-758: Consistent lint suppression across full analysisMirroring the same
-line-filterincheck-cpp-static-fullensures both diff and full clang-tidy runs remain aligned and ignore the same msgpack-c issue.components/core/src/clp_s/CMakeLists.txt (1)
259-260: Linking againstystdlib::error_handlinglooks correctNice catch adding the new target – this guarantees the category singleton used by
KvIrSearchErroris properly linked.components/core/src/clp/ffi/ir_stream/search/QueryHandlerReq.hpp (1)
41-45: Concept is well-defined and future-proofThe
QueryHandlerReqconcept cleanly captures the intended constraints (empty handler OR recognised handler + move-constructible). This will make downstream template diagnostics far clearer than relying on SFINAE alone.components/core/src/clp_s/kv_ir_search.hpp (1)
24-45: VerifyPathis in scope
Pathis used in the function signature but is not declared in this header. It is probably brought in viaInputConfig.hpp, yet relying on transitive includes makes the header fragile.Please either
#include "Path.hpp"(or whichever header owns the alias), or- forward-declare it (
class Path;) if inclusion would form a cycle.This prevents accidental breakage when include order changes.
components/core/src/clp_s/kv_ir_search.cpp (2)
130-149: Good guard against unsupported output modesFail-fast validation inside
IrUnitHandler::createis clear and emits actionable log messages.
244-255: Static analysis “syntax error” is likely a false positiveCppcheck flagged a syntax error around here, but the code is syntactically valid C++17/20. No action required unless the CI build reproduces the issue.
🧰 Tools
🪛 Cppcheck (2.10-2)
[error] 245-245: Syntax Error
(internalAstError)
components/core/src/clp/ffi/ir_stream/Deserializer.hpp (1)
309-316: Filtered-out events are reported asIrUnitType::LogEvent– confirm caller expectationsWhen the query handler evaluates to
False, the switchbreak;skipshandle_log_event, yet the outer function still returnsIrUnitType::LogEvent. Callers may interpret this as “a log event was delivered”.If downstream logic counts or otherwise uses the return value to track processed records, this could lead to under- or over-counting.
Please verify whether the caller relies on this signal. If suppression is required, returning a dedicated enum (e.g.,
IrUnitType::FilteredOut) orIrUnitType::NoOpmight be clearer.
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (2)
components/core/src/clp_s/kv_ir_search.hpp (1)
33-38: Fix enum name in the Doxygen commentThe comment refers to
KvIrStreamErrorEnum, but the actual enum isKvIrSearchErrorEnum. The mismatch will confuse users and IDE intellisense.- * - KvIrStreamErrorEnum::ClpLegacyError if a `clp::TraceableException` is caught. - * - KvIrStreamErrorEnum::CountSupportNotImplemented if count-related features are enabled. - * - KvIrStreamErrorEnum::ProjectionSupportNotImplemented if projection is non-empty. - * - KvIrStreamErrorEnum::StreamReaderCreationFailure if the stream reader cannot be successfully + * - KvIrSearchErrorEnum::ClpLegacyError if a `clp::TraceableException` is caught. + * - KvIrSearchErrorEnum::CountSupportNotImplemented if count-related features are enabled. + * - KvIrSearchErrorEnum::ProjectionSupportNotImplemented if projection is non-empty. + * - KvIrSearchErrorEnum::StreamReaderCreationFailure if the stream reader cannot be successfullycomponents/core/src/clp_s/kv_ir_search.cpp (1)
230-233: Minor: consider early-exit without while-loopOnce
deserialize_next_ir_unitreturnsEndOfStream, the loop condition is re-evaluated and exits. Converting to afor (;;)orwhile (true)with an internalbreakafter EOS can slightly improve readability, but this is purely stylistic.- while (IrUnitType::EndOfStream - != OUTCOME_TRYX(deserializer.deserialize_next_ir_unit(stream_reader))) - {} + for (;;) { + auto ir_unit_type = OUTCOME_TRYX(deserializer.deserialize_next_ir_unit(stream_reader)); + if (IrUnitType::EndOfStream == ir_unit_type) { + break; + } + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/core/src/clp_s/kv_ir_search.cpp(1 hunks)components/core/src/clp_s/kv_ir_search.hpp(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}`: - Prefer `false == ` rather than `!`.
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}: - Preferfalse == <expression>rather than!<expression>.
components/core/src/clp_s/kv_ir_search.hppcomponents/core/src/clp_s/kv_ir_search.cpp
🪛 Cppcheck (2.10-2)
components/core/src/clp_s/kv_ir_search.cpp
[error] 245-245: Syntax Error
(internalAstError)
⏰ Context from checks skipped due to timeout of 90000ms (11)
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: lint-check (ubuntu-latest)
- GitHub Check: build-macos (macos-14, false)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-13, false)
- GitHub Check: build-macos (macos-13, true)
- GitHub Check: build-macos (macos-14, true)
🔇 Additional comments (4)
components/core/src/clp_s/kv_ir_search.cpp (4)
209-219: LGTM: Query handler creation follows coding guidelinesThe query handler creation code correctly follows the coding guideline of using
false == <expression>rather than!<expression>at line 217, making the code more readable and consistent.
244-247: LGTM: Projection validation follows coding guidelinesThe projection validation code correctly follows the coding guideline of using
false == <expression>rather than!<expression>at line 244, making the code more readable and consistent.🧰 Tools
🪛 Cppcheck (2.10-2)
[error] 245-245: Syntax Error
(internalAstError)
80-85: LGTM: UTC offset change handling is correctGood implementation of the
handle_utc_offset_changefunction. It correctly returnsIRErrorCode::IRErrorCode_Successinstead of an error code, allowing the deserializer to continue processing streams with legitimate timezone transitions without interruption.
159-192:Details
❌ Incorrect review comment
Review error handling strategy for JSON serialization
The current implementation returns
IRErrorCode_Decode_ErrororIRErrorCode_Corrupted_IRdepending on which part of the serialization fails. Consider if these should be distinct error codes to aid debugging, or if they should be consolidated for consistency.Additionally, the logger outputs to the error stream, but the code continues processing other log events. Consider if this is the intended error recovery strategy.
🏁 Script executed:
#!/bin/bash # Look for similar error handling patterns in the codebase rg -A 3 "IRErrorCode::IRErrorCode_(Decode_Error|Corrupted_IR)" --glob "*.{cpp,hpp}"Length of output: 8724
Error codes and recovery strategy are intentional and consistent
The distinction between
IRErrorCode::IRErrorCode_Decode_Error(for JSON parse failures) andIRErrorCode::IRErrorCode_Corrupted_IR(for downstream serialization issues) is documented and exercised incomponents/core/tests/test-ir_encoding_methods.cpp. The pattern of logging the error and returning to continue processing further events matches other IR handlers and is intended for non-fatal errors. No changes needed.Likely an incorrect or invalid review comment.
gibber9809
left a comment
There was a problem hiding this comment.
Just a few minor comments, but change mostly looks good.
gibber9809
left a comment
There was a problem hiding this comment.
LGTM. Maybe for the title
feat(clp-s): Add support for searching KV-pair IR streams through the clp-s cli.
Looking through our past commit messages seems like we usually call it "KV-pair IR format" or "KV-pair IR stream" instead of the shorter "KV-IR".
Sure. Let's keep using "KV-pair IR streams" for this PR to be explicit. In the future PRs, we might use |
… clp-s cli. (y-scope#909) Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Description
References
QueryHandlerinto the IR streamDeserializer. #890.Overview
This PR adds cli support for searching KV-IR streams through the
clp-sexecutable. It is built on top of the currentclp-ssearch utilities, assuming that the input path that contains.clp.zstwill be considered an IR file.As the first PR to support KV-IR search, the current implementation has the following limitations:
stdoutoutput. The current implementation can be extended to support other output methods when needed.clp::ffi::KeyValuePairLogEvent. #905.countoperations.Some behaviours to notice:
{"auto_generated_kv_pairs":{xxx},"user_generated_kv_pairs":{xxx}}..clp.zst, we will fall back to archive search just in case the file path of the archive unintentionally contains the IR extension.Checklist
breaking change.
Validation performed
.clp.zstin its path works as expected (will actually perform archive search).Summary by CodeRabbit
New Features
Bug Fixes
Chores