This is the implementation plan for #747. The prototype implementation is in this branch.
Overview
As an extension of #747, this issue discusses the implementation plan and tracks its progress. To support searching kv-pair IR streams, we will break the required features into a PR series with the following steps:
Relevant KQL AST Utilities: make clp-s KQL queries compatible with kv-pair IR streams.
Projection Handler Interface: define the interface for handling projections.
Query Handler: handle the core search logic during deserialization.
Deserializer Integration: integrate the features above into the current IR deserializer.
clp-s Integration: integrate the overall search feature into clp-s' CLI.
Relevant KQL AST Utilities
The current KQL AST is designed to operate on clp-s archives, whose schema tree implementation differs from that of the kv-pair IR stream.
To adapt to these differences, we need some utility code to:
Convert IR's schema tree type to clp_s::search::ast::LiteralTypeBitmask
Convert IR's node-type-value-pair to clp_s::search::ast::LiteralType
Evaluate a KQL filter expression against the deserialized IR value.
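The first two conversions can be sketched as follows. This is only an illustration under stated assumptions: `IrNodeType`, `IrValue`, and the local `LiteralType` enum are simplified, hypothetical stand-ins for the real IR schema tree and `clp_s::search::ast` definitions, and the rule that a string containing a space maps to a CLP string is assumed here for illustration. The point it shows is that an IR node type alone maps to a set of candidate literal types, while a node type plus a concrete value maps to exactly one.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Hypothetical stand-ins for the IR schema tree node type and the AST literal
// types; the real definitions live in the clp code base.
enum class IrNodeType : uint8_t { Int, Float, Bool, Str, Obj };

using LiteralTypeBitmask = uint32_t;
enum LiteralType : LiteralTypeBitmask {
    IntegerT = 1U << 0,
    FloatT = 1U << 1,
    BooleanT = 1U << 2,
    ClpStringT = 1U << 3,
    VarStringT = 1U << 4,
    ObjectT = 1U << 5,
    NullT = 1U << 6,
};

// std::monostate models a null value in this sketch.
using IrValue = std::variant<std::monostate, int64_t, double, bool, std::string>;

// Node type alone: every literal type a value under this node could have.
constexpr LiteralTypeBitmask node_type_to_bitmask(IrNodeType type) {
    switch (type) {
        case IrNodeType::Int: return IntegerT;
        case IrNodeType::Float: return FloatT;
        case IrNodeType::Bool: return BooleanT;
        case IrNodeType::Str: return ClpStringT | VarStringT;
        case IrNodeType::Obj: return ObjectT | NullT;
    }
    return 0;
}

// Node type plus value: the single literal type of this value, or nullopt if
// the value does not fit the node type.
inline std::optional<LiteralType>
node_type_value_to_literal_type(IrNodeType type, IrValue const& value) {
    switch (type) {
        case IrNodeType::Int:
            if (std::holds_alternative<int64_t>(value)) { return IntegerT; }
            break;
        case IrNodeType::Float:
            if (std::holds_alternative<double>(value)) { return FloatT; }
            break;
        case IrNodeType::Bool:
            if (std::holds_alternative<bool>(value)) { return BooleanT; }
            break;
        case IrNodeType::Str:
            if (auto const* str = std::get_if<std::string>(&value)) {
                // Assumed rule: strings with spaces are treated as CLP strings.
                return std::string::npos != str->find(' ') ? ClpStringT : VarStringT;
            }
            break;
        case IrNodeType::Obj:
            if (std::holds_alternative<std::monostate>(value)) { return NullT; }
            break;
    }
    return std::nullopt;
}
```

With helpers of this shape, a filter on a column can first be checked against the bitmask (to skip nodes whose type can never match) before any value is evaluated.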
Projection Handler Interface
As discussed in #747, users need to define their own projection resolution handler to maintain the full-key-to-node-ID mapping across the stream.
This mapping should be applied by user-level code, which is outside the deserializer.
The Projection Handler is a concept that defines the interface for users to implement their own logic to handle projection resolution.
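As a rough illustration of such an interface, the sketch below models a projection handler as any callable taking a node ID and a full key. All names here (`NodeId`, `is_projection_handler_v`, `KeyToNodeIdRecorder`, `notify_projected_key_resolved`) are hypothetical, and the actual signature may differ. The constraint is written as a C++17 type trait so it compiles without concepts support; the same requirement could be expressed as a C++20 concept.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <type_traits>
#include <unordered_map>

// Hypothetical node ID type; the real one is defined by the IR schema tree.
using NodeId = uint32_t;

// A projection handler is modeled as any callable taking (node ID, full key).
template <typename T>
inline constexpr bool is_projection_handler_v
        = std::is_invocable_r_v<void, T&, NodeId, std::string_view>;

// Example user-level handler that maintains the full-key-to-node-ID mapping.
class KeyToNodeIdRecorder {
public:
    void operator()(NodeId node_id, std::string_view full_key) {
        m_key_to_id.insert_or_assign(std::string{full_key}, node_id);
    }

    [[nodiscard]] std::optional<NodeId> find(std::string_view full_key) const {
        auto const it = m_key_to_id.find(std::string{full_key});
        if (m_key_to_id.end() == it) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, NodeId> m_key_to_id;
};

static_assert(is_projection_handler_v<KeyToNodeIdRecorder>);
static_assert(!is_projection_handler_v<int>);

// Stand-in for the deserializer side: it only notifies the handler and never
// stores the mapping itself.
template <typename ProjectionHandlerT>
void notify_projected_key_resolved(
        ProjectionHandlerT& handler,
        NodeId node_id,
        std::string_view full_key
) {
    static_assert(
            is_projection_handler_v<ProjectionHandlerT>,
            "Handler must be callable with (NodeId, std::string_view)"
    );
    handler(node_id, full_key);
}
```

Keeping the mapping inside the user's handler matches the split described above: the deserializer reports resolutions, and user-level code decides how to store and apply them.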
Query Handler
The query handler is an object that:
Holds all the data structures needed to perform streaming IR search/projection.
Is responsible for evaluating the query on the deserialized node-ID-value pairs.
Since it is designed to be part of the deserializer, the following APIs are needed:
column_resolution_update: Handles column resolution and updates the key-to-node-ID mapping required by the query or the projection (possibly calling the projection handler). Called whenever a schema tree node insertion IR unit is deserialized.
evaluate_node_id_value_pairs: Executes the query on the given node-ID-value pairs. Called whenever a log event IR unit is deserialized.
This object implements the core search logic, so most of the engineering effort will be spent here.
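To make the two APIs concrete, here is a deliberately minimal sketch. It assumes a query of the form "full key equals an integer", a root node with ID 0, dot-separated full keys, and integer-only values; the parameter lists are hypothetical and much simpler than what a real KQL query handler would need.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>
#include <vector>

using NodeId = uint32_t;       // hypothetical node ID type
constexpr NodeId cRootId = 0;  // assumed ID of the schema tree root

// Minimal query handler for a single filter of the form `<column> == <value>`.
class QueryHandler {
public:
    QueryHandler(std::string column, int64_t expected)
            : m_column{std::move(column)},
              m_expected{expected} {}

    // Called for every schema tree node insertion. Builds the node's full key
    // from its parent's and records the node if it resolves the query column.
    // Assumes parents are always inserted before their children.
    void column_resolution_update(NodeId parent_id, std::string_view key, NodeId node_id) {
        std::string full_key;
        if (cRootId != parent_id) {
            full_key = m_full_keys.at(parent_id) + ".";
        }
        full_key += key;
        if (full_key == m_column) {
            m_resolved_ids.push_back(node_id);
        }
        m_full_keys.emplace(node_id, std::move(full_key));
    }

    // Called for every log event. Returns true if any resolved node carries the
    // expected value.
    [[nodiscard]] bool
    evaluate_node_id_value_pairs(std::unordered_map<NodeId, int64_t> const& pairs) const {
        for (auto const node_id : m_resolved_ids) {
            auto const it = pairs.find(node_id);
            if (pairs.end() != it && it->second == m_expected) {
                return true;
            }
        }
        return false;
    }

private:
    std::string m_column;
    int64_t m_expected;
    std::unordered_map<NodeId, std::string> m_full_keys;
    std::vector<NodeId> m_resolved_ids;
};
```

Because the schema tree grows as the stream is read, resolution is incremental: a column may resolve to no node at first and to several nodes later, which is why the sketch keeps a list of resolved IDs rather than a single one.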
Deserializer Integration
The components discussed above will eventually be integrated into the current IR deserializer. If a query is given, the deserializer will handle it as follows:
For a schema tree node insertion IR unit, the deserializer will call the query handler's API to update the column resolution.
For a log event IR unit, the deserializer will call the query handler's API to evaluate the query, and will only call the user-defined log event handler if the query evaluates to true.
If the query evaluates to false, a dedicated error code will be returned to indicate that the log event was deserialized successfully but did not match the query.
To avoid potential overhead on deserialization without a query, we can use templates to statically determine whether a query handler branch should be involved.
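One possible way to do this, sketched below, is to template the log-event path on the query handler type and use `if constexpr` so that the query branch is compiled out when no query is given. `EmptyQueryHandler`, `MinValueQuery`, `DeserializeResult`, and `handle_log_event` are hypothetical names, and a log event is reduced to a single integer to keep the example short.

```cpp
#include <cstdint>
#include <type_traits>

// Tag type meaning "no query was given".
struct EmptyQueryHandler {};

// Toy query handler: matches events whose value is at least `min_value`.
struct MinValueQuery {
    int64_t min_value;

    [[nodiscard]] bool evaluate(int64_t value) const { return value >= min_value; }
};

// `FilteredOut` plays the role of the dedicated code for "deserialized
// successfully but did not match the query".
enum class DeserializeResult { Handled, FilteredOut };

template <typename QueryHandlerT, typename LogEventHandlerT>
DeserializeResult handle_log_event(
        [[maybe_unused]] QueryHandlerT& query_handler,
        LogEventHandlerT&& on_log_event,
        int64_t event
) {
    // This branch is discarded at compile time when no query handler is used,
    // so the no-query path pays nothing for the search feature.
    if constexpr (!std::is_same_v<QueryHandlerT, EmptyQueryHandler>) {
        if (!query_handler.evaluate(event)) {
            return DeserializeResult::FilteredOut;
        }
    }
    on_log_event(event);
    return DeserializeResult::Handled;
}
```

With `EmptyQueryHandler`, every event reaches the user-defined handler; with `MinValueQuery`, only matching events do, and the caller can tell the two outcomes apart from the return value.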
clp-s Integration
This is the final step in this PR series. It integrates the above components into clp-s and exposes the basic search features through the command line.
Dependency Graph
PRs should be scheduled according to this flowchart.