Skip to content

[PoC] SQL/PPL support for analytics engine#10

Draft
dai-chen wants to merge 15 commits intomainfrom
poc/analytics-integ-unified-query-pipeline-v2
Draft

[PoC] SQL/PPL support for analytics engine#10
dai-chen wants to merge 15 commits intomainfrom
poc/analytics-integ-unified-query-pipeline-v2

Conversation

@dai-chen
Copy link
Copy Markdown
Owner

@dai-chen dai-chen commented Mar 17, 2026

PoC Result: Unified Query Pipeline (Option B)

Summary

unified-query-pipeline

This PoC validates the end-to-end unified query pipeline as shown in the diagram above:

  1. RestSqlAction routes queries targeting non-Lucene indices to the new RestUnifiedQueryAction; Lucene queries fall through to the existing V2/V3 path unchanged.
  2. RestUnifiedQueryAction calls OpenSearchSchemaBuilder.buildSchema(clusterState) from the Analytics engine to build a Calcite schema with standard SQL types.
  3. UnifiedQueryPlanner.plan() parses the query (Calcite SQL or PPL V3 parser) and generates a logical RelNode against the schema.
  4. AnalyticsExecutionEngine passes the RelNode directly to the Analytics engine's QueryPlanExecutor.execute() with no serialization needed.
  5. QueryPlanExecutor returns Iterable<Object[]> results, which are converted to ExprValue and formatted as default JSON via the existing JdbcResponseFormatter.

Core classes: RestUnifiedQueryAction, AnalyticsExecutionEngine

Key Areas Verified

  • UDT/UDF Impact

    • Verify whether PPL V3's custom datetime UDTs (VARCHAR-backed) contaminate the RelNode when the schema uses standard Calcite types.
    • No impact. OpenSearchSchemaBuilder maps fields to standard Calcite types and unified query planner bypass OpenSearchTypeFactory. Datetime functions like hour() resolve correctly with standard datetime type.
    • Code: RelNode explain output, RelNode object dump
  • RelNode Handoff

    • Verify whether RelNode serialization is needed to pass plans between the SQL plugin and Analytics engine.
    • Not needed. Analytics engine is extensible plugin that shares the classloader with SQL plugin, enabling direct in-process RelNode passing to QueryPlanExecutor.execute().
  • Optimization Boundary

    • Verify where query optimization should live — SQL plugin, Analytics engine, or both.
    • Both. SQL plugin can apply language-specific logical rewrites. Analytics engine applies engine-specific pushdown via custom TableScan.register() rules. Unified query library can provide a new UnifiedQueryPlanner.optimize() API if needed.
    • Code: Example Calcite adapter, New optimize API

Component Responsibility Proposal

SQL/PPL Plugin

  1. Query routing: Detect non-Lucene indices (e.g., by index settings) and route to the unified query pipeline instead of the existing V2/V3 engine
  2. Parsing: Parse SQL via Calcite's native SqlParserSqlValidatorSqlToRelConverter, or PPL via ANTLR → CalciteRelNodeVisitor, producing a logical RelNode
  3. Logical optimization: Apply engine-agnostic HepPlanner rules (FilterMerge, PPL-specific rewrites) to simplify the logical plan before handoff [TBD]
  4. Handoff: Pass the logical RelNode to the Analytics engine's QueryPlanExecutor.execute()
  5. Response formatting: Convert Iterable<Object[]> results from the Analytics engine into default response using existing JdbcResponseFormatter
  6. Error handling: Catch exceptions from the Analytics engine (unsupported operations, execution failures) and return appropriate error responses
  7. Observability: Log planning and execution time; increment request/failure metrics (REQ_TOTAL, REQ_COUNT_TOTAL, FAILED_REQ_COUNT_SYS)
  8. Thread management: Schedule execution on sql-worker thread pool to avoid blocking transport threads

Analytics Engine Plugin

  1. Schema provisioning: Provide OpenSearchSchemaBuilder to build Calcite SchemaPlus from cluster state index mappings with standard Calcite types (no UDTs)
  2. Physical optimization: Replace LogicalTableScan with engine-specific physical scan node, register pushdown rules (filter, project, aggregate, sort, limit) via TableScan.register()
  3. Execution: Execute the physical plan via back-end engines (DataFusion, etc.) and return Iterable<Object[]> result rows
  4. Fail-fast: Throw clear exceptions for unsupported operations (e.g., full-text search functions on Parquet)

API Contracts

SQL/PPL Plugin provides:

  • RelNode: Logical query plan generated from PPL V3 queries (ANTLR → CalciteRelNodeVisitor) and ANSI SQL queries (Calcite native SqlParser → SqlValidator → SqlToRelConverter), including OpenSearch-specific functions (match, match_phrase) as UDF nodes in the plan.

Analytics Engine Plugin provides:

  • Query routing rule: Index settings-based detection of non-Lucene storage engine to determine whether a query should be routed to the unified query pipeline or the existing V2/V3 path
  • SchemaBuilder: OpenSearchSchemaBuilder.buildSchema(ClusterState) returns a Calcite SchemaPlus with standard SQL types (no UDTs) derived from index mappings (e.g., date → TIMESTAMP, keyword → VARCHAR)
  • QueryPlanExecutor: execute(RelNode, context) accepts a logical plan and returns Iterable<Object[]> result rows with column order matching RelNode.getRowType().getFieldList(); schema column types derived from the RelNode's row type

dai-chen added 11 commits March 16, 2026 13:24
Add restructured PoC plan with two key investigation areas:
1) UDT/UDF impact on RelNode processing in Analytics engine
2) RelNode (de-)serialization via Calcite JSON serializer

Phases reordered to tackle SQL/PPL RelNode generation first,
then routing and REST handler, then Analytics engine integration.
Phase 1: extract ANSI SQL logic from UnifiedQueryPlanner rather than
copying the whole file. Note mock schema is temporary until replaced
by Analytics engine SchemaProvider in Phase 4.

Phase 4: add reference link to OpenSearch sandbox plugins.
Add Calcite native SQL parser path (SqlParser -> SqlValidator ->
SqlToRelConverter -> RelNode) to UnifiedQueryPlanner for SQL queries.
PPL path remains unchanged using ANTLR parser -> AST -> CalciteRelNodeVisitor.

Logical optimization deferred to Phase 4 when Analytics engine schema
with pushdown rules is integrated.
Phase 2: PPL queries against a schema with all 5 standard Calcite
types (TIMESTAMP, DATE, TIME, VARBINARY, VARCHAR) produce RelNode
plans with standard SqlTypeName preserved — no UDT interference.

Key Area 1 finding: when schema bypasses OpenSearchTypeFactory,
PPL datetime functions (hour, day) resolve correctly via
PPLTypeChecker bridge and return standard INTEGER type.
Phase 3: Wire end-to-end unified query pipeline:
- AnalyticsExecutionEngine stubs transport action handoff to Analytics
  engine, returns empty results (TODO: serialize RelNode and submit)
- RestUnifiedQueryAction routes through UnifiedQueryPlanner -> RelNode
  -> AnalyticsExecutionEngine, formats via JdbcResponseFormatter
- RestSqlAction routes SQL queries with 'parquet_' to unified path
- RestPPLQueryAction routes PPL queries with 'parquet_' to unified path
- Add :api dependency to legacy/ and plugin/ modules
… builder

Phase 4: Analytics engine integration:
- New analytics-engine-stub/ submodule with SPI interfaces copied from
  analytics-framework (EngineContext, QueryPlanExecutor, SchemaProvider,
  EngineBridge, AnalyticsBackEndPlugin)
- OpenSearchSchemaBuilder copied from analytics-engine: builds Calcite
  SchemaPlus from ClusterState index mappings with standard SqlTypeName
- RestUnifiedQueryAction now uses real cluster state schema via
  OpenSearchSchemaBuilder instead of empty context
- Wire ClusterService through RestSqlAction and RestPPLQueryAction
Verify end-to-end flow: REST routing -> UnifiedQueryPlanner -> RelNode
-> AnalyticsExecutionEngine (stub) -> JdbcResponseFormatter.

AnalyticsExecutionEngine now derives schema from RelNode row type
instead of returning empty schema, enabling proper JDBC response
with correct column names and types.

Tests: SQL select with filter, SQL aggregate, non-parquet regression.
Add 7 integration tests covering SQL/PPL query and explain paths:
- SQL select with filter, SQL aggregate
- PPL where+project, PPL stats
- PPL datetime function (Key Area 1: no UDT in plan)
- PPL match() search function (MAP arguments in plan)
- Non-parquet regression test

Each test uses fluent withSQL/withPPL().verifySchema().verifyDataRows()
.verifyExplain() pattern asserting full logical plan output.

Refactor: move isUnifiedQueryPath() to RestUnifiedQueryAction,
deduplicate routing logic from RestSqlAction and RestPPLQueryAction.
Key findings from Phase 5 investigation:
- Transport action is NOT needed: both plugins run in same JVM
- Analytics engine uses ExtensiblePlugin for SPI discovery, not
  transport actions. Exposes QueryPlanExecutor via Guice bindings.
- SQL plugin has its own private Guice injector, isolated from the
  node-level injector where Analytics engine bindings live.
- RelNode can be passed directly in-process, no serialization needed.
- Calcite JSON serializer works as fallback for future cross-node needs.

Replace Phase 5 (serialization) and Phase 6 (transport action) with
Phase 5 (direct Guice integration with QueryPlanExecutor).
…findings

Key finding: each OpenSearch plugin gets its own URLClassLoader.
Even though both SQL and Analytics plugins bundle calcite-core:1.41.0,
RelNode is a different Class in each classloader -> ClassCastException.

Solution: SQL plugin declares extendedPlugins = ['analytics-engine'],
inheriting Analytics engine's classloader as parent. Both share the
same Calcite classes, enabling direct RelNode passing without
serialization.

Calcite JSON serialization validated as fallback option.
Phase 5: Direct plugin extension integration:
- Remove analytics-engine-stub module, use real analytics-framework
  and analytics-engine jars from OpenSearch sandbox build in libs/
- SQL plugin declares extendedPlugins = ['analytics-engine'], sharing
  Calcite classloader — no serialization needed for RelNode handoff
- AnalyticsExecutionEngine uses real QueryPlanExecutor interface
- Exclude overlapping jars from SQL plugin zip to avoid jar hell
- Analytics-engine plugin loaded in integ-test cluster
- All 7 ITs pass with real analytics-engine plugin loaded
@dai-chen dai-chen self-assigned this Mar 17, 2026
Document PoC findings including:
- Key Area 1: UDT interference (none with standard Calcite types)
- Key Area 2: RelNode handoff (direct via extendedPlugins, no serialization)
- Key Area 3: Optimization boundary (SQL plugin owns logical rewrites,
  Analytics engine owns physical pushdown)
- Component responsibilities ordered by workflow
- API contract: routing, schema, executor, result format
…atting

Wire Iterable<Object[]> from QueryPlanExecutor through ExprTupleValue
to the existing JdbcResponseFormatter chain:
- Each Object[] row mapped to ExprTupleValue keyed by column names
  from RelNode.getRowType()
- ExprValueUtils.fromObjectValue() handles all standard Java types
- QueryResponse(Schema, List<ExprValue>, Cursor) flows through
  existing formatter unchanged
Phase 6 cross-cutting concerns:
- Schedule execution on sql-worker thread pool to avoid blocking
  transport threads (same pattern as OpenSearchExecutionEngine)
- Log planning and execution time with [unified] prefix
- Increment REQ_TOTAL/REQ_COUNT_TOTAL on success,
  FAILED_REQ_COUNT_SYS on failure
- Update PoC result doc with observability and thread management
  responsibilities
Demonstrate RelNode object tree structure via RelJsonWriter:
- demoTableScanIsStandardCalcite: LogicalTableScan with no custom node
- demoDatetimeFunctionProducesStandardTypes: HOUR as UDF producing
  standard INTEGER type, no UDT
- demoMatchFunctionIsPplUdf: match() as PPL UDF with MAP operands,
  class=UserDefinedFunctionBuilder - Analytics engine must handle/reject

Add UnifiedQueryPlanner.optimize() API:
- Runs VolcanoPlanner with adapter-specific pushdown rules
- Test simulates Analytics engine with custom EngineTableScan and
  filter absorption rule: LogicalFilter absorbed into EngineTableScan
  after optimize()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant