[RFC] Support eval command in PPL #4014

@ahkcs

Description

The PPL eval command has feature gaps that affect essential data processing workflows: string concatenation with the + operator fails with a runtime error, integer division truncates its result, and the command is not pushed down for distributed execution.

Current Behavior vs Expected Behavior

1. String Concatenation Failure

Current Behavior: String concatenation using the + operator fails with a runtime error when attempting to combine string fields or literals.

Expected Behavior: The + operator should support string concatenation to enable text processing workflows.

Error Message:

java.lang.RuntimeException: java.sql.SQLException: Error while preparing plan
LogicalProject... full_name=[+(+($1, ' '), $10)]
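A minimal sketch of a request that exercises this case, modeled on the curl reproduction used elsewhere in this issue. The field names `firstname` and `lastname` are assumptions (the plan above only shows positional references `$1` and `$10`), and the variable names are illustrative. The curl call is left commented out so the snippet only builds and prints the payload.

```shell
# Hypothetical reproduction: firstname/lastname are assumed field names,
# not taken from the error output.
QUERY="source=accounts | eval full_name = firstname + ' ' + lastname | fields firstname, lastname, full_name | head 3"

# Wrap the query in the JSON body expected by the PPL endpoint.
PAYLOAD=$(printf '{"query": "%s"}' "$QUERY")
echo "$PAYLOAD"

# To send it to a local cluster:
# curl -X POST "localhost:9200/_plugins/_ppl" \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```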

2. Integer Division Precision Loss

Current Behavior: PPL performs strict integer division, resulting in truncated results when dividing integer values.

Expected Behavior: Division operations should preserve decimal precision for accurate calculations.

Example:

  • Input: balance = 16623, Operation: balance / 1000
  • Current Result: 16 (truncated)
  • Expected Result: 16.623 (with decimal precision)
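The difference between the two results can be illustrated outside PPL with plain shell arithmetic. This is only an analogy for truncating versus decimal division, not how the plugin evaluates the expression; the variable names are illustrative.

```shell
balance=16623

# Shell integer arithmetic truncates, like the current PPL behavior.
truncated=$((balance / 1000))

# awk divides in floating point, matching the expected result.
precise=$(LC_ALL=C awk -v b="$balance" 'BEGIN { printf "%.3f", b / 1000 }')

echo "truncated=$truncated precise=$precise"
```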

Related issue:
#3946

3. Performance Architecture Limitation

Current Behavior: The eval command executes only on the coordinating node and is not rewritten to OpenSearch DSL.

Expected Behavior: Eval operations should be optimized and pushed down to OpenSearch DSL for distributed execution and better performance.
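As a rough illustration of what a pushed-down form might look like, the sketch below expresses `eval balance_k = balance / 1000` as a `script_fields` entry in a search request, so each shard computes the value. This is an assumption about one possible target shape, not the rewrite the plugin performs or is proposed to perform. The index name `accounts` comes from the queries in this issue; the script body and variable name are illustrative.

```shell
# Hypothetical DSL equivalent of: eval balance_k = balance / 1000
# (assumed shape only; dividing by 1000.0 keeps the decimal part)
DSL_BODY=$(cat <<'EOF'
{
  "size": 3,
  "_source": ["account_number", "balance"],
  "script_fields": {
    "balance_k": {
      "script": {
        "lang": "painless",
        "source": "doc['balance'].value / 1000.0"
      }
    }
  }
}
EOF
)
echo "$DSL_BODY"

# To send it to a local cluster:
# curl -X POST "localhost:9200/accounts/_search" \
#   -H "Content-Type: application/json" \
#   -d "$DSL_BODY"
```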

Related issue:
#3387

Integer Division Issue:

# Division returns integer results instead of decimals
curl -X POST "localhost:9200/_plugins/_ppl" \
-H "Content-Type: application/json" \
-d '{
  "query": "source=accounts | eval balance_k = balance / 1000, daily_rate = balance / 365 | sort account_number | fields account_number, balance, balance_k, daily_rate | head 3"
}'

Impact

  • String Operations: Complete blockage of text processing and string manipulation workflows
  • Numeric Precision: Loss of precision in financial calculations, statistical analysis, and ratio computations
  • Performance: Potential bottleneck for large-scale data processing due to single-node execution

Labels: PPL (Piped processing language), RFC (Request For Comments)