[RFC] Enhance sort command in PPL #3931

@ritvibhatt

Enhance Sort Command in PPL

1. Overview

Description

This RFC proposes enhancing the existing sort command to add support for result count limiting, field type specification, and global sort direction reversal. These features solve common sorting challenges like getting top results and handling mixed data types.

Use Cases

  • Limit the number of sorted results returned
  • Apply type specification to ensure correct sorting behavior for different data types
  • Reverse the hierarchical sort order across all fields

2. Proposed Syntax

sort [count] [+|-] sort-field, [+|-] sort-field, ... [desc|d]

Arguments

  • count (optional): Integer value limiting the number of results returned (default: 10000)
  • [+|-] (optional): Sort direction prefix per field
    • +: Ascending order with NULL/MISSING first (default)
    • -: Descending order with NULL/MISSING last
  • sort-field: Field name to sort by (mandatory), with optional type specification:
    • auto(field): Use field's natural type (default behavior)
    • str(field): Sort field as string type
    • num(field): Sort field as numeric (double) type
    • ip(field): Sort field as IP address type
  • desc or d (optional): Global modifier that reverses the sort direction of every field. With multiple fields the reversal is hierarchical: the order of the first field is reversed, then rows that tie on the first field have the order of the second field reversed, and so on.
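
To make the NULL/MISSING placement above concrete, here is a minimal Python sketch that emulates it over in-memory rows. It is not the plugin's implementation: `sort_rows` is a hypothetical helper, and representing NULL as `None` and MISSING as an absent dictionary key is an assumption made for illustration.

```python
from typing import Any, Dict, List


def sort_rows(rows: List[Dict[str, Any]], field: str, ascending: bool = True) -> List[Dict[str, Any]]:
    """Emulate '+field' (ascending=True) and '-field' (ascending=False)."""

    def key(row: Dict[str, Any]):
        value = row.get(field)  # None for both NULL and MISSING (assumption)
        # (is_present, value): False sorts before True, so NULL/MISSING rows
        # come first when ascending. The 0 placeholder is never compared with
        # a real value because the first tuple element already differs.
        return (value is not None, 0 if value is None else value)

    # reverse=True flips the whole ordering, which moves NULL/MISSING to the end.
    return sorted(rows, key=key, reverse=not ascending)


rows = [{"age": 30}, {"age": None}, {"name": "no age field"}, {"age": 25}]
ascending_ages = [r.get("age") for r in sort_rows(rows, "age")]
descending_ages = [r.get("age") for r in sort_rows(rows, "age", ascending=False)]
```

Under these assumptions the ascending result starts with the two rows that have no age, and the descending result ends with them.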

3. Usage Examples

  • Limit results to top 5 oldest employees
    source=employees | sort 5 -age

  • Sort logs by source IP address treated as IP type, then by response_time treated as numeric
    source=logs | sort ip(source_ip), +num(response_time)

  • Sort employees by department and age, with both fields in descending order
    source=employees | sort department, age desc
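
As a rough illustration of how the proposed syntax breaks down, the sketch below tokenizes the argument string of the three examples above. It is a regex-based Python approximation, not the actual PPL grammar: `parse_sort`, `SortField`, and `ParsedSort` are hypothetical names, field names are assumed to be plain identifiers, and a trailing `desc` or `d` is treated as the global modifier only when it is not preceded by a comma.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SortField:
    name: str
    ascending: bool = True  # '+' or no prefix
    cast: str = "auto"      # auto | str | num | ip


@dataclass
class ParsedSort:
    count: Optional[int]    # None when no count was given
    fields: List[SortField]
    reverse_all: bool       # True when the trailing desc|d modifier is present


# [+|-] followed by either cast(field) or a bare field name.
_FIELD = re.compile(r"^([+-]?)\s*(?:(auto|str|num|ip)\((\w+)\)|(\w+))$")


def parse_sort(args: str) -> ParsedSort:
    text = args.strip()

    # Optional leading integer count.
    count = None
    m = re.match(r"^(\d+)\s+(.+)$", text)
    if m:
        count, text = int(m.group(1)), m.group(2)

    # Optional trailing global modifier (not directly after a comma).
    reverse_all = False
    m = re.match(r"^(.*[^\s,])\s+(?:desc|d)$", text)
    if m:
        reverse_all, text = True, m.group(1)

    fields = []
    for token in text.split(","):
        fm = _FIELD.match(token.strip())
        if fm is None:
            raise ValueError(f"invalid sort field: {token.strip()!r}")
        sign, cast, cast_name, plain = fm.groups()
        fields.append(SortField(cast_name or plain, sign != "-", cast or "auto"))
    return ParsedSort(count, fields, reverse_all)


top5 = parse_sort("5 -age")
typed = parse_sort("ip(source_ip), +num(response_time)")
reversed_all = parse_sort("department, age desc")
```

The parser only records `reverse_all`; applying it to the field directions is left to a later step, in line with the Global Direction Reversal note under Implementation Details.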

4. Implementation Details

Count Limiting: Extend the existing Sort AST node to accept an optional count parameter, integrating with the existing limit functionality.
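
One way to picture this is a sort node that carries an optional count and is lowered to a sort step followed by a limit step. The Python sketch below is only a model of that idea: `SortNode`, `lower`, and `run` are hypothetical names, not the plugin's AST classes, and applying the default of 10000 as an explicit limit is an assumption based on the Arguments section.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

DEFAULT_COUNT = 10000  # default given for `count` in the Arguments section


@dataclass
class SortNode:
    keys: List[Tuple[str, bool]]  # (field name, ascending)
    count: Optional[int] = None   # None means "no count given"


def lower(node: SortNode) -> List[Tuple[str, Any]]:
    """Lower a sort-with-count node into a sort step plus a limit step."""
    limit = DEFAULT_COUNT if node.count is None else node.count
    return [("sort", node.keys), ("limit", limit)]


def run(steps: List[Tuple[str, Any]], rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    for op, arg in steps:
        if op == "sort":
            # Stable sorts applied from the last key to the first give a
            # hierarchical multi-key ordering.
            for name, ascending in reversed(arg):
                rows = sorted(rows, key=lambda r: r[name], reverse=not ascending)
        elif op == "limit":
            rows = rows[:arg]
    return rows


employees = [{"age": a} for a in (30, 55, 41, 62, 25, 48, 37)]
plan = lower(SortNode(keys=[("age", False)], count=5))  # models: sort 5 -age
oldest_five = [r["age"] for r in run(plan, employees)]
```

Keeping the limit as a separate step is what lets the count reuse whatever limit handling already exists, rather than teaching the sort itself to truncate.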

Type Specification: Generate Cast expressions during parsing to ensure fields are processed with the specified data types during sorting.
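
The effect of a cast on ordering can be shown with a small Python sketch. The `CASTS` table is a hypothetical stand-in for the generated Cast expressions, using Python's `float` and the standard-library `ipaddress` module in place of the engine's own types.

```python
import ipaddress

# Hypothetical mapping from the proposed type functions to sort-key casts.
CASTS = {
    "auto": lambda v: v,         # keep the value's natural type
    "str": str,                  # compare as text
    "num": float,                # compare as double
    "ip": ipaddress.ip_address,  # compare as IP address
}

ips = ["10.0.0.9", "10.0.0.10", "9.0.0.1"]
as_text = sorted(ips, key=CASTS["str"])  # character-by-character comparison
as_ip = sorted(ips, key=CASTS["ip"])     # numeric address comparison

sizes = ["100", "25", "9"]
sizes_as_text = sorted(sizes, key=CASTS["str"])
sizes_as_num = sorted(sizes, key=CASTS["num"])
```

Sorted as text, "10.0.0.10" comes before "10.0.0.9" and "100" comes before "25"; sorted with the `ip` and `num` casts, the values fall into their expected address and numeric order.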

Global Direction Reversal: Modify sort field direction arguments when the global modifier is present, allowing existing sort logic to handle the reversed directions.
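
A minimal Python sketch of that rewrite follows. `SortField` and `apply_global_modifier` are hypothetical names, and the sketch assumes the modifier flips each field's direction, so a field already written with `-` would become ascending; the RFC text does not spell out that case.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class SortField:
    name: str
    ascending: bool = True


def apply_global_modifier(fields: List[SortField], reverse_all: bool) -> List[SortField]:
    """Flip every field's direction when the trailing desc|d modifier is present."""
    if not reverse_all:
        return list(fields)
    # Assumption: "reverse" means flip, not "force descending".
    return [replace(f, ascending=not f.ascending) for f in fields]


# Models: sort department, age desc
flipped = apply_global_modifier([SortField("department"), SortField("age")], True)

# A mixed-direction list, to show the flip assumption.
mixed = apply_global_modifier([SortField("a", False), SortField("b", True)], True)
```

Because only the direction flags change, the downstream sort logic stays untouched and still sees an ordinary list of per-field directions.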

Labels: PPL (Piped processing language), calcite (calcite migration related), enhancement (New feature or request), v3.3.0
