Is your feature request related to a problem?
PPL currently lacks support for per_* aggregation functions (per_second, per_minute, per_hour, per_day). These functions calculate rate-based metrics by normalizing aggregated values to specific time units, converting raw counts into meaningful per-unit rates.
Without these functions, users cannot easily perform the rate calculations that are common in performance monitoring, such as packets per second or requests per minute, when using the timechart command.
What solution would you like?
Implement the four per_* aggregation functions in PPL:
per_second(<value>) - Returns values normalized to per-second rate
per_minute(<value>) - Returns values normalized to per-minute rate
per_hour(<value>) - Returns values normalized to per-hour rate
per_day(<value>) - Returns values normalized to per-day rate
These functions should work exclusively with the timechart command (due to implicit timestamp field dependency):
# Sample data
{"_time":"2025-09-08T10:00:00", "packets":10},
{"_time":"2025-09-08T10:00:05", "packets":60},
{"_time":"2025-09-08T10:00:30", "packets":20},
{"_time":"2025-09-08T10:00:50", "packets":30}
# Example 1
...
| eval _time=strptime(_time, "%Y-%m-%dT%H:%M:%S")
| timechart per_second(packets) span=1m
----------------------|--------------------
_time | per_second(packets)
----------------------|--------------------
2025-09-08T10:00:00 | 2 # (10+60+20+30)/60s
# Example 2
timechart per_second(packets) span=20s
----------------------|--------------------
_time | per_second(packets)
----------------------|--------------------
2025-09-08T10:00:00 | 3.5 # (10+60)/20s
2025-09-08T10:00:20 | 1 # (20)/20s
2025-09-08T10:00:40 | 1.5 # (30)/20s
# Example 3
timechart per_minute(packets), per_hour(packets), per_day(packets) span=1m
----------------------|--------------------|--------------------|--------------------
_time | per_minute(packets)| per_hour(packets) | per_day(packets)
----------------------|--------------------|--------------------|--------------------
2025-09-08T10:00:00 | 120 | 7200 | 172800
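The arithmetic behind the three examples above can be checked with a small simulation. This is only a sketch of the intended semantics, not the proposed implementation: the names `UNIT_SECONDS`, `SAMPLE`, and `per_unit` are hypothetical, timestamps are assumed to be UTC, and buckets are assumed to be aligned to multiples of the span.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Seconds in each target unit; keys mirror the proposed per_* function names.
UNIT_SECONDS = {"per_second": 1, "per_minute": 60, "per_hour": 3600, "per_day": 86400}

# The sample data from this issue as (timestamp, packets) pairs.
SAMPLE = [
    ("2025-09-08T10:00:00", 10),
    ("2025-09-08T10:00:05", 60),
    ("2025-09-08T10:00:30", 20),
    ("2025-09-08T10:00:50", 30),
]

FMT = "%Y-%m-%dT%H:%M:%S"


def per_unit(events, func, span_seconds):
    """Sum values per fixed-width bucket, then scale each sum to the target unit."""
    sums = defaultdict(int)
    for ts, value in events:
        # Assumption: timestamps are interpreted as UTC.
        epoch = int(datetime.strptime(ts, FMT).replace(tzinfo=timezone.utc).timestamp())
        bucket_start = epoch - epoch % span_seconds  # floor to the bucket boundary
        sums[bucket_start] += value
    return {
        datetime.fromtimestamp(start, tz=timezone.utc).strftime(FMT):
            total * UNIT_SECONDS[func] / span_seconds
        for start, total in sorted(sums.items())
    }


print(per_unit(SAMPLE, "per_second", 20))
```

With `span_seconds=60` there is a single bucket whose sum is 120, so the per-second rate is 120 / 60 and the per-hour rate is 120 * 3600 / 60, matching Examples 1 and 3.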
What alternatives have you considered?
- Manual calculation: Users can manually divide aggregated values by time span, but this requires knowledge of time conversion factors.
Do you have any additional context?
Implementation approaches
- Short-term solution: Implement rewriting for fixed-width buckets
- Currently, PPL timechart only supports the span option, which is a fixed interval
- We can simply transform per_* functions into mathematical formulas at compile time
# Example
... | timechart per_second(packets) span=1m
=>
SELECT SUM(packets) / 60
...
GROUP BY SPAN(@timestamp, 1m)
- Long-term solutions [TBD]
- The primary challenge lies in dynamic bucketing behavior in bin-options.
- Option 1: Output bucket bounds from the bucketing function, similar to the window function in Spark SQL
- Option 2: Compute the bucket width dynamically, using the LEAD window function to determine the next bucket's start time
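The short-term rewrite for fixed-width buckets could look roughly like the sketch below, which turns a per_* call plus a span into the equivalent aggregate expression. The helper names (`span_to_seconds`, `rewrite_per_function`) are hypothetical, and the span parser is a simplifying assumption that only handles the s, m, h, and d units. Whether the resulting division is integer or floating point depends on the execution engine, which this sketch does not address.

```python
import re

# Seconds in each target unit; keys mirror the proposed per_* function names.
UNIT_SECONDS = {"per_second": 1, "per_minute": 60, "per_hour": 3600, "per_day": 86400}

# Assumption: only these span units are handled in this sketch.
SPAN_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def span_to_seconds(span):
    """Convert a fixed span literal such as '1m' or '20s' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", span)
    if match is None:
        raise ValueError(f"unsupported span: {span!r}")
    return int(match.group(1)) * SPAN_UNIT_SECONDS[match.group(2)]


def rewrite_per_function(func, field, span):
    """Return the aggregate expression that a per_* call is rewritten to."""
    unit = UNIT_SECONDS[func]
    width = span_to_seconds(span)
    if unit == 1:
        # per_second needs no scaling, only division by the bucket width.
        return f"SUM({field}) / {width}"
    return f"SUM({field}) * {unit} / {width}"


print(rewrite_per_function("per_second", "packets", "1m"))  # SUM(packets) / 60
print(rewrite_per_function("per_hour", "packets", "1m"))    # SUM(packets) * 3600 / 60
```

Because the span is known at compile time, the scaling factor is a constant, which is why this approach does not carry over to the dynamic bucketing case covered by the long-term options.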