Problem Statements
In observability, users often ask top-N questions to find a root cause. For example, a user wants to find the 5 hosts with the highest CPU utilization over the last two weeks and visualize the CPU utilization of those hosts for each day. Currently, PPL falls short of this in two ways:
- No single command. The query
stats max(cpu) maxcpu by host | sort 5 maxcpu
could return the 5 hosts with the highest CPU utilization. To visualize each day's maximum CPU utilization for these 5 hosts, the user must compose a second query, e.g.
where host in (h1, h2, h3, h4, h5) | stats max(cpu) maxcpu by span(timestamp, 1d) host
- Response format. PPL currently returns results in row format, but visualization libraries prefer series data. With series data, a visualization library can use the first column as the x-axis and the remaining columns as series.
# row format
day host cpu
1 h1 30
1 h2 20
2 h1 40
2 h2 30
# series format
day h1 h2
1 30 20
2 40 30
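The conversion from row format to series format is essentially a pivot. The following Python sketch illustrates the idea only; it is not part of PPL, and the function name rows_to_series and its parameters are hypothetical. The input rows are chosen so that the result matches the series-format table above.

```python
from collections import OrderedDict


def rows_to_series(rows, x_field, series_field, value_field):
    """Pivot row-format records into series format.

    Returns (header, table): header is [x_field, series1, series2, ...] and
    each table row is [x_value, value_for_series1, value_for_series2, ...].
    Combinations that never occur in the input are filled with None.
    """
    series_names = []     # distinct series values, in first-seen order
    by_x = OrderedDict()  # x value -> {series name: value}
    for row in rows:
        name = row[series_field]
        if name not in series_names:
            series_names.append(name)
        by_x.setdefault(row[x_field], {})[name] = row[value_field]

    header = [x_field] + series_names
    table = [[x] + [cells.get(name) for name in series_names]
             for x, cells in by_x.items()]
    return header, table


# Hypothetical sample rows in row format.
rows = [
    {"day": 1, "host": "h1", "cpu": 30},
    {"day": 1, "host": "h2", "cpu": 20},
    {"day": 2, "host": "h1", "cpu": 40},
    {"day": 2, "host": "h2", "cpu": 30},
]
header, table = rows_to_series(rows, "day", "host", "cpu")
print(header)
for line in table:
    print(line)
```

The first column of each output row can be fed to a chart as the x-axis value, and each remaining column as one data series.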
Requirements
Chart Command
Overview
Create a chart with a corresponding table of statistics.
Syntax
chart <aggregation> OVER <row-split> [BY <column-split> [limit=(top|bottom)<int>]]
- aggregation: A statistical aggregation function.
- limit: Specifies the maximum number of distinct values of the column-split field to return. Default: top 5.
- row-split: The field that you specify becomes the first column in the results table. The field values become the row labels in the results table. In a chart, the field name is used to label the X-axis. The field values become the X-axis values.
- column-split: Specifies a field to use as the columns in the results table. By default, when the results are visualized, the columns become the data series in the chart.
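As an illustration of the limit option, the Python sketch below parses the form given in the Syntax line, limit=(top|bottom)<int>, and falls back to the stated default of top 5 when the option is absent. The helper name parse_limit is hypothetical, and the sketch says nothing about how a real PPL parser would be implemented.

```python
import re

# Matches the option form from the Syntax line: limit=(top|bottom)<int>
_LIMIT_RE = re.compile(r"limit=(top|bottom)(\d+)")


def parse_limit(option=None):
    """Return (direction, count) for a limit option string.

    When no option is given, return the documented default: top 5.
    """
    if option is None:
        return ("top", 5)
    match = _LIMIT_RE.fullmatch(option.strip())
    if match is None:
        raise ValueError(f"invalid limit option: {option!r}")
    return (match.group(1), int(match.group(2)))


print(parse_limit())                 # default when the option is omitted
print(parse_limit("limit=bottom3"))  # explicit direction and count
```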
Example
- Example 1. For each response status, visualize the request count of the top 2 hosts. Top 2 means the 2 hosts with the highest request count over the queried period.
Note: One aggregation, one row-split field, one column-split field.
chart sum(req) OVER status BY host limit=top2
# sample response
status host1 host2
200 100 200
404 10 20
503 1 2
- Example 2. For each day, visualize the CPU utilization of the top 2 hosts. Top 2 means the 2 hosts with the highest CPU utilization over the queried period.
Note: One aggregation, one span row-split field, one column-split field.
chart max(cpu) OVER span(timestamp, 1d) BY host limit=top2
# sample response
timestamp host3 host4
Day1 30 20
Day2 20 70
Day3 50 50
- Example 3. For each day, visualize the CPU utilization and request count of the top 2 hosts. Top 2 means the 2 hosts with the highest CPU utilization over the queried period.
Note: Two aggregations, one span row-split field, one column-split field.
chart max(cpu), sum(req) OVER span(timestamp, 1d) BY host limit=top2
# sample response
max(cpu)
timestamp host3 host4
Day1 30 20
Day2 20 70
Day3 50 50
sum(req)
timestamp host3 host4
Day1 200 100
Day2 100 900
Day3 800 500
- Example 4. For each response status, visualize the total request count.
Note: One aggregation, one row-split field, no column-split field.
chart sum(req) OVER status
# sample response
status sum
200 300
404 30
503 3
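The examples above can be emulated with a small Python sketch. It rests on one reading of the proposal: column-split values are ranked by the aggregation computed over the whole queried period, the top or bottom N are kept, and the result is emitted in series format. The function chart, its parameters, the alphabetical ordering of the kept columns, and the host3 sample rows are all assumptions made for illustration; none of this is the actual PPL implementation.

```python
from collections import defaultdict


def chart(rows, agg, value_field, row_split, column_split, limit=("top", 5)):
    """Emulate: chart <agg>(<value_field>) OVER <row_split> BY <column_split>.

    agg is a function over a list of values, e.g. the built-ins sum or max.
    limit is a (direction, count) pair, direction being "top" or "bottom".
    """
    direction, count = limit

    # 1. Rank column-split values by the aggregate over the whole input.
    per_column = defaultdict(list)
    for row in rows:
        per_column[row[column_split]].append(row[value_field])
    ranked = sorted(per_column, key=lambda c: agg(per_column[c]),
                    reverse=(direction == "top"))
    # Assumption: kept columns are listed alphabetically for stable output.
    kept = sorted(ranked[:count])

    # 2. Collect values per (row-split, column-split) cell for kept columns.
    cells = defaultdict(list)
    row_keys = []  # row-split values in first-seen order
    for row in rows:
        if row[column_split] in kept:
            key = row[row_split]
            if key not in row_keys:
                row_keys.append(key)
            cells[(key, row[column_split])].append(row[value_field])

    # 3. Emit series format: first column is the row-split value.
    header = [row_split] + kept
    table = [[r] + [agg(cells[(r, c)]) if (r, c) in cells else None
                    for c in kept]
             for r in row_keys]
    return header, table


# Hypothetical input; host3 has the fewest requests and is dropped by top 2.
rows = [
    {"status": 200, "host": "host1", "req": 100},
    {"status": 200, "host": "host2", "req": 200},
    {"status": 200, "host": "host3", "req": 5},
    {"status": 404, "host": "host1", "req": 10},
    {"status": 404, "host": "host2", "req": 20},
    {"status": 503, "host": "host1", "req": 1},
    {"status": 503, "host": "host2", "req": 2},
]
header, table = chart(rows, sum, "req", "status", "host", limit=("top", 2))
print(header)
for line in table:
    print(line)
```

With this input the output reproduces the sample response of Example 1. Ranking over the whole period, rather than per row, keeps the set of columns identical across all rows, which is what a chart with fixed series needs.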