[data] Task metric improvements #55429
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -460,97 +460,41 @@ | |
| ), | ||
| Panel( | ||
| id=38, | ||
| title="(p00) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| title="Task Completion Time", | ||
| description="Time spent running tasks to completion w/ backpressure.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p00) Completion Time: {{dataset}}, {{operator}}", | ||
| expr="increase(ray_data_task_completion_time{{{global_filters}}}[5m]) / increase(ray_data_num_tasks_finished{{{global_filters}}}[5m])", | ||
| legend="Task Completion Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=39, | ||
| title="(p05) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| title="Task Output Backpressure Time", | ||
| description="Time spent in output backpressure.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0.05, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
|
Member
With the previous implementation of
Contributor
Author
Oh, this part is a bug. I think Alexey renamed the previous and current (I didn't change it in this PR) implementation currently sends a running mean metric |
||
| legend="(p05) Completion Time: {{dataset}}, {{operator}}", | ||
| expr="increase(ray_data_task_output_backpressure_time{{{global_filters}}}[5m]) / increase(ray_data_num_tasks_finished{{{global_filters}}}[5m])", | ||
| legend="Task Output Backpressure Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=40, | ||
| title="(p50) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| title="Task Completion Time Without Backpressure", | ||
| description="Time spent running tasks to completion w/o backpressure.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0.50, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p50) Completion Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=41, | ||
| title="(p75) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0.75, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p75) Completion Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=42, | ||
| title="(p90) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0.9, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p90) Completion Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=44, | ||
| title="p(99) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(0.99, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p99) Completion Time: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
| stack=False, | ||
| ), | ||
| Panel( | ||
| id=45, | ||
| title="p(100) Task Completion Time", | ||
| description="Time spent running tasks to completion.", | ||
| unit="seconds", | ||
| targets=[ | ||
| Target( | ||
| expr="histogram_quantile(1, sum by (dataset, operator, le) (rate(ray_data_task_completion_time_bucket{{{global_filters}}}[5m])))", | ||
| legend="(p100) Completion Time: {{dataset}}, {{operator}}", | ||
| expr="increase(ray_data_task_completion_time_without_backpressure{{{global_filters}}}[5m]) / increase(ray_data_num_tasks_finished{{{global_filters}}}[5m])", | ||
| legend="Task Completion Time w/o Backpressure: {{dataset}}, {{operator}}", | ||
| ), | ||
| ], | ||
| fill=0, | ||
|
|
||
qq, where is the pxx filter added?
We ultimately decided to remove it because Ray Data percentiles are, in general, calculated incorrectly. We will revisit it later, though; if you're curious, the PR for that is https://github.com/iamjustinhsu/ray/pull/1/files#diff-0633fe346cc983e2e5518eba4b75f4067735898a304812be9b18d5652fc7e00dR134-R179
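
For context on the new panel expressions: each one divides the `increase(...)` of a cumulative time metric by the `increase(...)` of `ray_data_num_tasks_finished` over the same 5-minute window, which gives a mean time per finished task rather than a percentile. The sketch below illustrates that arithmetic in plain Python. It assumes both series are monotonically increasing cumulative totals, which may not match what is currently emitted (see the running-mean remark earlier in the thread). The `increase` and `mean_time_per_task` helpers and the sample data are hypothetical, and the simplified `increase` ignores the counter-reset handling and extrapolation that Prometheus performs.

```python
from typing import Optional, Sequence, Tuple

# One scraped sample: (timestamp in seconds, cumulative value).
Sample = Tuple[float, float]


def increase(samples: Sequence[Sample], start: float, end: float) -> float:
    """Simplified stand-in for PromQL increase(): last minus first value in the window.

    Assumption: no counter resets and no extrapolation, unlike real Prometheus.
    """
    in_window = [value for ts, value in samples if start <= ts <= end]
    if len(in_window) < 2:
        return 0.0
    return in_window[-1] - in_window[0]


def mean_time_per_task(
    total_time: Sequence[Sample],
    tasks_finished: Sequence[Sample],
    start: float,
    end: float,
) -> Optional[float]:
    """Mean seconds per finished task in [start, end], or None if no task finished."""
    finished = increase(tasks_finished, start, end)
    if finished == 0:
        # PromQL would produce NaN or Inf here; returning None is a choice made for this sketch.
        return None
    return increase(total_time, start, end) / finished


# Hypothetical cumulative series for one (dataset, operator) pair.
total_time = [(0, 0.0), (60, 8.0), (120, 30.0), (300, 90.0)]
tasks_finished = [(0, 0.0), (60, 4.0), (120, 10.0), (300, 30.0)]

first_minute = mean_time_per_task(total_time, tasks_finished, 0, 60)
five_minutes = mean_time_per_task(total_time, tasks_finished, 0, 300)
print(first_minute, five_minutes)
```

Because the result is a ratio of two window deltas, it reflects only tasks that finished inside the window, so a slow start does not keep skewing later windows the way a lifetime running mean would.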