Description
In Dask's dashboards we have good live tracking of many system resources including: CPU Utilization, Memory, Bandwidth, Num FDs, GPU Utilization, and GPU Memory:
[Image: Examples of Resource Tracking]
Sometimes, however, we want to ask questions of a workflow, such as: what was the peak memory usage? What was the CPU/GPU utilization like over the course of a query? To help answer these and other questions, I think it would be beneficial for this information to be included in performance reports. I'm not sure how best to build this. Perhaps we could start with a generic timeseries tracking plot that samples a resource. My first thought for attacking this problem was to use a scheduler plugin for timeseries tracking that creates a periodic callback to sample a particular resource.
We could also use TaskState metadata to store resource usage information.
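To make the periodic sampling idea above a bit more concrete, here is a minimal sketch of the timeseries part only. `ResourceTimeseries` and `total_worker_metric` are hypothetical names, not existing Dask APIs, and the sketch assumes that the scheduler exposes a `workers` mapping whose values carry a `metrics` dict with keys such as `"memory"`. A `SimpleNamespace` stands in for the scheduler so the example runs on its own.

```python
import time
from collections import deque
from types import SimpleNamespace


class ResourceTimeseries:
    """Keep a bounded history of (timestamp, value) samples for one resource.

    Hypothetical helper for illustration; not part of distributed.
    """

    def __init__(self, read, maxlen=10_000, clock=time.time):
        self.read = read      # zero-argument callable returning a number
        self.clock = clock    # injectable clock, handy for testing
        self.samples = deque(maxlen=maxlen)  # oldest samples drop off first

    def sample(self):
        # Intended to be invoked repeatedly, e.g. by a periodic callback.
        self.samples.append((self.clock(), self.read()))

    def peak(self):
        return max(v for _, v in self.samples) if self.samples else None

    def mean(self):
        if not self.samples:
            return None
        return sum(v for _, v in self.samples) / len(self.samples)


def total_worker_metric(scheduler, name):
    """Sum one metric over all workers.

    Assumes each worker state has a ``metrics`` dict (an assumption here).
    """
    return sum(ws.metrics.get(name, 0) for ws in scheduler.workers.values())


# Stand-in for a real scheduler object, used only to exercise the sketch.
fake_scheduler = SimpleNamespace(
    workers={
        "w1": SimpleNamespace(metrics={"memory": 100, "cpu": 20.0}),
        "w2": SimpleNamespace(metrics={"memory": 300, "cpu": 60.0}),
    }
)

memory = ResourceTimeseries(lambda: total_worker_metric(fake_scheduler, "memory"))
memory.sample()                                        # first reading
fake_scheduler.workers["w1"].metrics["memory"] = 500   # simulate growth
memory.sample()                                        # second reading
print(memory.peak(), memory.mean())
```

In an actual plugin, `sample` would presumably be registered as a periodic callback when the plugin starts, and the collected samples handed to the performance report. That wiring is not shown here because the right hook for it is exactly the open question in this issue.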
Here is a list of endpoints the scheduler is aware of:

`distributed/distributed/dashboard/scheduler.py`, lines 82 to 92 in 7d2a22f:

```python
"/individual-nbytes": individual_nbytes_doc,
"/individual-cpu": individual_cpu_doc,
"/individual-nprocessing": individual_nprocessing_doc,
"/individual-workers": individual_workers_doc,
"/individual-bandwidth-types": individual_bandwidth_types_doc,
"/individual-bandwidth-workers": individual_bandwidth_workers_doc,
"/individual-memory-by-key": individual_memory_by_key_doc,
"/individual-compute-time-per-key": individual_compute_time_per_key_doc,
"/individual-aggregate-time-per-action": individual_aggregate_time_per_action_doc,
"/individual-gpu-memory": gpu_memory_doc,
"/individual-gpu-utilization": gpu_utilization_doc,
```

