Timeseries Resource Usage Tracking #4494

@quasiben

Dask's dashboards give us good live tracking of many system resources, including CPU utilization, memory, bandwidth, number of file descriptors, GPU utilization, and GPU memory:

Examples of Resource Tracking: [two dashboard screenshots, taken 2021-02-09]

Sometimes, however, we want to ask questions of a workflow, such as: what was the peak memory usage? What was the CPU/GPU utilization like over the course of a query? To answer these and other questions, I think it would be beneficial for this information to be included in performance reports. I'm not sure how best to build this. Perhaps we could start with a generic timeseries tracking plot that samples a resource. My first thought for attacking this problem was a scheduler plugin for timeseries tracking that creates a periodic callback to sample a particular resource.
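
To make the sampling idea concrete, here is a minimal sketch of the bookkeeping side only. It is not Dask code: `ResourceTimeseries`, its `samplers` argument, and the `peak`/`mean` helpers are hypothetical names made up for illustration. The assumption is that something like a periodic callback registered by a scheduler plugin would call `sample()` at a fixed interval; that wiring is left out so the example stays self-contained.

```python
import time
from collections import deque
from typing import Callable, Deque, Dict, Optional, Tuple


class ResourceTimeseries:
    """Collect (timestamp, value) samples for named resources.

    Hypothetical helper for illustration; not part of Dask.
    """

    def __init__(
        self,
        samplers: Dict[str, Callable[[], float]],
        maxlen: int = 10_000,
        clock: Callable[[], float] = time.time,
    ):
        # Each sampler is a zero-argument callable returning the current
        # reading for one resource (e.g. total memory in bytes).
        self.samplers = samplers
        self.clock = clock
        # Bounded buffers so a long-running cluster does not grow without limit.
        self.series: Dict[str, Deque[Tuple[float, float]]] = {
            name: deque(maxlen=maxlen) for name in samplers
        }

    def sample(self) -> None:
        """Record one reading per resource; meant to be called periodically."""
        now = self.clock()
        for name, read in self.samplers.items():
            self.series[name].append((now, read()))

    def peak(self, name: str) -> Optional[float]:
        """Largest recorded value, or None if nothing has been sampled yet."""
        values = [value for _, value in self.series[name]]
        return max(values) if values else None

    def mean(self, name: str) -> Optional[float]:
        """Average of the recorded values, or None if there are none."""
        values = [value for _, value in self.series[name]]
        return sum(values) / len(values) if values else None


# Usage with fake readings standing in for real scheduler metrics.
fake_memory = iter([10.0, 30.0, 20.0])
tracker = ResourceTimeseries({"memory": lambda: next(fake_memory)})
for _ in range(3):
    tracker.sample()
print(tracker.peak("memory"), tracker.mean("memory"))
```

Keeping the raw `(timestamp, value)` pairs rather than only running aggregates would let a performance report both plot the series and answer questions like "what was the peak memory usage?" after the fact.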

We could also use TaskState metadata to store resource usage information.

Here is a list of endpoints the scheduler is aware of:

"/individual-nbytes": individual_nbytes_doc,
"/individual-cpu": individual_cpu_doc,
"/individual-nprocessing": individual_nprocessing_doc,
"/individual-workers": individual_workers_doc,
"/individual-bandwidth-types": individual_bandwidth_types_doc,
"/individual-bandwidth-workers": individual_bandwidth_workers_doc,
"/individual-memory-by-key": individual_memory_by_key_doc,
"/individual-compute-time-per-key": individual_compute_time_per_key_doc,
"/individual-aggregate-time-per-action": individual_aggregate_time_per_action_doc,
"/individual-gpu-memory": gpu_memory_doc,
"/individual-gpu-utilization": gpu_utilization_doc,
