
Commit 2f7281c

[Feature][Metrics] Add resource download related metrics for workers (#10749)
* [Feature][Metrics] Add resource download related metrics for workers (#9324)
* [Feature][Metrics] Fix bugs and add grafana demos for worker resource download metrics (#9324)
* [Feature][Metrics] Add docs to resource related metrics (#9324)
* [Feature][Metrics] Use tags to indicate status in metrics (#9324)
* [Feature][Metrics] Fix demos, docs and remove redundant code (#9324)
* [Feature][Metrics] Remove .pnpm-debug.log (#9324)
* [Feature][Metrics] Fix style check (#9324)
* [Feature][Metrics] Replace KB with bytes for the unit of resource file size in metrics (#9324)
* [Feature][Metrics] Make code neat (#9324)
1 parent 56fe11e commit 2f7281c

5 files changed

Lines changed: 460 additions & 14 deletions


docs/docs/en/guide/metrics/metrics.md

Lines changed: 4 additions & 1 deletion
@@ -74,7 +74,7 @@ For example, you can get the master metrics by `curl http://localhost:5679/actua
 - ds.task.execution.count.by.type: (counter) the number of task executions grouped by tag `task_type`
 - ds.task.running: (gauge) the number of running tasks
 - ds.task.prepared: (gauge) the number of tasks prepared for task queue
-- ds.task.execution.count: (histogram) the number of executed tasks
+- ds.task.execution.count: (counter) the number of executed tasks
 - ds.task.execution.duration: (histogram) duration of task executions
 

@@ -103,6 +103,9 @@ For example, you can get the master metrics by `curl http://localhost:5679/actua
 
 - ds.worker.overload.count: (counter) the number of times the worker overloaded
 - ds.worker.full.submit.queue.count: (counter) the number of times the worker's submit queue being full
+- ds.worker.resource.download.count: (counter) the number of downloaded resource files on workers, sliced by tag `status`
+- ds.worker.resource.download.duration: (histogram) the time cost of resource download on workers
+- ds.worker.resource.download.size: (histogram) the sizes of downloaded resource files on workers (bytes)
 
 ### Api Server Metrics

docs/docs/zh/guide/metrics/metrics.md

Lines changed: 3 additions & 0 deletions
@@ -104,6 +104,9 @@ the metrics exporter port `server.port` is defined in application.yaml: master: `
 
 - ds.worker.overload.count: (counter) the number of times the worker was overloaded
 - ds.worker.full.submit.queue.count: (counter) the number of times the worker's submit queue was full
+- ds.worker.resource.download.count: (counter) the number of resource file downloads on workers, which can be sliced by the `status` tag
+- ds.worker.resource.download.duration: (histogram) the distribution of time spent downloading resource files on workers
+- ds.worker.resource.download.size: (histogram) the distribution of sizes of resource files downloaded by workers (bytes)
 
 ### Api Server Metrics
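
Note on reading the new `status`-tagged counter: the following is a minimal sketch, not part of this commit, of how `ds.worker.resource.download.count` could be split by tag from a scrape of the worker's metrics endpoint. It assumes the endpoint returns Prometheus text format and that the metric is rendered as `ds_worker_resource_download_count_total` (the usual Micrometer naming for a counter). The sample text, the tag values `success` and `fail`, the numbers, and the helper `count_by_tag` are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical scrape output. The metric name assumes Micrometer's usual
# Prometheus naming (dots become underscores, counters get a "_total" suffix);
# the tag values and numbers are made up for illustration.
SAMPLE = '''\
# HELP ds_worker_resource_download_count_total
# TYPE ds_worker_resource_download_count_total counter
ds_worker_resource_download_count_total{status="success",} 12.0
ds_worker_resource_download_count_total{status="fail",} 3.0
ds_worker_overload_count_total 1.0
'''

# One sample line: metric name, optional {labels}, then the value.
LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def count_by_tag(text, metric, tag):
    """Sum the samples of `metric`, grouped by the value of label `tag`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        m = LINE.match(line)
        if m is None or m.group('name') != metric:
            continue
        labels = dict(LABEL.findall(m.group('labels') or ''))
        # Samples without the tag are grouped under the empty string.
        totals[labels.get(tag, '')] += float(m.group('value'))
    return dict(totals)


downloads = count_by_tag(SAMPLE, 'ds_worker_resource_download_count_total', 'status')
print(downloads)
```

Keeping the status as a tag on one counter, rather than as separate success and failure metrics, means a single series name can be filtered or summed per status on the dashboard side.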
