perf: replace gopsutil sampling with cached metrics and optimize system info retrieval #11079

HynoR · 2025-11-26T06:51:04Z

What this PR does / why we need it?

#11077 #11078

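The current method of getting CPU usage sleeps for 100 ms on every call, which is slow.
Much of the data almost never changes, yet every call fetches it through I/O.
The systemproxy value never changes.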

Summary of your change

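Implemented our own CPU library that caches the previous cumulative CPU time. Each call computes CPU usage from the difference between the current reading and the cached one, instead of having gopsutil sleep and sample from scratch.
Wrapped the gopsutil package in a psutil package that caches slowly changing information, so later reads come straight from memory with no I/O.
Fixed the bug where systemproxy never changed.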

Time per call after optimization: 100 ms -> roughly 1 μs
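
As an illustration of the delta approach described above, here is a minimal sketch. It is not the code from this PR: the `cpuTimes` and `deltaSampler` names are hypothetical, and the cumulative counters are injected through a `read` function (in practice they might be parsed from `/proc/stat`) so the example stays self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// cpuTimes holds cumulative CPU counters (hypothetical type for illustration).
type cpuTimes struct {
	Busy  float64 // cumulative non-idle time
	Total float64 // cumulative busy + idle time
}

// deltaSampler caches the previous cumulative reading so that each call
// needs only one read and a subtraction, with no sleep.
type deltaSampler struct {
	mu   sync.Mutex
	read func() cpuTimes // source of cumulative counters (assumed, injected)
	last cpuTimes
	ok   bool // true once last holds a valid baseline
}

// Percent returns CPU usage (0-100) since the previous call.
// The first call has no baseline, so it returns 0 and only primes the cache.
func (s *deltaSampler) Percent() float64 {
	s.mu.Lock()
	defer s.mu.Unlock()

	cur := s.read()
	prev, had := s.last, s.ok
	s.last, s.ok = cur, true
	if !had {
		return 0
	}
	dTotal := cur.Total - prev.Total
	if dTotal <= 0 {
		return 0 // no time elapsed between readings
	}
	pct := (cur.Busy - prev.Busy) / dTotal * 100
	if pct < 0 {
		return 0
	}
	if pct > 100 {
		return 100
	}
	return pct
}

func main() {
	readings := []cpuTimes{{Busy: 100, Total: 1000}, {Busy: 150, Total: 1200}}
	i := 0
	s := &deltaSampler{read: func() cpuTimes { r := readings[i]; i++; return r }}
	s.Percent()                       // primes the cache
	fmt.Printf("%.1f\n", s.Percent()) // prints 25.0
}
```

One consequence of this design is that the reported usage covers the interval between two calls, so the measurement window depends on how often the caller polls.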

Please indicate you've done the following:

Made sure tests are passing and test coverage is added if needed.
Made sure commit messages follow the rules of the Conventional Commits specification.
Considered the docs impact and opened a new docs issue or PR with docs changes if needed.

copilot summary

This pull request refactors system resource monitoring in the agent by introducing a new psutil utility package that provides cached and optimized access to CPU, disk, and host information. The changes replace direct usage of gopsutil functions throughout the service layer with calls to the new psutil abstractions, improving performance and consistency in resource data retrieval. Key updates include changes to CPU and disk usage calculations, host info loading, and dashboard data aggregation.

System resource access refactoring:

Introduced the new agent/utils/psutil package, which provides thread-safe, cached access to CPU, disk, and host information through global state objects (CPU, CPUInfo, DISK, HOST).
Refactored all usages of gopsutil functions in agent/app/service/dashboard.go, agent/app/service/alert_helper.go, and agent/app/service/monitor.go to use the new psutil methods for CPU core counts, CPU usage, disk usage, and host info.

Dashboard and monitoring logic improvements:

Updated dashboard and node info loading logic to use cached resource values and improved CPU usage sampling, including per-core statistics.
Enhanced disk info loading to perform usage sampling in a goroutine with timeout and cache results, increasing reliability and responsiveness.

Other enhancements:

Improved system proxy detection logic in dashboard service using the new cmp.Or function for environment variable selection.
Ensured consistent error handling and logging for resource sampling failures across services.
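
For reference, `cmp.Or` (standard library, Go 1.22+) returns its first non-zero argument, which makes it a compact way to pick the first environment variable that is set. The sketch below is an assumption about how that could be used: the `firstProxy` helper, the variable names, and their order are illustrative and not taken from the PR.

```go
package main

import (
	"cmp"
	"fmt"
	"os"
)

// firstProxy returns the first non-empty proxy setting.
// The variable names and their priority are assumptions for illustration.
func firstProxy(getenv func(string) string) string {
	return cmp.Or(
		getenv("HTTPS_PROXY"),
		getenv("https_proxy"),
		getenv("HTTP_PROXY"),
		getenv("http_proxy"),
	)
}

func main() {
	// Real usage reads the process environment on every call.
	fmt.Println(firstProxy(os.Getenv))

	// A map-backed lookup shows the selection order deterministically.
	env := map[string]string{"http_proxy": "http://127.0.0.1:7890"}
	fmt.Println(firstProxy(func(k string) string { return env[k] })) // prints http://127.0.0.1:7890
}
```

Reading the environment on each call, rather than once at startup, is one way to avoid a value that never changes.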

f2c-ci-robot · 2025-11-26T06:51:08Z

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

f2c-ci-robot · 2025-11-26T06:51:13Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign wanghe-fit2cloud for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.


Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

lan-yonghui · 2025-11-26T08:30:46Z

/lgtm

refactor: change psutil method

e98d9db

f2c-ci-robot bot added the do-not-merge/release-note-label-needed label Nov 26, 2025

f2c-ci-robot bot requested review from wanghe-fit2cloud and zhengkunwang223 November 26, 2025 06:51

clean up

c897cc9

f2c-ci-robot bot assigned lan-yonghui Nov 26, 2025

f2c-ci-robot bot added the lgtm label Nov 26, 2025

wanghe-fit2cloud merged commit 0a42d49 into 1Panel-dev:dev-v2 Nov 26, 2025
0 of 3 checks passed

3 participants
