Feature to monitor client process system usage #617

Merged
seanses merged 22 commits into main from di/monitor-system-usage
Feb 27, 2026
@seanses (Collaborator) commented Jan 28, 2026

Introduces a client benchmark utility that tracks the system resource usage (CPU, memory, disk I/O, and network I/O) of a process, so we no longer need to write OS-specific scripts to capture usage stats. This is especially helpful when benchmarking on Python notebook instances such as Google Colab, where a system monitor is not easily accessible or running a separate monitoring script is impractical.

Usage

Enable monitoring by setting HF_XET_SYSTEM_MONITOR_ENABLED to true, and set the sampling interval with HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL. By default, the metrics are written to the tracing stream at INFO level. To redirect them to a separate file, set the sample log path with HF_XET_SYSTEM_MONITOR_LOG_PATH. Example:

Screenshot 2026-02-10 at 6 21 51 PM
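
In a notebook, the same variables can be set from Python. The sketch below is illustrative only: `monitor_env` is a hypothetical helper, the log path is a placeholder, and the `"5s"` interval is an assumed value format, since the accepted format for HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL is not stated in this PR. It also assumes the variables need to be set before the client library is first used.

```python
import os


def monitor_env(log_path=None, sample_interval=None, enabled=True):
    """Build the environment variables named in this PR (hypothetical helper)."""
    env = {"HF_XET_SYSTEM_MONITOR_ENABLED": "true" if enabled else "false"}
    if sample_interval is not None:
        # Assumption: the value format ("5s" below) is a guess, not confirmed here.
        env["HF_XET_SYSTEM_MONITOR_SAMPLE_INTERVAL"] = str(sample_interval)
    if log_path is not None:
        # Without this variable, metrics go to the tracing stream at INFO level.
        env["HF_XET_SYSTEM_MONITOR_LOG_PATH"] = str(log_path)
    return env


# Placeholder path and interval; adjust both for your setup.
os.environ.update(monitor_env(log_path="/tmp/xet_monitor.log", sample_interval="5s"))
```

Keeping the settings in one dictionary makes it easy to pass the same values to a subprocess via its `env` argument instead of changing the current process.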

Output

The stats are output in JSON format, which can be queried using tools like jq, e.g.

  1. Trace of peak memory usage: jq '.memory.peak_used_bytes' [HF_XET_SYSTEM_MONITOR_LOG_PATH]
  2. Trace of disk write speed: jq '.disk.average_write_speed' [HF_XET_SYSTEM_MONITOR_LOG_PATH]
  3. Trace of network receive speed: jq '.network.average_rx_speed' [HF_XET_SYSTEM_MONITOR_LOG_PATH]
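
Where jq is not installed (for example on a notebook instance), the same queries can be approximated in Python. This is a minimal sketch that assumes the log file holds one JSON object per line; `extract_field` is a hypothetical helper, and the sample records use made-up values with only the three field paths shown above.

```python
import json


def extract_field(lines, dotted_path):
    """Return the value at dotted_path for each JSON record (None if absent)."""
    keys = dotted_path.split(".")
    values = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        value = json.loads(line)
        for key in keys:
            # Walk one level down; a missing key yields None, like jq's null.
            value = value.get(key) if isinstance(value, dict) else None
        values.append(value)
    return values


# Made-up sample records; real values and any additional fields will differ.
sample_log = [
    '{"memory": {"peak_used_bytes": 1024}, "disk": {"average_write_speed": 10.0}, "network": {"average_rx_speed": 5.0}}',
    '{"memory": {"peak_used_bytes": 2048}, "disk": {"average_write_speed": 12.5}, "network": {"average_rx_speed": 7.5}}',
]

peak_memory = extract_field(sample_log, "memory.peak_used_bytes")
print(max(peak_memory))
```

To read a real log, pass an open file object instead of `sample_log`, since iterating over a text file yields its lines.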

@seanses seanses force-pushed the di/monitor-system-usage branch from 5bb5294 to f964ae3 Compare January 28, 2026 03:25
@seanses seanses mentioned this pull request Feb 10, 2026
@seanses seanses requested a review from hoytak February 11, 2026 02:51
@seanses seanses requested a review from hoytak February 12, 2026 20:36
@hoytak (Collaborator) left a comment

LGTM; some possible adjustments but these can be done later.

@seanses seanses merged commit c4111eb into main Feb 27, 2026
7 checks passed
@seanses seanses deleted the di/monitor-system-usage branch February 27, 2026 21:36
hoytak pushed a commit that referenced this pull request Feb 28, 2026