A terminal user interface (TUI) for monitoring Slurm clusters. Built with Textual for DGX H100 clusters.
⚠️ Development Notice: This project was implemented mainly with LLMs (Sonnet 4/GPT-4); it is incomplete and has bugs. Contributions are welcome, including major changes.
- Job monitoring with live updates
- Node status display with GPU availability
- GPU count display with partition info
- Script viewer with syntax highlighting
- Output tracking (stdout/stderr)
- Search and filtering
- Tabbed TUI interface
- Keyboard shortcuts
- gpustat-web integration for real-time GPU monitoring
# 1. Use uvx
$ uvx --from git+https://github.com/MilkClouds/smon.git smon
# 2. Use uv tool
$ uv tool install git+https://github.com/MilkClouds/smon.git
$ smon

$ pip install git+https://github.com/MilkClouds/smon.git
$ smon

smon
smon --help # Show help
smon --refresh 10 # Set refresh interval to 10 seconds
smon --user alice # Filter jobs by user
smon --partition gpu # Filter jobs by partition
smon --gpustat-web URL # Enable gpustat-web integration

| Key | Action |
|---|---|
| `q` | Quit application |
| `r` | Refresh data |
| `/` | Focus search input |
| `f` | Show filter status |
| `s` | Open script modal for selected job |
| `o` | Open output modal for selected job |
| `t` | Toggle real-time output refresh |
| `Ctrl+R` | Refresh output in current tab |

- Job information: JobID, User, State, Partition, Resources
- GPU/CPU/memory usage and timing
- Select job to view details, script, and output
- Shows script for selected job
- Bash syntax highlighting
- Modal view with `s` key
- stdout/stderr for selected jobs
- Real-time refresh toggle (`t`)
- Manual refresh (`Ctrl+R`)
- Node status and availability
- GPU/CPU/memory per node
- gpustat-web integration (side-by-side view)
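
The job fields listed above (JobID, User, State, Partition) can be read from `squeue` with a custom output format. The following is a minimal sketch, not smon's actual implementation: the `Job` dataclass and the `parse_squeue` and `fetch_jobs` helpers are hypothetical names chosen for illustration, and the `|` separator is an assumption.

```python
import subprocess
from dataclasses import dataclass
from typing import List

# squeue format codes: %i = job ID, %u = user, %T = state, %P = partition.
# The "|" separator is an arbitrary choice for this sketch.
SQUEUE_FORMAT = "%i|%u|%T|%P"


@dataclass
class Job:
    job_id: str
    user: str
    state: str
    partition: str


def parse_squeue(text: str) -> List[Job]:
    """Parse the output of `squeue -h -o "%i|%u|%T|%P"` into Job records."""
    jobs = []
    for line in text.splitlines():
        parts = line.strip().split("|")
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        jobs.append(Job(*parts))
    return jobs


def fetch_jobs() -> List[Job]:
    """Run squeue without a header line and parse it (requires Slurm on PATH)."""
    result = subprocess.run(
        ["squeue", "-h", "-o", SQUEUE_FORMAT],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_squeue(result.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to test the parsing on captured text from a machine without Slurm installed.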
smon can display real-time GPU status from gpustat-web alongside the Slurm node information.
- Make sure gpustat-web is running on your cluster (e.g., `http://10.50.0.111:48109/`).
- Run smon with the `--gpustat-web` option: `smon --gpustat-web http://10.50.0.111:48109/`
- Or add it to the config file (`~/.config/smon/config.json`): `{ "gpustat_web_url": "http://10.50.0.111:48109/" }`
The Nodes tab will show the Slurm node table on the left and live GPU status from gpustat-web on the right.
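
To illustrate how the two ways of setting the URL could fit together, here is a rough sketch that prefers the command-line value and falls back to the config file. This is not smon's own code: `resolve_gpustat_url` is a hypothetical helper, and the precedence (CLI over config) is an assumption. Only the file path and the `gpustat_web_url` key come from the text above.

```python
import json
from pathlib import Path
from typing import Optional

# Config location documented above.
DEFAULT_CONFIG_PATH = Path.home() / ".config" / "smon" / "config.json"


def resolve_gpustat_url(
    cli_url: Optional[str],
    config_path: Path = DEFAULT_CONFIG_PATH,
) -> Optional[str]:
    """Return the gpustat-web URL, or None if it is not configured.

    Assumed precedence: an explicit CLI value wins over the config file.
    """
    if cli_url:
        return cli_url
    try:
        config = json.loads(config_path.read_text())
    except (OSError, json.JSONDecodeError):
        return None  # missing or unreadable config: integration stays off
    if not isinstance(config, dict):
        return None
    return config.get("gpustat_web_url") or None
```

Returning `None` rather than raising on a missing file means the integration is simply left disabled when nothing is configured.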
- Python ≥ 3.11
- Slurm cluster with `squeue`, `sinfo`, and `scontrol` commands
- Terminal with color support
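
The Python and Slurm requirements above can be checked before installing. The snippet below is a small sketch for that purpose and is not part of smon: `missing_commands` and `python_ok` are hypothetical helper names, and the check does not cover terminal color support.

```python
import shutil
import sys

REQUIRED_COMMANDS = ("squeue", "sinfo", "scontrol")
MIN_PYTHON = (3, 11)


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the required commands that cannot be found on PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """Check a (major, minor) version against the minimum; defaults to the running interpreter."""
    version = sys.version_info[:2] if version is None else version
    return tuple(version) >= minimum


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.11 or newer is required")
    for cmd in missing_commands():
        print(f"Missing Slurm command: {cmd}")
```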
This TUI project is primarily implemented using LLM assistance (Sonnet 4/GPT-4) and is incomplete with known bugs. Contributions are welcome:
- Bug fixes
- Feature improvements
- Code refactoring
- Documentation
- Major changes
- Testing
Feel free to open issues or submit pull requests.