NordicHPC/sonar

Sonar

Sonar is a tool to profile usage of HPC resources by regularly sampling processes, jobs, accelerators, nodes, queues, and clusters.

Sonar examines /proc and /sys and/or runs diagnostic programs, filters and groups the information, and prints it to stdout, stores it in a local directory tree, or sends it to a remote collector.
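To make the sample, group, and print flow concrete, here is a minimal sketch in Python. It is an illustration only and not Sonar's implementation or output format: the function names (`parse_status`, `sample_processes`, `group_by_uid`) are hypothetical, and the choice to read `/proc/<pid>/status` and group resident memory by user ID is an assumption made for the example.

```python
import json
import os
from collections import defaultdict


def parse_status(text):
    """Extract name, pid, real uid, and resident memory (KiB)
    from the text of a Linux /proc/<pid>/status file."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {
        "name": fields.get("Name", ""),
        "pid": int(fields.get("Pid", "0")),
        # The Uid line lists real, effective, saved, and fs uids; take the real one.
        "uid": int(fields.get("Uid", "0").split()[0]),
        # Kernel threads have no VmRSS line; count them as 0 KiB.
        "rss_kib": int(fields.get("VmRSS", "0 kB").split()[0]),
    }


def sample_processes(proc_root="/proc"):
    """Take one sample of every process currently visible under proc_root."""
    samples = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                samples.append(parse_status(f.read()))
        except OSError:
            continue  # the process exited or is unreadable; skip it
    return samples


def group_by_uid(samples):
    """Group samples by user id, counting processes and summing resident memory."""
    groups = defaultdict(lambda: {"processes": 0, "rss_kib": 0})
    for sample in samples:
        group = groups[sample["uid"]]
        group["processes"] += 1
        group["rss_kib"] += sample["rss_kib"]
    return dict(groups)


if __name__ == "__main__":
    # Print the grouped sample to stdout as JSON.
    print(json.dumps(group_by_uid(sample_processes()), indent=2))
```

Run on a Linux host, the script prints one JSON object per sample, keyed by user ID. A real sampler would repeat this on a timer and record far more per process.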

Sonar proper is licensed under GPL-3, but some side components carry the MIT license because they are crucial for interaction with other tools that might not be GPL-licensed.

image of a fish swarm

Image: Midjourney, CC BY-NC 4.0

Documentation

Start by reading the user manual, which explains most of what Sonar can do and how to make it do it.

For a deeper dive into how it works, try the design document.

To build it, or to modify it, try the developer document.

A sample deployment of Sonar on a cluster and a data aggregator on a backend is outlined in doc/HOWTO-DEPLOY.md.

Collecting and analyzing the data

Sonar's output data are rigorously specified and you can build your own data collectors, post-processors and analyses, but you can also use these existing tools (both under active development):

  • JobAnalyzer allows Sonar logs to be queried and analyzed, and provides dashboards, interactive and batch queries, and reporting of system activity, policy violations, hung jobs, and more.
  • Slurm-monitor is complementary to JobAnalyzer. It focuses on managing and analyzing Slurm queues and clusters, and it includes a benchmarking facility and other tools for job placement.

Authors

Similar and related tools

Sonar's original vision was to be a very simple, lightweight tool that did some basic things fairly cheaply and produced easy-to-process output for subsequent scripting. Sonar is no longer that: with GPU integration, Slurm integration, Kafka exfiltration, memory-resident modes, structured output, a continual focus on performance, and several elaborate backends, it is becoming as complex as the tools it was intended to replace or compete with.

Here are some of those tools:
