Summary
Add two HTTP endpoints to all-smi api mode on top of the existing axum server:
- GET /events: Server-Sent Events stream. Emits one JSON payload per collection cycle. Targets embedded dashboards and Tauri/Electron apps that want live updates without polling /metrics.
- GET /snapshot: one-shot JSON response with the same schema as a single SSE frame. It is the HTTP counterpart of the all-smi snapshot CLI.
Both reuse the exact JSON schema shared by the snapshot subcommand and the record format (schema: 1).
Motivation
The current /metrics Prometheus endpoint serves its purpose for time-series scraping, but embedded UIs need lower latency and easier client code. Implementing SSE on the already-present axum stack is cheap (SSE is just text/event-stream with a keep-alive) and broadens the set of integrations (custom React/Vue/Lit dashboards, Tauri desktop apps, Grafana panel plugins that prefer JSON, simple debugging with curl -N). Keeping the JSON schema identical to snapshot and record means one format, three delivery channels.
Current state
- axum + tower-http in Cargo.toml.
- src/api/server.rs wires the /metrics route.
- Collection loop in src/api/mod.rs / src/api/server.rs polls readers every --interval seconds.
- GpuInfo, CpuInfo, etc. all implement Serialize, so no new derives are needed.
Proposed design
Endpoints
GET /snapshot
- Content-Type: application/json.
- Body: the JSON frame produced by the snapshot subcommand (single object with schema, timestamp, gpus, cpus, …).
- Query params:
  - ?include=gpu,cpu,memory,chassis,process,storage (default gpu,cpu,memory,chassis).
  - ?pretty=1 (default off for HTTP).
- Caching: Cache-Control: no-store.
- Response time bound: must complete within one collection cycle + 200 ms.
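The include parameter described above could be parsed along these lines. This is a minimal std-only sketch, not the project's code: the Section enum, the parse_include helper, and the choice to reject unknown section names are all assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Sections a client may request via `?include=` (names taken from the list above).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Section {
    Gpu,
    Cpu,
    Memory,
    Chassis,
    Process,
    Storage,
}

impl Section {
    fn parse(name: &str) -> Option<Section> {
        match name {
            "gpu" => Some(Section::Gpu),
            "cpu" => Some(Section::Cpu),
            "memory" => Some(Section::Memory),
            "chassis" => Some(Section::Chassis),
            "process" => Some(Section::Process),
            "storage" => Some(Section::Storage),
            _ => None,
        }
    }
}

/// Hypothetical helper: turn the raw `include` value into a set of sections.
/// `None` (parameter absent) yields the documented default set.
/// Rejecting unknown names is an assumption; the proposal does not specify it.
pub fn parse_include(raw: Option<&str>) -> Result<BTreeSet<Section>, String> {
    let raw = raw.unwrap_or("gpu,cpu,memory,chassis");
    let mut sections = BTreeSet::new();
    for name in raw.split(',').map(str::trim).filter(|s| !s.is_empty()) {
        match Section::parse(name) {
            Some(section) => {
                sections.insert(section);
            }
            None => return Err(format!("unknown include section: {}", name)),
        }
    }
    Ok(sections)
}
```

Because the default set leaves out process, a handler built on this would only pay for process collection when a client asks for it explicitly.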
GET /events
- Content-Type: text/event-stream.
- Emits one event per collection cycle:

      event: snapshot
      id: 42
      data: {"schema":1,"timestamp":"...","gpus":[...], "cpus":[...], ...}

- Keep-alive: emit a : keep-alive\n\n comment every 30 s (matching the HTTP2_KEEPALIVE_SECS config).
- Query params:
  - ?include=... (same semantics as /snapshot).
  - ?throttle=N (emit at most one event every N seconds; N cannot be smaller than the collection interval).
  - ?heartbeat=N (override the heartbeat interval).
- Last-Event-ID header support: if a client reconnects with a known ID, we simply resume with the next live frame (no replay from history, since all-smi doesn't store history).
- CORS: respect the existing tower-http::cors configuration; GET + Accept: text/event-stream must be allowed.
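To make the wire format above concrete, here is a small std-only sketch of the bytes the handler would write. In the real server, axum's Sse response type with its Event and KeepAlive helpers would normally produce them; the function names below are hypothetical, and clamping a too-small throttle (rather than rejecting it) is an assumption.

```rust
use std::time::Duration;

/// Format one `snapshot` event. `json` must be a single line, because every
/// newline inside the payload would need its own `data:` field.
pub fn snapshot_event(id: u64, json: &str) -> String {
    debug_assert!(!json.contains('\n'));
    format!("event: snapshot\nid: {}\ndata: {}\n\n", id, json)
}

/// SSE comment used as a heartbeat; clients ignore lines that start with `:`.
pub fn keep_alive() -> &'static str {
    ": keep-alive\n\n"
}

/// Effective emit period for `?throttle=N`: never below the collection interval.
/// Clamping instead of returning an error is an assumption of this sketch.
pub fn effective_period(throttle: Option<Duration>, interval: Duration) -> Duration {
    match throttle {
        Some(t) if t > interval => t,
        _ => interval,
    }
}
```

With a 3 s collection interval, a request for throttle=1 would still be served every 3 s under this clamping rule, while throttle=10 would be honored as given.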
Broadcast architecture
- Single tokio::sync::broadcast::channel<Arc<SnapshotFrame>> with buffer 16.
- The existing API collection task sends one Arc<SnapshotFrame> per tick.
- Each SSE client is a receiver. Lagging receivers get the oldest frames dropped (broadcast semantics): in that case, emit an event: lag\ndata: {"dropped": N}\n\n event and continue.
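- Client disconnect: when the connection closes, the response stream is dropped, which drops the receiver and releases its resources. If receiver.next() returns None, the sender side has shut down and the handler ends the stream.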
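The lag behaviour can be illustrated with a simplified std-only model. This is not tokio's implementation: Ring, Recv, and to_sse are hypothetical names, and the model only mirrors the observable semantics, where a slow receiver learns how many frames it missed (tokio reports this as RecvError::Lagged(n)) and then continues from the oldest frame still buffered.

```rust
use std::collections::VecDeque;

/// Simplified stand-in for a bounded broadcast channel (not tokio's code).
pub struct Ring {
    cap: usize,
    next_seq: u64,            // sequence number the next sent frame will get
    frames: VecDeque<String>, // newest `cap` frames, oldest at the front
}

/// What a receiver observes on each poll.
#[derive(Debug, PartialEq)]
pub enum Recv {
    Frame(String),
    Lagged(u64), // number of frames this receiver missed
    Empty,
}

impl Ring {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be non-zero");
        Ring { cap, next_seq: 0, frames: VecDeque::new() }
    }

    /// Push a frame, evicting the oldest one when the buffer is full.
    pub fn send(&mut self, frame: &str) {
        if self.frames.len() == self.cap {
            self.frames.pop_front();
        }
        self.frames.push_back(frame.to_string());
        self.next_seq += 1;
    }

    /// `cursor` is the sequence number this receiver wants next.
    pub fn recv(&self, cursor: &mut u64) -> Recv {
        let oldest = self.next_seq - self.frames.len() as u64;
        if *cursor < oldest {
            let dropped = oldest - *cursor;
            *cursor = oldest; // skip ahead to the oldest frame still buffered
            return Recv::Lagged(dropped);
        }
        match self.frames.get((*cursor - oldest) as usize) {
            Some(frame) => {
                *cursor += 1;
                Recv::Frame(frame.clone())
            }
            None => Recv::Empty,
        }
    }
}

/// Render what the SSE handler would write for each outcome.
pub fn to_sse(r: &Recv) -> Option<String> {
    match r {
        Recv::Frame(json) => Some(format!("event: snapshot\ndata: {}\n\n", json)),
        Recv::Lagged(n) => Some(format!("event: lag\ndata: {{\"dropped\": {}}}\n\n", n)),
        Recv::Empty => None,
    }
}
```

With a capacity of 2 and three frames sent before the receiver polls, the first poll reports one dropped frame and the next two polls return the remaining frames in order.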
Schema
Identical to the snapshot CLI JSON schema; the two must share serialization code. Place a SnapshotFrame struct in src/common/snapshot.rs (or re-export it from the snapshot module) so that both subcommands consume it.
Implementation plan
Files to add / modify:
- src/api/server.rs: add routes /events and /snapshot; spawn a singleton collection task that broadcasts SnapshotFrames into a broadcast::Sender.
- New src/api/handlers/events.rs: SSE handler implementing keep-alive, backpressure, lag notification, and the include filter.
- New src/api/handlers/snapshot.rs: one-shot JSON handler. Reads the last broadcast frame; if it is stale beyond 2×interval, forces a fresh collection.
- src/api/mod.rs: export the shared SnapshotFrame state.
- Shared module src/common/snapshot.rs: SnapshotFrame, FrameBuilder::new().with_include(...).build(). Used by the snapshot CLI and by these endpoints.
- Cargo.toml: axum feature sse if not already enabled; tokio-stream if needed.
- Example client in examples/sse_client.html: small HTML+JS using EventSource.
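The staleness rule for the one-shot handler (serve the last broadcast frame unless it is older than 2×interval) might look like the following std-only sketch. The Action enum and snapshot_action function are hypothetical names, and treating "no frame captured yet" as a reason to collect is an assumption.

```rust
use std::time::{Duration, Instant};

/// What the /snapshot handler should do for the current request.
#[derive(Debug, PartialEq)]
pub enum Action {
    ServeCached,
    CollectFresh,
}

/// Decide between the cached frame and a forced collection.
/// `last` is when the most recent frame was captured, if any.
pub fn snapshot_action(last: Option<Instant>, now: Instant, interval: Duration) -> Action {
    match last {
        // A frame is stale once its age exceeds twice the collection interval.
        Some(captured_at) if now.saturating_duration_since(captured_at) <= interval * 2 => {
            Action::ServeCached
        }
        // No frame yet (assumption) or a stale one: collect before answering.
        _ => Action::CollectFresh,
    }
}
```

Keeping the decision in a pure function like this makes the 2×interval boundary easy to unit-test without starting the server.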
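Acceptance criteria
- curl -N http://localhost:9090/events streams JSON events at the configured interval.
- curl -N 'http://localhost:9090/events?include=gpu' streams events with only the gpus key populated.
- curl http://localhost:9090/snapshot returns a single JSON object matching the snapshot schema.
- curl 'http://localhost:9090/snapshot?include=cpu,memory' returns a subset of keys.
- […] lag event and resume.
- […] lsof or tokio-console).
- […] /metrics) also works for /events and /snapshot.
- examples/sse_client.html opens in a browser and shows live updates.
- cargo test integration test spins up an api server on a random port, connects an EventSource client, and asserts at least 3 frames within 5 seconds.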
Edge cases & non-goals
- SSE is a long-lived streaming HTTP response. Reverse proxies (nginx, haproxy) may buffer it: document that users should set X-Accel-Buffering: no / proxy_buffering off. Emit the header from our side (X-Accel-Buffering: no).
- Heartbeat interval must be smaller than typical proxy idle timeouts (default 30 s should be safe; allow override).
- The processes section is expensive: only populate it when requested via ?include=process, so that /events stays cheap by default.
- Non-goal: WebSocket transport. SSE is simpler, one-way, and sufficient.
- Non-goal: HTTP/2 server push. SSE works fine over both HTTP/1.1 and HTTP/2.
- Non-goal: historical replay of missed frames. Clients missing a frame get the next live one.
Soft dependency
- Shares SnapshotFrame with the snapshot CLI (dependency) and the record format. Land the shared schema first, then ship these three features on a common foundation.