feat(tui): interactive filter query ('/') and threshold alerts #196
Merged
Add two tightly-coupled TUI capabilities for local and remote modes:

Filter DSL (`/` key)
- New `src/ui/filter_dsl` module with lexer, parser, and evaluator.
- Fields: temp, util, mem_pct, mem_used, mem_total, power, user, host, gpu_name, driver, index, uuid, pstate, numa, device_type.
- Numeric ops `>`, `>=`, `<`, `<=`, `==`, `!=`; string ops `==`, `!=`, and regex `~=` (size-limited to 128 KiB via RegexBuilder).
- Logical `&`, `|`, and parenthesised sub-expressions.
- Unknown fields raise a parse error; fields absent on the device cause the row to fail closed so mixed views stay readable.
- `DeviceRowView` trait implemented on `GpuInfo`, `ProcessInfo`, and `CpuInfo` so renderers can call `apply_filter(&state.filter_query, row)` uniformly.

Filter UX
- `/` opens a status-bar buffer with live `[matched X of Y]` preview.
- `Enter` commits; `ESC` clears or aborts; `Ctrl-R` recalls the last five queries.
- Invalid queries show an inline red `parse error: ... at col N` message without crashing or committing.
- Non-matching rows render at 40% opacity via a post-processing `dim_ansi` helper so no renderer signature had to change.

Threshold alerts
- New `src/ui/alerts.rs` with per-device per-rule state machines and hysteresis (`hysteresis_c`).
- Rules: temperature (`temp_warn_c` / `temp_crit_c`), sustained idle utilization (`util_idle_warn_mins`), and power (`power_crit_w`).
- Transitions produce toast notifications (5 s via `AppConfig::NOTIFICATION_DURATION_SECS`), 1 Hz border flash on the affected GPU tile, and entries in a 50-slot ring buffer.
- `A` toggles the alert history panel.
- Optional fire-and-forget webhook POST (`reqwest`, 2 s timeout, bounded `mpsc` queue with drop-oldest-on-full) via new `src/network/webhook.rs`.

Config + CLI
- `AlertConfig` in `src/common/config.rs` with defaults matching the issue's `[alerts]` sketch.
- CLI overrides `--alert-temp` and `--alert-util-low-mins` on both `local` and `view` subcommands; the TOML loader can layer on top when the companion config-file issue lands.

Tests
- 62 parser tests + 23 eval tests covering all field types, hysteresis boundaries (entering/leaving crit), regex safety, and the webhook body shape (inline + `tests/alert_webhook_test.rs`).
- Event-handler tests for the filter input state machine (`/` opens, `ESC` aborts, `Enter` commits, `Ctrl-R` recalls, `q` is literal text while editing).

Closes #186
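The fail-closed rule for absent fields can be illustrated with a small evaluator. This is a minimal sketch under stated assumptions, not the PR's code: `Expr`, `Op`, `RowView`, `GpuRow`, and `eval` are hypothetical stand-ins for the real AST and the `DeviceRowView` trait, and only numeric fields are modelled.

```rust
// Hypothetical, simplified AST; the real parser output is not shown in this thread.
#[derive(Debug, Clone, Copy)]
enum Op { Gt, Ge, Lt, Le, Eq, Ne }

#[derive(Debug)]
enum Expr {
    Cmp { field: &'static str, op: Op, value: f64 },
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

/// Stand-in for a row view: `None` means the field is absent on this device.
trait RowView {
    fn numeric(&self, field: &str) -> Option<f64>;
}

struct GpuRow { temp: f64, util: f64, power: Option<f64> }

impl RowView for GpuRow {
    fn numeric(&self, field: &str) -> Option<f64> {
        match field {
            "temp" => Some(self.temp),
            "util" => Some(self.util),
            "power" => self.power,
            _ => None,
        }
    }
}

/// Evaluate an expression against one row.
/// A comparison on an absent field is always false (fail closed), even for `!=`.
fn eval(expr: &Expr, row: &dyn RowView) -> bool {
    match expr {
        Expr::Cmp { field, op, value } => match row.numeric(field) {
            None => false,
            Some(x) => match op {
                Op::Gt => x > *value,
                Op::Ge => x >= *value,
                Op::Lt => x < *value,
                Op::Le => x <= *value,
                Op::Eq => x == *value,
                Op::Ne => x != *value,
            },
        },
        Expr::And(a, b) => eval(a, row) && eval(b, row),
        Expr::Or(a, b) => eval(a, row) || eval(b, row),
    }
}
```

With a row whose `power` is `None`, `power != 400` evaluates to false in this sketch, while `temp > 85 | power != 400` can still match through the `temp` branch.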
- alerts: when temp_warn_c=0 and temp_crit_c>0, a device in Crit that cools below crit_off now transitions straight to Ok instead of passing through a spurious Warn level. warn_off would otherwise be negative (0 - hysteresis_c), making `temp <= warn_off` almost never true. Add regression test `warn_disabled_crit_enabled_recovers_straight_to_ok`.
- filter_dsl/eval: gate mem_total_field on total_memory > 0 for symmetry with mem_pct_field. Prevents mem_total==0 from matching zero-total devices (fail-closed). Add `mem_total_absent_when_zero_fails_closed`.
- webhook: update module and `enqueue` doc comments to reflect actual behavior (try_send drops the newest payload on overflow, not oldest). No behavior change; the UI-never-blocks invariant is preserved.
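The recovery path described in the alerts fix can be sketched as a three-level state machine. This is an illustration only: `Level`, `TempRule`, and `next` are hypothetical names, and the exact comparison operators (`>=` to enter a level, `>` to hold it inside the band) are assumptions rather than the behaviour of `src/ui/alerts.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level { Ok, Warn, Crit }

/// Hypothetical temperature rule; a threshold of 0 disables that level.
struct TempRule { warn_c: f64, crit_c: f64, hysteresis_c: f64 }

impl TempRule {
    /// Compute the next level from the current level and a new reading.
    fn next(&self, current: Level, temp: f64) -> Level {
        let warn_on = self.warn_c > 0.0;
        let crit_on = self.crit_c > 0.0;
        // Recovery thresholds sit `hysteresis_c` below the trigger thresholds.
        let warn_off = self.warn_c - self.hysteresis_c;
        let crit_off = self.crit_c - self.hysteresis_c;

        // Escalation to Crit is possible from any level.
        if crit_on && temp >= self.crit_c {
            return Level::Crit;
        }
        match current {
            // Still inside the crit hysteresis band: hold.
            Level::Crit if crit_on && temp > crit_off => Level::Crit,
            // Only consult warn_off when the warn rule is enabled; otherwise
            // warn_off is negative and the device would appear stuck in Warn.
            Level::Crit | Level::Warn if warn_on && temp > warn_off => Level::Warn,
            Level::Ok if warn_on && temp >= self.warn_c => Level::Warn,
            _ => Level::Ok,
        }
    }
}
```

With `warn_c = 0`, `crit_c = 90`, and `hysteresis_c = 2`, a device in `Crit` that cools to 87 goes straight to `Ok` in this sketch; with `warn_c = 80` the same reading lands in `Warn`.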
Addresses pr-security-checker findings on PR #196:

CRITICAL
- webhook: disable HTTP redirects (Policy::none) to prevent SSRF pivot through attacker-controlled 3xx responses from operator-configured hosts

HIGH
- webhook: redact userinfo from URL before logging to avoid credential leak when operators accidentally configure basic-auth URLs
- webhook: document that config reload must rebuild DataCollector (OnceLock binds URL at first init)

MEDIUM
- alerts: garbage-collect states HashMap for devices that disappear from the snapshot (decommission / UUID churn) to prevent slow leak
- data_collector: bell write moved to spawn_blocking to avoid stalling Tokio executor on slow terminals
- filter_dsl: add DFA size limit (1 MiB) on regex compilation
- event_handler: cap filter_buffer at 512 chars to block UI DoS via bracketed-paste of large blobs
- filter_dsl: add 16 KiB lexer input cap as defense-in-depth for non-interactive callers
- renderers/dim: preserve background color SGR codes so selected-row and alert-flash highlights survive the filter-dim pass
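The userinfo redaction item can be illustrated with a std-only helper. This is a rough sketch, assuming a plain `scheme://user:pass@host/path` shape; `redact_userinfo` is a hypothetical name, and the PR's actual helper (which may well use a URL parser) is not shown in this thread.

```rust
/// Replace any `user:pass@` portion of a URL with `***@` before logging.
/// Hypothetical helper: it does plain string splitting, not full URL parsing.
fn redact_userinfo(url: &str) -> String {
    // No scheme separator: nothing we can reliably identify as an authority.
    let Some((scheme, rest)) = url.split_once("://") else {
        return url.to_string();
    };
    // The authority ends at the first '/', '?', or '#'.
    let auth_end = rest.find(&['/', '?', '#'][..]).unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(auth_end);
    // Userinfo is everything before the last '@' inside the authority.
    match authority.rfind('@') {
        Some(at) => format!("{scheme}://***@{}{tail}", &authority[at + 1..]),
        None => url.to_string(),
    }
}
```

Only the authority is inspected, so an `@` that appears later in the path or query string is left untouched.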
- Add truecolor RGB background preservation tests for dim_ansi (covers the `48;2;r;g;b` code path documented but not previously tested)
- Add regression test for filter buffer cap at FILTER_BUFFER_MAX (512) to guard the DoS-prevention truncation in the editing event handler
- Document webhook SSRF protection (redirects disabled) and filter query buffer limit (512 chars / 16 KiB lexer gate) in README Filtering & Alerts section
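The dimming pass with background preservation can be approximated as follows. This is a simplified sketch, not the contents of `src/ui/renderers/dim.rs`: `rewrite_sgr` is a hypothetical helper. It assumes every foreground code (30-39, 90-97, and the extended `38;5;n` / `38;2;r;g;b` forms) collapses to `90`, and that all other SGR parameters, including the `48;…` backgrounds, pass through unchanged.

```rust
/// Rewrite the parameters of one SGR sequence: foreground colours become 90
/// (dark grey); backgrounds and other attributes are kept as-is.
fn rewrite_sgr(params: &str) -> String {
    let codes: Vec<&str> = params.split(';').collect();
    let mut kept: Vec<&str> = Vec::new();
    let mut replaced = false;
    let mut i = 0;
    while i < codes.len() {
        let n: u32 = codes[i].parse().unwrap_or(0); // empty parameter means 0
        // Extended colours carry arguments: 38/48;5;n or 38/48;2;r;g;b.
        let width = if n == 38 || n == 48 {
            match codes.get(i + 1) {
                Some(&"5") => 3,
                Some(&"2") => 5,
                _ => 1,
            }
        } else {
            1
        };
        let end = (i + width).min(codes.len());
        if matches!(n, 30..=39 | 90..=97) {
            if !replaced {
                kept.push("90");
                replaced = true;
            }
        } else {
            kept.extend_from_slice(&codes[i..end]);
        }
        i = end;
    }
    format!("\x1b[{}m", kept.join(";"))
}

/// Dim rendered text by rewriting SGR sequences; other CSI sequences
/// (cursor movement and similar) are copied through verbatim.
fn dim_ansi(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find("\x1b[") {
        out.push_str(&rest[..pos]);
        let body = &rest[pos + 2..];
        // A CSI sequence ends at its first final byte (0x40..=0x7E).
        match body.find(|c: char| ('\x40'..='\x7e').contains(&c)) {
            Some(end) => {
                let (params, fin) = (&body[..end], &body[end..end + 1]);
                if fin == "m" {
                    out.push_str(&rewrite_sgr(params));
                } else {
                    out.push_str("\x1b[");
                    out.push_str(params);
                    out.push_str(fin);
                }
                rest = &body[end + 1..];
            }
            None => {
                // Truncated sequence: copy the remainder unchanged.
                out.push_str(&rest[pos..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}
```

Because a truecolor background such as `48;2;10;20;30` is consumed as one five-parameter unit, its `2` and colour components are never mistaken for standalone attributes.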
PR Finalization Complete

Summary
- Tests: Added 3 new tests covering previously-untested code paths:
- Documentation: Updated README
- Help pane:
- Lint/Format:
- CHANGELOG: No CHANGELOG file exists in this repo; skipped.

Final test counts (all passing)
Summary

Adds two tightly-coupled TUI capabilities scoped to local and remote modes.

- Filter DSL (`/` key): `temp>85`, `util<5 & host~=dgx`, `user==alice | power>400`, and similar. Fields, numeric comparisons, string equality, size-bounded regex (`~=`), `&` / `|`, and parentheses. Unknown fields are parse errors; fields absent on the row fail closed. Non-matching rows are dimmed via `DarkGrey` (hidable via config).
- Threshold alerts: `[alerts]` config with defaults (`temp_warn_c=80`, `temp_crit_c=90`, `util_idle_pct=5`, `util_idle_warn_mins=15`, `power_crit_w=0`, `hysteresis_c=2`). Per-device per-rule hysteresis state machines emit `AlertTransition` records, which become toast notifications, a 1 Hz border flash on the affected tile, and entries in a 50-slot ring buffer. Optional async webhook POST (2 s timeout, bounded mpsc queue, non-blocking: new alerts are dropped if the worker queue is saturated). `A` toggles the history panel.

Implementation notes
- Modules: `src/ui/filter_dsl/{mod,lexer,parser,eval}.rs`, `src/ui/alerts.rs`, `src/network/webhook.rs`, `src/ui/renderers/dim.rs`.
- `AppState` gains `filter_query`, `filter_buffer`, `filter_input_mode`, `filter_recent`, `alerter`, `alert_history`, and `alert_panel_open`, all cloned into `RenderSnapshot` so the render path stays lock-free.
- `event_handler` intercepts every key while `FilterInputMode::Editing` so that `q`, `d`, `u`, `f`, etc. become literal text until the operator commits (`Enter`) or aborts (`ESC`).
- Dimming is applied at each `print_colored_text` call by post-processing the rendered bytes: `dim_ansi` replaces every SGR foreground with `\x1b[90m` (DarkGrey) while preserving cursor-movement CSI sequences. This keeps existing renderers untouched.
- `data_collector` runs `Alerter::evaluate` on the GPU snapshot after every successful collection, pushes transitions to the notification manager + ring buffer, optionally emits a bell (`\x07`), and fire-and-forget POSTs to the webhook channel.
- CLI overrides `--alert-temp` and `--alert-util-low-mins` on both `local` and `view`. Compiled defaults apply when no config file is present.
- Regex patterns are compiled with `RegexBuilder::size_limit(128 * 1024)`; over-sized patterns surface as `parse error: invalid regex: ...`.

Testing
- Parser: operators, logical combinations, whitespace tolerance, precedence, and failure modes (unknown field, type mismatch, unterminated parens, oversized regex, invalid regex, trailing garbage).
- Evaluator: evaluation semantics and the `GpuInfo`, `ProcessInfo`, and `CpuInfo` impls.
- Alerts: `ok->warn`, `warn->crit`, no transition inside the hysteresis band, recovery to `ok`, zero thresholds disabling the rule, and `crit->ok` direct recovery when the warn rule is disabled.
- Webhook (`tests/alert_webhook_test.rs`): a minimal tokio TCP server captures the body and asserts that the serialized JSON matches the field shape from the issue.
- Event handler: `/` enters edit mode; characters append; `Enter` commits a valid filter and adds it to the recent list; an invalid query keeps edit mode and surfaces an error; `ESC` aborts; `q` is literal text while editing; `Ctrl-R` recalls the newest entry; `A` toggles the alert panel; `ESC` closes it.
- `cargo build`, `cargo test --lib` (678), `cargo test --bin all-smi` (737), `cargo clippy --all-targets -- -D warnings`, and `cargo fmt --all -- --check` all pass.

Closes #186
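The editing-mode behaviour exercised by the event-handler tests can be sketched as a small state machine. Everything here is illustrative: `Key`, `FilterInput`, `handle`, and `RECENT_MAX` are hypothetical names, the validator is passed in as a closure in place of the real parser, and `ESC` is assumed to simply abort the edit. Only the 512-character cap (`FILTER_BUFFER_MAX`) and the five-entry recall list come from the PR text.

```rust
use std::collections::VecDeque;

const FILTER_BUFFER_MAX: usize = 512; // cap stated in the PR
const RECENT_MAX: usize = 5; // "last five queries"; the constant name is hypothetical

enum Key { Char(char), Enter, Esc, CtrlR }

#[derive(Default)]
struct FilterInput {
    editing: bool,
    buffer: String,
    committed: Option<String>,
    recent: VecDeque<String>, // newest first
    error: Option<String>,
}

impl FilterInput {
    /// Feed one key. `validate` stands in for the DSL parser.
    fn handle(&mut self, key: Key, validate: impl Fn(&str) -> Result<(), String>) {
        if !self.editing {
            // Outside edit mode only `/` is handled by this sketch.
            if let Key::Char('/') = key {
                self.editing = true;
                self.buffer.clear();
                self.error = None;
            }
            return;
        }
        match key {
            // Every character, including `q`, is literal text while editing.
            Key::Char(c) => {
                if self.buffer.chars().count() < FILTER_BUFFER_MAX {
                    self.buffer.push(c);
                }
            }
            Key::Enter => match validate(&self.buffer) {
                Ok(()) => {
                    self.recent.push_front(self.buffer.clone());
                    self.recent.truncate(RECENT_MAX);
                    self.committed = Some(std::mem::take(&mut self.buffer));
                    self.editing = false;
                    self.error = None;
                }
                // Invalid query: stay in edit mode and surface the error.
                Err(e) => self.error = Some(e),
            },
            Key::Esc => {
                self.editing = false;
                self.buffer.clear();
                self.error = None;
            }
            Key::CtrlR => {
                if let Some(q) = self.recent.front() {
                    self.buffer = q.clone();
                }
            }
        }
    }
}
```

Passing the validator as a closure keeps the sketch independent of the parser, which is also convenient for table-driven tests of the key handling alone.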