Skip to content

fix(kanban): make diagnostics --severity an "at or above" threshold#26384

Closed
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/kanban-severity-threshold-26379
Closed

fix(kanban): make diagnostics --severity an "at or above" threshold#26384
briandevans wants to merge 2 commits into
NousResearch:mainfrom
briandevans:fix/kanban-severity-threshold-26379

Conversation

@briandevans

Copy link
Copy Markdown
Contributor

Summary

hermes kanban diagnostics --severity X is documented as "at or above X" but the filter was implemented as an exact-match equality, silently dropping higher-severity diagnostics. Replace with a SEVERITY_ORDER-index threshold comparison so the implementation matches the documented (and intuitive) semantics.

Fixes #26379.

The bug

hermes_cli/kanban.py:1406 (before):

kept = [d for d in diags_by_task[tid] if d.severity == sev]

Argparse help text (hermes_cli/kanban.py:360) and the issue reporter's expectation:

"Only show diagnostics at or above this severity"

Live reproduction from the issue, on a board with two error-severity diagnostics:

Filter Returned Expected
(none) 2 2
--severity warning 0 2
--severity error 2 2
--severity critical 0 0

warning returned 0 — the silent failure mode. Watchdog cron scripts that use --severity warning as a catch-all gate get false-confidence runs while real error/critical issues go undetected.

The fix

hermes_cli/kanban_diagnostics.py already defines the ordering and uses it for sort/highest:

SEVERITY_ORDER = ("warning", "error", "critical")

The filter now uses the same constant to compute a threshold index. argparse(choices=["warning", "error", "critical"]) already constrains the flag, and the inner d.severity in SEVERITY_ORDER guard tolerates any future rule-produced severity that isn't in the canonical tuple without crashing the filter (same pattern used by severity_of_highest()).

Test plan

  • Focused regression test: tests/hermes_cli/test_kanban_cli.py::test_diagnostics_severity_filter_is_threshold_not_exact_match — parametrized over (no filter | warning | error | critical), asserts kept severities match the threshold semantics.
  • Edge case: test_diagnostics_severity_filter_drops_task_when_no_match — task whose only diagnostic is below the requested threshold is dropped from output (preserves the existing "no match → drop task" behavior).
  • Regression guard: with the original exact-match code, the --severity warning and --severity error cases fail (because they expect higher-severity diags to be included). With the threshold fix, all 5 pass.
  • Adjacent suite: tests/hermes_cli/test_kanban_cli.py + tests/hermes_cli/test_kanban_diagnostics.py (73 tests). 72 pass; 1 unrelated failure (test_run_slash_missing_required_arg_friendly_error) reproduces on clean origin/main and is not in the touched code.

Notes for reviewers

Single-file production change (~10 lines including a why-comment) plus regression tests. The severity in SEVERITY_ORDER guard inside the comprehension is intentional defense — rule authors could in principle ship a Diagnostic(severity="info") and that should silently skip a --severity warning threshold rather than ValueError-crash the CLI.

The `--severity` flag for `hermes kanban diagnostics` is documented (and
its argparse help text says) "Only show diagnostics at or above this
severity", but the filter was implemented as an exact-match equality:

    kept = [d for d in diags_by_task[tid] if d.severity == sev]

This silently dropped higher-severity diagnostics, so `--severity warning`
returned 0 diagnostics on a board with two `error`-severity findings.
Operators relying on `--severity warning` as a catch-all watchdog threshold
got false-confidence silent runs while real `error`/`critical` issues went
unsurfaced.

Fix: compare via `kanban_diagnostics.SEVERITY_ORDER` index so a request at
level X also keeps diagnostics with severity > X. `argparse(choices=...)`
already constrains the flag to a valid level, and the `severity in
SEVERITY_ORDER` guard tolerates any future unknown rule-produced severity
without crashing the filter.

Regression test: parametrized over no-filter / warning / error / critical,
asserting the kept set matches the threshold semantics. Also verifies
tasks whose diagnostics are all below threshold are dropped from output
(unchanged behavior). With the old exact-match code, the `warning` and
`error` cases fail; the new threshold makes them pass.

Fixes NousResearch#26379

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 15, 2026 14:13

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes a bug in /kanban diagnostics --severity where the filter performed exact-match instead of the documented "at or above" threshold semantics.

Changes:

  • Replace equality check with index-based threshold comparison against kd.SEVERITY_ORDER.
  • Add parametrized regression tests covering threshold behavior across severity levels.
  • Add test verifying tasks with no matching diagnostics are dropped.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
hermes_cli/kanban.py Switches severity filter from exact equality to threshold using SEVERITY_ORDER.index.
tests/hermes_cli/test_kanban_cli.py Adds regression tests for threshold filtering behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread hermes_cli/kanban.py
Comment on lines +1410 to +1414
kept = [
d for d in diags_by_task[tid]
if d.severity in kd.SEVERITY_ORDER
and kd.SEVERITY_ORDER.index(d.severity) >= threshold
]
Comment thread tests/hermes_cli/test_kanban_cli.py Outdated
Comment on lines +457 to +459
if not expected_severities:
assert payload == []
return
Comment thread tests/hermes_cli/test_kanban_cli.py Outdated
Comment on lines +420 to +421
import re
tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
Comment thread tests/hermes_cli/test_kanban_cli.py Outdated
Comment on lines +472 to +473
import re
tid = re.search(r"(t_[a-f0-9]+)", out).group(1)
@daimon-nous daimon-nous Bot added type/bug Something isn't working P2 Medium — degraded but workaround exists comp/cli CLI entry point, hermes_cli/, setup wizard labels May 15, 2026
- Precompute a {severity: rank} dict once instead of repeated
  O(n) `kd.SEVERITY_ORDER.index()` calls per diagnostic; use
  `severity_rank.get(d.severity, -1)` so unknown severities are
  handled explicitly rather than via membership pre-check.
- Move `import re` and `from hermes_cli import kanban_diagnostics as kd`
  to the top of the test module instead of inline inside helpers.
- Remove the unreachable `if not expected_severities` branch in
  `test_diagnostics_severity_filter_is_threshold_not_exact_match` —
  no parametrized case has an empty `expected_severities` set, and
  the empty-set behavior is already covered by
  `test_diagnostics_severity_filter_drops_task_when_no_match`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@briandevans

Copy link
Copy Markdown
Contributor Author

@copilot All findings addressed in commit 08d09c1:

  • Precomputed severity rank dict (kanban.py): replaced the O(n) kd.SEVERITY_ORDER.index() calls inside the comprehension with a single severity_rank = {s: i for i, s in enumerate(kd.SEVERITY_ORDER)} lookup, using .get(d.severity, -1) so unknown-severity values are handled explicitly without the redundant in SEVERITY_ORDER pre-check.
  • Removed unreachable test branch: dropped the dead if not expected_severities path in test_diagnostics_severity_filter_is_threshold_not_exact_match — the empty-set case was already covered by test_diagnostics_severity_filter_drops_task_when_no_match.
  • Moved inline imports (import re, from hermes_cli import kanban_diagnostics as kd) to the top of the test module so they're not repeated inside helpers and tests.

@briandevans

Copy link
Copy Markdown
Contributor Author

CI audit — all 7 test failures are pre-existing baselines on clean origin/main (9fb40e6a3). Zero failures are in touched code (hermes_cli/kanban.py, tests/hermes_cli/test_kanban_cli.py).

Test Symptom Root cause on main
test_provider_parity.py::TestDeveloperRoleSwap::test_developer_role_via_nous_portal ValueError: Model has a context window of 15,000 tokens, which is below the minimum 64,000 Test constructs AIAgent(provider="nous", base_url="https://inference-api.nousresearch.com/v1") — the unmocked context-length probe caches 15,000 and trips the new 64K minimum guard in run_agent.py:2349.
test_provider_parity.py::TestBuildApiKwargsNousPortal::test_includes_nous_product_tags same same
test_provider_parity.py::TestBuildApiKwargsNousPortal::test_uses_chat_completions_format same same
test_discord_adapter.py::TestMentionStrippedCommandDispatch::test_mention_then_command TypeError: catching classes that do not inherit from BaseException is not allowed gateway/platforms/discord.py:3730 does except discord.Forbidden:, but the e2e harness stubs discord as a SimpleNamespace, so discord.Forbidden resolves to a non-exception sentinel.
test_discord_adapter.py::TestMentionStrippedCommandDispatch::test_nickname_mention_then_command same same
test_discord_adapter.py::TestMentionStrippedCommandDispatch::test_text_before_command_not_detected same same
test_discord_adapter.py::TestAutoThreadingPreservesCommand::test_command_detected_after_auto_thread same same

All seven reproduce on clean origin/main locally with identical error text.

@briandevans

Copy link
Copy Markdown
Contributor Author

Closing — superseded by @shellybotmoyer's #26661, which landed as 1eadb06 on 2026-05-17 with the same documented "at or above" semantics fix. Their fix is in main; mine is no longer needed. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/cli CLI entry point, hermes_cli/, setup wizard P2 Medium — degraded but workaround exists type/bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

hermes kanban diagnostics --severity filter is exact-match, not "at or above" as documented (kanban.py:1406)

2 participants