
doctor: probe devagentic graph when skills/memory graph mode is on #18

Merged
PowerCreek merged 1 commit into main from doctor-devagentic-graph-probe on May 22, 2026


@PowerCreek


Summary

Closes #17. Audited the skills and memory adapters for the same failure-loudness gap fixed in #15. The slash-command-style loudness fix doesn't translate: their _post_graphql correctly returns None silently, because callers MUST keep the file fallback (per the docstring contract). The right diagnostic surface is hermes doctor, which previously made no DEVAGENTIC-related checks.

Changes

  • hermes_cli/doctor.py: new _check_devagentic_graph() invoked from run_doctor. Sends one { __typename } probe and surfaces the specific failure kind.
  • Inert when both graph_enabled() flags are False (the byte-stable default).
  • No changes to the adapter modules — their loss-tolerant contract stays correct for actual callers.
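The gating described in the bullets above could look roughly like the sketch below. It is illustrative only: `active_graph_modes` and `graph_section_lines` are hypothetical names, and the rule that a flag counts as enabled only when it equals `"1"` is an assumption based on the `DEVAGENTIC_SKILLS_GRAPH=1` / `DEVAGENTIC_MEMORY_GRAPH=1` wording in the commit message. The real `graph_enabled()` helpers may parse the flags differently.

```python
import os

# Hypothetical mapping from mode name to its env flag (flag names are from the PR text).
GRAPH_FLAGS = {
    "skills": "DEVAGENTIC_SKILLS_GRAPH",
    "memory": "DEVAGENTIC_MEMORY_GRAPH",
}


def active_graph_modes(env=None):
    """Return the graph modes that are switched on (assumes "1" means enabled)."""
    env = os.environ if env is None else env
    return [mode for mode, var in GRAPH_FLAGS.items() if env.get(var) == "1"]


def graph_section_lines(env=None):
    """Return the header lines for the doctor section, or [] when inert."""
    modes = active_graph_modes(env)
    if not modes:
        # Both flags off: emit nothing so the default output stays byte-stable.
        return []
    return [
        "─── Devagentic Graph ───",
        "ℹ Graph mode active for: " + ", ".join(modes),
    ]
```

Returning an empty list rather than an empty section header is what keeps the default output identical for users who never enabled graph mode.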

Output examples

Graph mode off (default):

(no Devagentic Graph section at all)

Graph mode on, no API key set:

─── Devagentic Graph ───
ℹ Graph mode active for: skills, memory
✗ Devagentic GraphQL: auth failed
    set DEVAGENTIC_API_KEY (any non-empty value when devagentic runs
    in DEVAGENTIC_TRUST_HEADER=1 mode)

Graph mode on, working:

─── Devagentic Graph ───
ℹ Graph mode active for: skills
✓ Devagentic GraphQL reachable
    http://127.0.0.1:6071/graphql — auth + user_id OK

Test plan

  • 6 new tests in tests/hermes_cli/test_doctor_devagentic_graph.py:
    • silent when both modes disabled
    • reports unresolved USER_ID
    • reports auth failure (401)
    • reports not-found (404)
    • reports unreachable (URLError)
    • reports OK on 200 with both modes active
  • pytest tests/hermes_cli/test_doctor_devagentic_graph.py → 6 passed
  • pytest tests/hermes_cli/test_doctor*.py → 73 passed (no regression in existing 67 + 6 new)

Filed by hermes-maintainer (PowerCreek).

When DEVAGENTIC_SKILLS_GRAPH=1 or DEVAGENTIC_MEMORY_GRAPH=1, the
adapters silently fall back to file lookups on auth / network / DNS
failures (their loss-tolerant contract is correct). Operators who
opted into graph mode had no diagnostic surface to learn why the
graph wasn't being consulted: `hermes doctor` made zero
DEVAGENTIC-related checks.

Add a `Devagentic Graph` section to run_doctor that sends a one-shot
`{ __typename }` probe when either flag is set, and surfaces the
specific failure kind:

  auth failed → set DEVAGENTIC_API_KEY
  not found → check DEVAGENTIC_BASE_URL points at graph-enabled instance
  unreachable → details from URLError/OSError
  USER_ID unresolved → set env or run inside a profile
  reachable → 200 OK with auth + user_id

Silent when both flags are False (preserves byte-stable default for
users who never wired devagentic).

Closes #17.
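
A minimal sketch of the one-shot probe and its failure kinds, assuming a plain `urllib` POST. The function name `probe_graph`, the return shape, the `Authorization: Bearer` header scheme, and the injectable `opener` parameter are all assumptions for illustration, not the code in `hermes_cli/doctor.py`.

```python
import json
import urllib.error
import urllib.request


def probe_graph(url, api_key, user_id, opener=urllib.request.urlopen, timeout=5.0):
    """Send one { __typename } query and return (kind, detail)."""
    if not user_id:
        return "user_id_unresolved", "set env or run inside a profile"

    request = urllib.request.Request(
        url,
        data=json.dumps({"query": "{ __typename }"}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Header scheme is an assumption; the PR does not show it.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    try:
        # urlopen raises HTTPError for non-2xx responses, so reaching the
        # body of the with-block means the endpoint answered successfully.
        with opener(request, timeout=timeout):
            pass
    except urllib.error.HTTPError as exc:
        # HTTPError is a URLError subclass, so it must be handled first.
        if exc.code in (401, 403):
            return "auth_failed", "set DEVAGENTIC_API_KEY"
        if exc.code == 404:
            return "not_found", "check DEVAGENTIC_BASE_URL points at graph-enabled instance"
        return "http_error", f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        return "unreachable", str(exc)
    return "reachable", url
```

Passing the opener in as a parameter lets tests simulate 401, 404, and `URLError` without a network, which matches the shape of the six test cases listed in the test plan.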
PowerCreek merged commit 3998bdd into main May 22, 2026
PowerCreek deleted the doctor-devagentic-graph-probe branch May 22, 2026 21:20
PowerCreek added a commit that referenced this pull request May 22, 2026
Auditing the gateway/cron/batch_runner surfaces for diagnostic
gaps analogous to the devagentic-graph probe (#17/#18). `hermes
doctor` made zero cron-related checks even when ~/.hermes/cron/
jobs.json is populated.

Two signals are now surfaced (same shape as cron_status, but in
the canonical "tell me what's wrong" surface):

1. Gateway PID — cron only fires when the gateway runs. When PIDs
   are absent and jobs are configured, doctor now fails with a
   pointer to `hermes gateway install`.
2. Recent failures — every job tracks last_status / last_error.
   Doctor warns on any job whose last_status is not in
   {ok, skipped, pending, ""}, and lists up to the first 5 with
   the failing job's name + last_run_at + truncated error.

Inert when jobs.json doesn't exist or contains no jobs — the
byte-stable default for users who never wired cron.

Closes #26.
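
The recent-failures signal described above can be sketched as follows. The field names `last_status`, `last_error`, and `last_run_at` come from the commit message; the function name, the `name` key, the 80-character truncation, and treating a missing `last_status` as the empty string are assumptions.

```python
# Statuses the commit message lists as not warranting a warning.
HEALTHY_STATUSES = {"ok", "skipped", "pending", ""}


def summarize_cron_failures(jobs, limit=5, max_error_len=80):
    """Return (failing_count, lines) with at most `limit` formatted lines."""
    failing = [
        job for job in jobs
        # Assumption: a missing or null last_status counts as "".
        if (job.get("last_status") or "") not in HEALTHY_STATUSES
    ]
    lines = []
    for job in failing[:limit]:
        error = str(job.get("last_error") or "")
        if len(error) > max_error_len:
            error = error[:max_error_len] + "..."
        lines.append(
            f"{job.get('name', '?')} @ {job.get('last_run_at', '?')}: {error}"
        )
    return len(failing), lines
```

Reporting the total count separately from the capped list keeps the doctor output short while still showing how many jobs are affected.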
PowerCreek added a commit that referenced this pull request May 23, 2026
…#39)

The same urllib HTTPError + URLError + OSError + TimeoutError
dispatch repeats in three sites after #15/#18/#20:

  plugins/devagentic-canvas/client.py:_request
  plugins/devagentic-docs/client.py:_post_graphql
  hermes_cli/doctor.py:_check_devagentic_graph

Each site has the same if-401/403-elif-404-else branches over the
exception and the same urllib network-error fallback. Only the
message text differs per site.

Add `utils.classify_http_error(exc) -> str` returning one of:
  HTTP_ERROR_AUTH        — 401 / 403
  HTTP_ERROR_NOT_FOUND   — 404
  HTTP_ERROR_HTTP        — other HTTPError status
  HTTP_ERROR_UNREACHABLE — URLError / OSError / TimeoutError
  HTTP_ERROR_UNKNOWN     — anything else

Refactor the three callsites onto a single `except (URLError, OSError,
TimeoutError)` (HTTPError is a URLError subclass) followed by a
dispatch on the classifier output. Each site still owns its own
message text — the user-facing strings are unchanged.

Behavior change: none. Net diff is roughly even (helper + dispatch
replaces three near-identical if/elif/else blocks).

Tests:
  - 11 new tests in tests/test_utils_classify_http_error.py covering
    401, 403, 404, generic statuses (400/422/429/500/502/503/504),
    URLError, OSError, TimeoutError, socket.timeout, unrelated
    exceptions, and the HTTPError-is-URLError subclass quirk.
  - All three refactored callsites' existing test suites still pass:
    canvas (47), docs (47), doctor devagentic-graph (6) — 112 total.

Closes #38.
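
A sketch of what a classifier with this contract might look like. The constant names and the mapping are taken from the commit message; the string values assigned to the constants are assumptions, as is the exact body of the function.

```python
import urllib.error

# Constant names are from the commit message; the values are assumptions.
HTTP_ERROR_AUTH = "auth"
HTTP_ERROR_NOT_FOUND = "not_found"
HTTP_ERROR_HTTP = "http"
HTTP_ERROR_UNREACHABLE = "unreachable"
HTTP_ERROR_UNKNOWN = "unknown"


def classify_http_error(exc):
    """Map an exception from a urllib request to one of the kinds above."""
    # HTTPError subclasses URLError (which subclasses OSError), so the
    # status-code branch has to run before the network-error branch.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code in (401, 403):
            return HTTP_ERROR_AUTH
        if exc.code == 404:
            return HTTP_ERROR_NOT_FOUND
        return HTTP_ERROR_HTTP
    if isinstance(exc, (urllib.error.URLError, OSError, TimeoutError)):
        return HTTP_ERROR_UNREACHABLE
    return HTTP_ERROR_UNKNOWN
```

A call site would then catch `(URLError, OSError, TimeoutError)` once and branch on the returned kind, keeping its own message text per kind.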

Development

Successfully merging this pull request may close these issues.

[synergy] hermes doctor doesn't probe devagentic graph even when DEVAGENTIC_*_GRAPH=1

1 participant