Skip to content

honcho: implement health_check() with auth/network classification (#42 step 2b)#45

Merged
PowerCreek merged 1 commit into
mainfrom
honcho-health-check-override
May 23, 2026
Merged

honcho: implement health_check() with auth/network classification (#42 step 2b)#45
PowerCreek merged 1 commit into
mainfrom
honcho-health-check-override

Conversation

@PowerCreek

Copy link
Copy Markdown

Summary

Step 2b of the RFC #42 migration plan. Honcho already had a real backend probe inside doctor's elif block (get_honcho_client(hcfg) since the initial Honcho integration); this PR moves it INTO the provider's own health_check() so doctor's generic branch can eventually use the unified ABC interface (step 3).

Outcomes (RFC #42 reason prefixes)

Tuple When
(True, "") Handshake succeeds
(False, "no_config") honcho.json absent
(False, "disabled") enabled=false in config
(False, "no_credentials") Neither api_key nor base_url set (NousResearch#2645: either alone suffices)
(False, "sdk_missing") honcho-ai not installed
(False, "config_error: ...") Config parsing itself raised
(False, "auth: ...") 401/403/forbidden/unauthorized/invalid-api-key/authentication
(False, "unreachable: ...") Anything else

The "never raises" contract is enforced at every layer — even when HonchoClientConfig.from_global_config() itself throws, health_check returns (False, "config_error: ...") rather than propagating.

What this PR does NOT do

Doesn't collapse the dedicated honcho elif block in hermes_cli/doctor.py. That follows in step 3 of #42 once doctor's generic branch can replace the per-provider elif blocks entirely with _provider.health_check() calls. Operators see no doctor behavior change here.

Test plan

  • 10 new tests in tests/honcho_plugin/test_health_check.py:
    • successful handshake → (True, "")
    • no_config when honcho.json absent
    • disabled when enabled=false
    • no_credentials when both api_key and base_url empty
    • base_url alone counts as credentials (fix(honcho): support local baseUrl-only configuration NousResearch/hermes-agent#2645 port)
    • 4 auth-classification cases (401, 403, invalid api key, authentication failed)
    • ConnectionErrorunreachable:
    • "Never raises" contract: from_global_config exception → (False, "config_error: ...")
    • Long error messages truncated under 220 chars
  • No regression: pytest tests/honcho_plugin/test_health_check.py tests/plugins/memory/test_mem0_health_check.py tests/agent/test_memory_provider.py → 88 passed.

Continues #42; step 3 (doctor branch collapse) is next.

Filed by hermes-maintainer (PowerCreek).

…step 2b)

Step 2b of the RFC #42 migration plan. Honcho already had a real
backend probe inside doctor's elif block (get_honcho_client(hcfg)
since the initial Honcho integration); move it INTO the provider's
health_check() so doctor's generic branch can eventually use the
unified ABC interface (step 3).

Outcomes (RFC #42 reason prefixes):
  (True, "")                    — handshake succeeds
  (False, "no_config")          — honcho.json absent
  (False, "disabled")           — enabled=false in config
  (False, "no_credentials")     — neither api_key nor base_url set
                                  (NousResearch#2645: either one suffices)
  (False, "sdk_missing")        — honcho-ai not installed
  (False, "config_error: ...")  — config parsing itself raised
  (False, "auth: ...")          — 401/403/forbidden/unauthorized/
                                  invalid-api-key/authentication
  (False, "unreachable: ...")   — anything else (network, timeout,
                                  SDK exception)

The "never raises" contract is enforced at every layer — even
when HonchoClientConfig.from_global_config() itself throws,
health_check returns (False, "config_error: <msg>") rather than
propagating.

Doctor's dedicated honcho elif block is NOT touched yet. That
collapses in a follow-up PR (step 3 of #42) once both Honcho and
Mem0 (#44) ship native health_check overrides. Operators see no
behavior change here.

Tests:
  - successful handshake returns (True, "")
  - no_config when honcho.json absent
  - disabled when enabled=false
  - no_credentials when both api_key and base_url empty
  - base_url alone counts as credentials (NousResearch#2645 port)
  - 4 auth-classification cases (401/403/invalid-api-key/auth-failed)
  - ConnectionError → unreachable:
  - never raises: config-parsing exception → (False, "config_error:")
  - long error messages truncated under 220 chars

Continues #42; does not close it. Step 3 (doctor collapse) is
next.
@PowerCreek PowerCreek merged commit 38eb04b into main May 23, 2026
@PowerCreek PowerCreek deleted the honcho-health-check-override branch May 23, 2026 01:44
PowerCreek added a commit that referenced this pull request May 23, 2026
…ep 2c) (#46)

Step 2c of the RFC #42 migration plan. ByteRover's is_available()
only checked whether the brv CLI was on PATH — an operator with
brv installed but not logged in saw a false-positive green check
from doctor. health_check() now runs `brv status` to verify both:

  (True, "")                     — `brv status` exits 0.
  (False, "sdk_missing")         — brv CLI not on PATH.
  (False, "auth: ...")           — non-zero exit + stderr mentions
                                    auth / login / unauthorized /
                                    "not signed in" / 401 / 403.
  (False, "unreachable: timeout")— probe timed out.
  (False, "unreachable: ...")    — anything else (config issue,
                                    malformed install, etc.).

The probe has a 3-second timeout — `brv status` is local-cache
backed, so anything longer indicates a stuck CLI.

The "never raises" contract from the ABC default is preserved —
even if _run_brv itself propagates (it shouldn't; it has its own
try/except), the override catches the exception and returns
(False, "unreachable: <msg>").

Tests:
  - success returns (True, "")
  - sdk_missing when brv absent
  - 5 auth-classification cases (not-signed-in, Unauthorized,
    auth-required, 401, login required)
  - "timed out" → unreachable: timeout
  - generic non-zero exit → unreachable: <msg>
  - never-raises: even when _run_brv throws, returns tuple
  - long error messages truncated under 220 chars

Continues #42; does not close it. Three providers migrated:
mem0 (#44), honcho (#45), byterover (this PR). Six remain
before the doctor-branch collapse in step 3.
PowerCreek added a commit that referenced this pull request May 23, 2026
…s 2h + 3) (#52)

Closes #42. Final two steps of the RFC migration.

Step 2h — holographic health_check:
  Holographic is local-only (SQLite); no remote service. But a
  read-only HERMES_HOME (RO mount, wrong perms) would still
  break it at runtime. Override verifies db_path's parent
  directory is writable.

    (True, "")                     — parent mkdir + W_OK pass.
    (False, "unreachable: <msg>")  — mkdir failed or W_OK denied.
    (False, "config_error: <msg>") — get_hermes_home raised.

Step 3 — doctor branch collapse:
  Every shipped provider now implements health_check() with the
  RFC #42 reason-prefix taxonomy. The provider-specific Honcho +
  Mem0 elif blocks (~120 lines combined) are gone. The unified
  dispatch is one ~80-line block in doctor.py keyed on the prefix:

    healthy → check_ok("<name> reachable")
    no_api_key / no_credentials → _fail_and_issue (setup hint)
    no_url / no_endpoint        → _fail_and_issue (URL hint)
    no_config                   → check_warn (run setup)
    disabled                    → check_info
    sdk_missing                 → _fail_and_issue (plugin docs)
    auth:                       → _fail_and_issue (rotate key)
    not_found:                  → _fail_and_issue (verify base URL)
    http:                       → check_warn (unexpected status)
    unreachable:                → check_warn (transient hint)
    config_error:               → check_warn (config raised)
    health_check_raised:        → check_warn (provider bug; RFC
                                   #42 says health_check MUST NOT
                                   raise, so this is a contract
                                   violation worth flagging)
    other                       → check_warn (unknown verbatim)

Doctor's Memory Provider section is now ~80 lines instead of
~200, and adding the 9th provider requires zero doctor changes.

Migration table (8 providers total):
  mem0       — PR #44  GET /v1/memory profile
  honcho     — PR #45  get_honcho_client handshake
  byterover  — PR #46  brv status (CLI + login)
  supermemory — PR #48  client.profile probe
  openviking — PR #49  /health endpoint probe
  retaindb   — PR #50  /v1/memory/profile GET
  hindsight  — PR #51  mode-dependent (local import / cloud /version)
  holographic — this PR  db_path parent writability

Tests:
  - 6 new tests for holographic health_check
  - Updated 3 doctor tests to assert the new unified dispatch
    (auth → _fail_and_issue; unreachable → check_warn; healthy
    → "<name> reachable") instead of the now-removed elif-block
    output strings.
  - 228 health_check + doctor tests pass total.

The #42 RFC is now fully implemented across all shipped memory
providers. Closing the issue with this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant