mem0: implement health_check() with auth/network classification (#42 step 2a)#44
Merged
Merged
Conversation
…step 2a) Step 2a of the RFC #42 migration plan. Mem0 already had a real backend probe inside doctor's elif block (PR #35); move it INTO the provider's health_check() so doctor's generic branch can eventually use the unified ABC interface (step 3). The implementation is functionally identical to PR #35's doctor block, but returns the RFC-conventioned tuple instead of calling check_ok / check_fail / check_warn directly: (True, "") — get_all(user_id, limit=1) succeeds. (False, "no_api_key") — MEM0_API_KEY not set. (False, "sdk_missing") — mem0ai not installed. (False, "auth: <msg>") — 401/403/forbidden/unauthorized/ invalid-api-key/authentication. (False, "unreachable:...")— any other exception during get_all. (False, "config_error:..")— _load_config() itself raised. The "never raises" contract from the ABC default is preserved — config_error catches the rare case where even loading the config fails. This PR DOES NOT collapse the dedicated mem0 elif block in doctor.py yet — that comes in a later PR after Honcho also migrates. Operators see no behavior change here; the new method is just an alternative entry point with the same logic. Tests: - success returns (True, "") - no_api_key when MEM0_API_KEY empty - sdk_missing when `mem0` not importable - 4 auth-classification cases (401, 403, "invalid api key", "authentication failed") - ConnectionError classified as unreachable: - long error messages truncated under 220 chars - never raises: even when _load_config itself throws, health_check returns (False, "config_error: ...") Continues #42; does not close it.
PowerCreek
added a commit
that referenced
this pull request
May 23, 2026
…step 2b) (#45) Step 2b of the RFC #42 migration plan. Honcho already had a real backend probe inside doctor's elif block (get_honcho_client(hcfg) since the initial Honcho integration); move it INTO the provider's health_check() so doctor's generic branch can eventually use the unified ABC interface (step 3). Outcomes (RFC #42 reason prefixes): (True, "") — handshake succeeds (False, "no_config") — honcho.json absent (False, "disabled") — enabled=false in config (False, "no_credentials") — neither api_key nor base_url set (NousResearch#2645: either one suffices) (False, "sdk_missing") — honcho-ai not installed (False, "config_error: ...") — config parsing itself raised (False, "auth: ...") — 401/403/forbidden/unauthorized/ invalid-api-key/authentication (False, "unreachable: ...") — anything else (network, timeout, SDK exception) The "never raises" contract is enforced at every layer — even when HonchoClientConfig.from_global_config() itself throws, health_check returns (False, "config_error: <msg>") rather than propagating. Doctor's dedicated honcho elif block is NOT touched yet. That collapses in a follow-up PR (step 3 of #42) once both Honcho and Mem0 (#44) ship native health_check overrides. Operators see no behavior change here. Tests: - successful handshake returns (True, "") - no_config when honcho.json absent - disabled when enabled=false - no_credentials when both api_key and base_url empty - base_url alone counts as credentials (NousResearch#2645 port) - 4 auth-classification cases (401/403/invalid-api-key/auth-failed) - ConnectionError → unreachable: - never raises: config-parsing exception → (False, "config_error:") - long error messages truncated under 220 chars Continues #42; does not close it. Step 3 (doctor collapse) is next.
2 tasks
PowerCreek
added a commit
that referenced
this pull request
May 23, 2026
…ep 2c) (#46) Step 2c of the RFC #42 migration plan. ByteRover's is_available() only checked whether the brv CLI was on PATH — an operator with brv installed but not logged in saw a false-positive green check from doctor. health_check() now runs `brv status` to verify both: (True, "") — `brv status` exits 0. (False, "sdk_missing") — brv CLI not on PATH. (False, "auth: ...") — non-zero exit + stderr mentions auth / login / unauthorized / "not signed in" / 401 / 403. (False, "unreachable: timeout")— probe timed out. (False, "unreachable: ...") — anything else (config issue, malformed install, etc.). The probe has a 3-second timeout — `brv status` is local-cache backed, so anything longer indicates a stuck CLI. The "never raises" contract from the ABC default is preserved — even if _run_brv itself propagates (it shouldn't; it has its own try/except), the override catches the exception and returns (False, "unreachable: <msg>"). Tests: - success returns (True, "") - sdk_missing when brv absent - 5 auth-classification cases (not-signed-in, Unauthorized, auth-required, 401, login required) - "timed out" → unreachable: timeout - generic non-zero exit → unreachable: <msg> - never-raises: even when _run_brv throws, returns tuple - long error messages truncated under 220 chars Continues #42; does not close it. Three providers migrated: mem0 (#44), honcho (#45), byterover (this PR). Six remain before the doctor-branch collapse in step 3.
3 tasks
PowerCreek
added a commit
that referenced
this pull request
May 23, 2026
…s 2h + 3) (#52) Closes #42. Final two steps of the RFC migration. Step 2h — holographic health_check: Holographic is local-only (SQLite); no remote service. But a read-only HERMES_HOME (RO mount, wrong perms) would still break it at runtime. Override verifies db_path's parent directory is writable. (True, "") — parent mkdir + W_OK pass. (False, "unreachable: <msg>") — mkdir failed or W_OK denied. (False, "config_error: <msg>") — get_hermes_home raised. Step 3 — doctor branch collapse: Every shipped provider now implements health_check() with the RFC #42 reason-prefix taxonomy. The provider-specific Honcho + Mem0 elif blocks (~120 lines combined) are gone. The unified dispatch is one ~80-line block in doctor.py keyed on the prefix: healthy → check_ok("<name> reachable") no_api_key / no_credentials → _fail_and_issue (setup hint) no_url / no_endpoint → _fail_and_issue (URL hint) no_config → check_warn (run setup) disabled → check_info sdk_missing → _fail_and_issue (plugin docs) auth: → _fail_and_issue (rotate key) not_found: → _fail_and_issue (verify base URL) http: → check_warn (unexpected status) unreachable: → check_warn (transient hint) config_error: → check_warn (config raised) health_check_raised: → check_warn (provider bug; RFC #42 says health_check MUST NOT raise, so this is a contract violation worth flagging) other → check_warn (unknown verbatim) Doctor's Memory Provider section is now ~80 lines instead of ~200, and adding the 9th provider requires zero doctor changes. Migration table (8 providers total): mem0 — PR #44 GET /v1/memory profile honcho — PR #45 get_honcho_client handshake byterover — PR #46 brv status (CLI + login) supermemory — PR #48 client.profile probe openviking — PR #49 /health endpoint probe retaindb — PR #50 /v1/memory/profile GET hindsight — PR #51 mode-dependent (local import / cloud /version) holographic — this PR db_path parent writability Tests: - 6 new tests for holographic health_check - Updated 3 doctor tests to assert the new unified dispatch (auth → _fail_and_issue; unreachable → check_warn; healthy → "<name> reachable") instead of the now-removed elif-block output strings. - 228 health_check + doctor tests pass total. The #42 RFC is now fully implemented across all shipped memory providers. Closing the issue with this PR.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Step 2a of the RFC migration plan in #42. Moves Mem0's probe from doctor's
elifblock (PR #35) INTO the provider's ownhealth_check()override, so doctor's generic branch can eventually use the unified ABC interface (step 3).Behavior
Functionally identical to PR #35's doctor block, but returns the RFC-conventioned tuple:
get_all(user_id, limit=1)succeeds(True, "")MEM0_API_KEYnot set(False, "no_api_key")mem0ainot installed(False, "sdk_missing")(False, "auth: <msg>")(False, "unreachable: <msg>")_load_config()itself raises(False, "config_error: <msg>")The ABC "never raises" contract is preserved end-to-end — even when
_load_configthrows, the method returns a tuple.What this PR does NOT do
Doesn't collapse the dedicated
mem0elif block inhermes_cli/doctor.pyyet. That comes in a follow-up PR (step 3 of #42) once Honcho also migrates. Operators see no doctor behavior change; the new method is just an alternative entry point with the same logic.Test plan
tests/plugins/memory/test_mem0_health_check.py:(True, "")no_api_keywhenMEM0_API_KEYemptysdk_missingwhenmem0module not importableConnectionError→unreachable:_load_configthrows,health_checkreturns(False, "config_error: ...")pytest tests/plugins/memory/test_mem0_health_check.py tests/plugins/memory/test_mem0_v2.py tests/agent/test_memory_provider.py tests/hermes_cli/test_doctor_mem0_probe.py→ 98 passed (91 existing + 7 new).Continues #42; step 2b will migrate Honcho.
Filed by hermes-maintainer (PowerCreek).