hermes doctor: Mem0 check stops at API-key presence — no reachability probe (vs Honcho which connects) #34

@PowerCreek

Context

Auditing memory-provider doctor checks for diagnostic-loudness gaps. Found a parallel-but-not-identical issue to #32: Honcho does a real connection probe (get_honcho_client(hcfg) raises on failure), Mem0 only checks cfg["api_key"] is non-empty (hermes_cli/doctor.py:2243):

mem0_key = mem0_cfg.get("api_key", "")
if mem0_key:
    check_ok("Mem0 API key configured")
    check_info(f"user_id={...} agent_id={...}")

So a garbage key (MEM0_API_KEY=garbage), an expired key, or a valid key with the network down all produce the same green check. The operator's first inference request then fails in production (with a 401 in the bad-key cases), and nothing connects that failure to doctor's clean run.
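To make the gap concrete, here is a minimal sketch of a presence-only check run against three different failure modes. The function name `presence_only_check`, the "skipped" result for a missing key, and the placeholder key values are all assumptions for illustration; only the `mem0_cfg.get("api_key", "")` test comes from the snippet above.

```python
def presence_only_check(mem0_cfg: dict) -> str:
    """Mimic the current doctor logic: green as soon as api_key is non-empty."""
    mem0_key = mem0_cfg.get("api_key", "")
    # "skipped" is a stand-in; what doctor does for a missing key is not shown in the issue.
    return "ok" if mem0_key else "skipped"


# Hypothetical key values standing in for three distinct failure modes.
scenarios = {
    "garbage key": "garbage",
    "expired key": "expired-key-placeholder",
    "valid key, network down": "valid-key-placeholder",
}

results = {
    name: presence_only_check({"api_key": key})
    for name, key in scenarios.items()
}

for name, result in results.items():
    print(f"{name}: {result}")
```

Every scenario lands on the same result because the check never leaves the process, so it has no way to tell the three cases apart.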

The Mem0 plugin already imports from mem0 import MemoryClient in _get_client() (plugins/memory/mem0/__init__.py:174). A MemoryClient(api_key=key).get_all(user_id=..., limit=1) round-trip validates auth + reachability.

Fix

Mirror the Honcho block: after confirming the key is present, attempt a one-shot MemoryClient(api_key=key).get_all(user_id=cfg["user_id"], limit=1). Three outcomes:

  • Success → check_ok("Mem0 connected", "user_id=… agent_id=…")
  • Auth failure (401/403 surface as an exception from mem0ai) → _fail_and_issue("Mem0 auth rejected", str(exc), "Mem0 API key rejected — verify MEM0_API_KEY at https://app.mem0.ai", issues)
  • Other exception (network, timeout, mem0ai SDK change) → check_warn("Mem0 probe failed", str(exc))

Wrap in try/except so a Mem0 SDK shape change can't crash doctor mid-run — surfaces as a warn row.
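A rough sketch of the three-outcome probe, not the final patch. To keep it runnable without hermes or mem0ai installed, it returns a (level, title, detail) row instead of calling check_ok / _fail_and_issue / check_warn, and it takes a client factory instead of importing MemoryClient. The names `probe_mem0`, `_looks_like_auth_failure`, and `client_factory` are hypothetical, and the way 401/403 is detected (a `status_code` attribute, or the digits in the message) is an assumption about how mem0ai surfaces auth errors that should be checked against the SDK.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Tuple[str, str, str]  # (level, title, detail); stand-in for doctor's check_* rows


def _looks_like_auth_failure(exc: Exception) -> bool:
    """Crude heuristic (assumption): treat HTTP 401/403 as an auth rejection."""
    status = getattr(exc, "status_code", None)
    if status is None:
        # Some HTTP libraries attach the response object to the exception.
        status = getattr(getattr(exc, "response", None), "status_code", None)
    if status in (401, 403):
        return True
    text = str(exc)
    return "401" in text or "403" in text


def probe_mem0(
    key: str,
    cfg: Dict[str, Any],
    client_factory: Callable[[str], Any],
    issues: List[str],
) -> Row:
    """One-shot probe, called only after the key is known to be non-empty."""
    try:
        # Construction sits inside the try as well, in case the SDK raises there.
        client = client_factory(key)
        client.get_all(user_id=cfg.get("user_id"), limit=1)
    except Exception as exc:  # broad on purpose: an SDK shape change must not crash doctor
        if _looks_like_auth_failure(exc):
            issues.append(
                "Mem0 API key rejected: verify MEM0_API_KEY at https://app.mem0.ai"
            )
            return ("fail", "Mem0 auth rejected", str(exc))
        return ("warn", "Mem0 probe failed", str(exc))
    detail = f"user_id={cfg.get('user_id')} agent_id={cfg.get('agent_id')}"
    return ("ok", "Mem0 connected", detail)
```

In doctor itself the factory would presumably be `lambda k: MemoryClient(api_key=k)`, with each returned row mapped onto the matching check helper. Passing the factory in also makes the three branches easy to exercise with fake clients in a unit test.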

Out of scope

  • Probing other memory providers' generic is_available() more loudly. The MemoryProvider ABC method already returns False on missing config, which doctor surfaces with check_warn(f"{name} configured but not available"). Adding more depth there would require per-provider knowledge and is a separate scope decision.
  • Mem0-side circuit breaker integration. The plugin tracks _consecutive_failures for runtime backoff; doctor's check is a one-shot probe, not a long-running observer.

Filed by hermes-maintainer (PowerCreek). PR incoming.
