Problem
The OpenVikingMemoryProvider performs a one-time health check during initialize(). If the OpenViking server is temporarily down (e.g. stale PID lock, container restart, port conflict), the provider sets self._client = None and never attempts to reconnect. All subsequent viking_search, viking_browse, viking_remember, etc. calls return:
{"error": "OpenViking server not connected"}
Even after the server comes back online and /health returns 200, the running Hermes session keeps failing until the user starts a brand new conversation.
Reproduction Steps
- Ensure OpenViking server is stopped or unreachable.
- Start a Hermes conversation (CLI or gateway).
- Fix the OpenViking server (e.g. docker compose up -d).
- Verify curl http://localhost:1933/health returns 200.
- In the same Hermes session, invoke any Viking tool (e.g. viking_browse or viking_search).
- Expected: Tool works. Actual: "OpenViking server not connected".
Root Cause
In plugins/memory/openviking/__init__.py:
def initialize(self, session_id: str, **kwargs) -> None:
    ...
    self._client = _VikingClient(self._endpoint, self._api_key)
    if not self._client.health():
        logger.warning("OpenViking server at %s is not reachable", self._endpoint)
        self._client = None  # <-- permanent disable
handle_tool_call then short-circuits on if not self._client: with no retry path:
def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
    if not self._client:
        return json.dumps({"error": "OpenViking server not connected"})
Suggested Fixes
Option A: Lazy reconnect on first tool use (minimal)
Inside handle_tool_call, when self._client is None, re-create the client and retry its health() check before giving up. When a request raises a connection error, reset self._client so the next call reconnects.
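A minimal sketch of Option A, not the actual provider code. LazyReconnectProvider, client_factory, _ensure_client, and the client's call() method are hypothetical names introduced for illustration; client_factory stands in for _VikingClient, and the real dispatch logic in handle_tool_call is assumed rather than known.

```python
import json
import logging

logger = logging.getLogger(__name__)

NOT_CONNECTED = {"error": "OpenViking server not connected"}


class LazyReconnectProvider:
    """Hypothetical stand-in for OpenVikingMemoryProvider."""

    def __init__(self, endpoint, api_key, client_factory):
        self._endpoint = endpoint
        self._api_key = api_key
        # Assumption: client_factory(endpoint, api_key) plays the role of _VikingClient.
        self._client_factory = client_factory
        self._client = None

    def _ensure_client(self) -> bool:
        """Return True if a healthy client is available, creating one if needed."""
        if self._client is not None:
            return True
        candidate = self._client_factory(self._endpoint, self._api_key)
        try:
            healthy = bool(candidate.health())
        except Exception:
            healthy = False
        if healthy:
            logger.info("OpenViking server at %s is reachable again", self._endpoint)
            self._client = candidate
        return healthy

    def handle_tool_call(self, tool_name: str, args: dict, **kwargs) -> str:
        if not self._ensure_client():
            return json.dumps(NOT_CONNECTED)
        try:
            # Assumption: call() is a placeholder for the real per-tool dispatch.
            return json.dumps(self._client.call(tool_name, args))
        except ConnectionError:
            # Drop the client so the next tool call probes the server again.
            self._client = None
            return json.dumps(NOT_CONNECTED)
```

Each tool call made while disconnected costs one extra health probe. If that proves too chatty, the probe could be rate-limited with a timestamp check.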
Option B: Background health-watch thread
Periodically ping /health and re-create _client when the server recovers.
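One possible shape for Option B, sketched with the standard threading module. HealthWatcher, client_factory, on_recovered, and the interval value are all hypothetical; the provider would pass a callback that assigns the recovered client to self._client.

```python
import threading


class HealthWatcher:
    """Hypothetical helper: polls health() until the server recovers."""

    def __init__(self, client_factory, on_recovered, interval: float = 5.0):
        # Assumption: client_factory() returns an object with a health() method,
        # as _VikingClient does in the excerpt above.
        self._client_factory = client_factory
        self._on_recovered = on_recovered
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            client = self._client_factory()
            try:
                healthy = bool(client.health())
            except Exception:
                healthy = False
            if healthy:
                # Hand the working client back to the provider and exit.
                self._on_recovered(client)
                return
            # Sleep for the interval, waking early if stop() is called.
            self._stop.wait(self._interval)
```

This version stops after the first recovery. Watching for later outages as well would mean keeping the loop running and clearing the provider's client when health() fails again, ideally under a lock shared with handle_tool_call.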
Option C: Expose a /reconnect slash command or memory-manager API
Allow users to force re-initialization of memory providers without dropping the conversation.
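A rough sketch of what a memory-manager reconnect API might look like. ReconnectableManager, register, and reconnect are hypothetical names, not the existing Hermes API. The status check relies on the behavior shown in the Root Cause excerpt, where initialize() leaves _client as None when the server is unreachable.

```python
from typing import Dict


class ReconnectableManager:
    """Hypothetical memory-manager wrapper; not the actual Hermes API."""

    def __init__(self, session_id: str) -> None:
        self._session_id = session_id
        self._providers: Dict[str, object] = {}

    def register(self, name: str, provider) -> None:
        self._providers[name] = provider

    def reconnect(self) -> Dict[str, str]:
        """Re-run initialize() on every provider and report its status."""
        results = {}
        for name, provider in self._providers.items():
            try:
                provider.initialize(self._session_id)
            except Exception as exc:
                results[name] = f"failed: {exc}"
                continue
            # Assumption: a None _client means the health check failed,
            # as in the initialize() excerpt above.
            connected = getattr(provider, "_client", None) is not None
            results[name] = "connected" if connected else "not connected"
        return results
```

A /reconnect slash command could then call reconnect() and print the returned mapping, so the user sees which providers recovered.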
Environment
- Hermes version: latest (hermes-agent repo, plugins/memory/openviking/__init__.py)
- OpenViking version: v0.3.3
- Platform: CLI (also affects gateway/long-lived sessions)