
Honcho plugin: race condition in session cache causes silent conclusion/memory loss #5102

@darkworon

Bug Description

create_conclusion() in the Honcho plugin intermittently fails with no error propagated to the user. The root cause is a race condition between the sync_turn background thread and the main thread accessing HonchoSessionManager._cache (a plain dict) without any synchronization.

Steps to Reproduce

  1. Enable Honcho memory plugin with writeFrequency: "async"
  2. Make rapid consecutive honcho_conclude tool calls (e.g., 10+ in a row)
  3. Some calls return "Failed to save conclusion" at random positions
  4. Retrying the exact same text succeeds (timing-dependent)
  5. Direct Honcho SDK calls work fine for the same content

Root Cause

_cache in session.py:94 is an unprotected dict accessed from multiple threads:

  • Main thread (tool calls): handle_tool_call → create_conclusion → self._cache.get(session_key) (session.py:916)
  • Background thread (sync_turn): spawned at __init__.py:569, calls self._manager.get_or_create(self._session_key) (line 559) which reads/writes _cache
  • Background thread (on_memory_write): spawned at __init__.py:589, calls create_conclusion which reads _cache

get_or_create() is not atomic — it checks membership, creates objects, then assigns to _cache (session.py:234-284). If the sync thread is mid-operation, the main thread's _cache.get() can miss the entry.

While CPython's GIL makes individual dict operations atomic, the multi-step read-modify-write in get_or_create() is not protected, leading to intermittent cache misses.
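The interleaving described above can be forced deterministically in a small standalone sketch. UnsafeSessionManager is a hypothetical stand-in, not the plugin's actual code: only the names _cache, get_or_create and create_conclusion come from this report, and the two Event objects exist purely to hold the window between "check" and "assign" open.

```python
import threading


class UnsafeSessionManager:
    """Hypothetical stand-in for HonchoSessionManager (not the real code)."""

    def __init__(self):
        self._cache = {}
        # Test hooks (assumptions) used only to force the bad interleaving.
        self.creating = threading.Event()
        self.may_finish = threading.Event()

    def get_or_create(self, key):
        if key in self._cache:              # step 1: check membership
            return self._cache[key]
        session = {"key": key}              # step 2: create (slow in practice)
        self.creating.set()                 # we are now between check and assign
        self.may_finish.wait(timeout=5)     # keep the window open
        self._cache[key] = session          # step 3: assign
        return session

    def create_conclusion(self, key, text):
        session = self._cache.get(key)      # bare lookup, as in the reported path
        if session is None:
            return "Failed to save conclusion"
        return f"saved: {text}"


manager = UnsafeSessionManager()
worker = threading.Thread(target=manager.get_or_create, args=("s1",))
worker.start()

manager.creating.wait(timeout=5)            # background thread is mid-operation
during = manager.create_conclusion("s1", "user prefers dark mode")

manager.may_finish.set()                    # let the background thread finish
worker.join()
after = manager.create_conclusion("s1", "user prefers dark mode")

print(during)   # lookup raced with get_or_create
print(after)    # same call, retried after the assignment
```

The first call misses the entry and the identical retry succeeds, which matches the timing-dependent behavior in the reproduction steps.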

Impact

High — silent knowledge loss. Conclusions (user facts, preferences, corrections) are dropped without any indication to the user. This also affects:

  • on_memory_write() which mirrors built-in memory writes as conclusions (same race)
  • sync_turn() which records conversation turns
  • Any operation depending on _cache.get(session_key) succeeding

This undermines the core value proposition of the Honcho plugin as a long-term memory provider.

Proposed Fix

Add a threading.Lock to protect _cache access in HonchoSessionManager:

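# session.py, at module top
import threading

# session.py, in HonchoSessionManager.__init__
self._cache_lock = threading.Lock()

# Wrap every _cache read/write, e.g. the lookup in get_or_create():
with self._cache_lock:
    if key in self._cache:
        return self._cache[key]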

Alternatively, join the sync_thread before processing tool calls in handle_tool_call().
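A slightly fuller sketch of the lock-based option, under stated assumptions: LockedSessionManager, the factory argument and the get() helper are hypothetical names for illustration, and only _cache and _cache_lock come from the proposal. The point is that the whole check-create-assign sequence runs under one lock, so concurrent callers either create the session or see the finished entry.

```python
import threading


class LockedSessionManager:
    """Hypothetical sketch of a lock-protected cache (not the plugin's code)."""

    def __init__(self, factory):
        self._cache = {}
        self._cache_lock = threading.Lock()
        self._factory = factory             # assumed callable that builds a session

    def get_or_create(self, key):
        # Check, create and assign happen atomically with respect to other
        # threads that also take _cache_lock.
        with self._cache_lock:
            session = self._cache.get(key)
            if session is None:
                session = self._factory(key)
                self._cache[key] = session
            return session

    def get(self, key):
        # Readers take the same lock, so they wait for an in-flight creation.
        with self._cache_lock:
            return self._cache.get(key)


calls = []
calls_lock = threading.Lock()


def make_session(key):
    with calls_lock:
        calls.append(key)                   # record how often the factory runs
    return {"key": key}


manager = LockedSessionManager(make_session)
results = []
results_lock = threading.Lock()


def worker():
    session = manager.get_or_create("s1")
    with results_lock:
        results.append(session)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(calls), len(results))
```

Two caveats on this design. Holding the lock while the session is created serializes callers for the duration of that work, and if any code running under the lock calls back into get_or_create(), a threading.RLock would be needed to avoid self-deadlock. Also, a lock only makes a reader wait for a creation that is already in flight; a bare lookup that runs before get_or_create() has started would still miss, so create_conclusion would probably need to go through get_or_create() rather than _cache.get().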

Environment

  • Hermes Agent v0.7.0
  • honcho-ai SDK (latest)
  • Gateway mode (Telegram platform)
  • writeFrequency: "async" in honcho.json

Metadata

Labels

  • P3: Low (cosmetic, nice to have)
  • comp/plugins: Plugin system and bundled plugins
  • tool/memory: Memory tool and memory providers
  • type/bug: Something isn't working
