Skip to content

Bug: session_search drops child/continuation-session hits by summarizing only the lineage root #6507

@markusleben

Description

@markusleben

Summary

This issue report was prepared from a Hermes Agent investigation and local reproduction against a real state.db lineage in a Telegram session chain.

session_search can find a match in a child/continuation session, but then resolve that hit to the lineage root and summarize only the root session’s direct messages.

This causes recall to miss the actual matched content whenever the query exists only in a descendant session and not in the root session itself.

What is confirmed

  1. search_messages() returns the actual message-owning session_id

    • The raw search hit correctly points to the child session that contains the matched text.
  2. session_search then resolves that child session_id to the root parent session

    • This happens during lineage dedup/grouping.
  3. The summarization/preparation step loads only the resolved root session transcript

    • It does not load the child transcript that actually contained the hit.
    • It also does not merge the full lineage before summarization.
  4. If the query is absent from the root transcript, the summary/preview is generated from the wrong conversation slice

    • In practice this can show unrelated early root-session content instead of the matched child-session content.

Root cause

The bug appears to be in the handoff between “find matching session/message” and “prepare transcript for summarization.”

Current flow:

  • search_messages() returns a hit with the real child session_id
  • _resolve_to_parent() walks that session up to the lineage root
  • session_search replaces the hit session_id with the resolved root session_id
  • db.get_messages_as_conversation(session_id) is then called on the resolved root only

That conflates two different concepts:

  • grouping/deduplicating a lineage for display
  • loading the transcript that actually contains the hit

Those are not equivalent once sessions are split across continuations/compression chains/delegation children.

Code references

tools/session_search_tool.py

  • lines 296–321: _resolve_to_parent()
  • lines 331–343: child hit is rewritten to resolved parent/root session_id
  • lines 349–357: summarization input is loaded via db.get_messages_as_conversation(session_id), where session_id is already the resolved root

hermes_state.py

  • lines 957–963: get_messages_as_conversation() loads messages for exactly one session_id only
    • It does not expand to descendants or merge a lineage.

Additional contributing behavior

tools/session_search_tool.py

  • lines 89–122: _truncate_around_matches()

If the query is not present in the selected transcript, truncation falls back to the beginning of that transcript. So when the wrong root transcript is loaded, the returned preview/summary can start at the beginning of the root session instead of near the actual child hit.

Expected behavior

If a search hit is found in a child/continuation session, session_search should summarize a transcript that actually includes the matched content.

That could be done in one of two valid ways:

Option A:

  • summarize the hit-bearing child session transcript

Option B:

  • summarize a lineage-aware merged transcript (root + descendants in chronological order, or at least the relevant branch containing the hit)

But the current behavior of:

  • resolving to root for grouping
  • then loading only the root transcript
    is incorrect.

Actual behavior

session_search can return a result for the lineage root even when the query exists only in a descendant child session, and the resulting summary/preview does not include the matched child-specific content.

Impact

  • recall can be factually misleading
  • users can be shown irrelevant earlier root-session content
  • compression/continuation chains become partially invisible to session_search even though the underlying FTS hit was found correctly
  • the fallback preview path is also affected, so this is not only an auxiliary-summarizer issue

Minimal reproduction shape

This can be reproduced with a mocked DB setup where:

  • search_messages() returns one hit from child_sid for query Q
  • get_session(child_sid) resolves upward to root_sid
  • get_messages_as_conversation(root_sid) returns transcript text that does not contain Q
  • get_messages_as_conversation(child_sid) would contain Q

Current behavior:

  • session_search returns/summarizes root_sid
  • the text used for summary/preview does not contain Q

Suggested fix directions (not yet validated)

These are candidate fix directions only. They are not claimed as confirmed fixes.

  1. Preserve both IDs internally
  • hit_session_id = raw match owner
  • display_session_id = resolved lineage root
  1. Load the correct transcript for summarization
    Either:
  • load the hit-bearing child transcript
    or
  • load a lineage-aware merged transcript containing the hit-bearing session
  1. Keep metadata internally consistent
    Right now the result can mix:
  • resolved root session_id
    with
  • metadata originating from the child hit row

Regression tests needed

Please add a positive lineage retrieval test covering:

  • search_messages() returns a hit in child_sid
  • child_sid resolves to root_sid
  • root transcript does not contain the query
  • child transcript does contain the query
  • session_search must return a summary/preview that includes the child hit text

Also worth adding:

  • a metadata consistency test for root-vs-child result fields
  • a lineage-merged transcript test if that is the chosen fix

Related but not duplicate

Verification notes

This was confirmed locally against:

  • a real multi-session Telegram lineage in state.db
  • the current implementation in tools/session_search_tool.py
  • the current single-session transcript loader in hermes_state.py
  • existing tests in tests/tools/test_session_search.py, which currently do not cover this positive lineage-retrieval case

The suggested fix directions above are hypotheses based on the confirmed reproduction and code path. Their effectiveness has not yet been validated.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt buildertool/memoryMemory tool and memory providerstype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions