Summary
This issue report was prepared from a Hermes Agent investigation and local reproduction against a real state.db lineage in a Telegram session chain.
session_search can find a match in a child/continuation session, but then resolve that hit to the lineage root and summarize only the root session’s direct messages.
This causes recall to miss the actual matched content whenever the query exists only in a descendant session and not in the root session itself.
What is confirmed
-
search_messages() returns the actual message-owning session_id
- The raw search hit correctly points to the child session that contains the matched text.
-
session_search then resolves that child session_id to the root parent session
- This happens during lineage dedup/grouping.
-
The summarization/preparation step loads only the resolved root session transcript
- It does not load the child transcript that actually contained the hit.
- It also does not merge the full lineage before summarization.
-
If the query is absent from the root transcript, the summary/preview is generated from the wrong conversation slice
- In practice this can show unrelated early root-session content instead of the matched child-session content.
Root cause
The bug appears to be in the handoff between “find matching session/message” and “prepare transcript for summarization.”
Current flow:
search_messages() returns a hit with the real child session_id
_resolve_to_parent() walks that session up to the lineage root
session_search replaces the hit session_id with the resolved root session_id
db.get_messages_as_conversation(session_id) is then called on the resolved root only
That conflates two different concepts:
- grouping/deduplicating a lineage for display
- loading the transcript that actually contains the hit
Those are not equivalent once sessions are split across continuations/compression chains/delegation children.
Code references
tools/session_search_tool.py
- lines 296–321:
_resolve_to_parent()
- lines 331–343: child hit is rewritten to resolved parent/root
session_id
- lines 349–357: summarization input is loaded via
db.get_messages_as_conversation(session_id), where session_id is already the resolved root
hermes_state.py
- lines 957–963:
get_messages_as_conversation() loads messages for exactly one session_id only
- It does not expand to descendants or merge a lineage.
Additional contributing behavior
tools/session_search_tool.py
- lines 89–122:
_truncate_around_matches()
If the query is not present in the selected transcript, truncation falls back to the beginning of that transcript. So when the wrong root transcript is loaded, the returned preview/summary can start at the beginning of the root session instead of near the actual child hit.
Expected behavior
If a search hit is found in a child/continuation session, session_search should summarize a transcript that actually includes the matched content.
That could be done in one of two valid ways:
Option A:
- summarize the hit-bearing child session transcript
Option B:
- summarize a lineage-aware merged transcript (root + descendants in chronological order, or at least the relevant branch containing the hit)
But the current behavior of:
- resolving to root for grouping
- then loading only the root transcript
is incorrect.
Actual behavior
session_search can return a result for the lineage root even when the query exists only in a descendant child session, and the resulting summary/preview does not include the matched child-specific content.
Impact
- recall can be factually misleading
- users can be shown irrelevant earlier root-session content
- compression/continuation chains become partially invisible to
session_search even though the underlying FTS hit was found correctly
- the fallback preview path is also affected, so this is not only an auxiliary-summarizer issue
Minimal reproduction shape
This can be reproduced with a mocked DB setup where:
search_messages() returns one hit from child_sid for query Q
get_session(child_sid) resolves upward to root_sid
get_messages_as_conversation(root_sid) returns transcript text that does not contain Q
get_messages_as_conversation(child_sid) would contain Q
Current behavior:
session_search returns/summarizes root_sid
- the text used for summary/preview does not contain
Q
Suggested fix directions (not yet validated)
These are candidate fix directions only. They are not claimed as confirmed fixes.
- Preserve both IDs internally
hit_session_id = raw match owner
display_session_id = resolved lineage root
- Load the correct transcript for summarization
Either:
- load the hit-bearing child transcript
or
- load a lineage-aware merged transcript containing the hit-bearing session
- Keep metadata internally consistent
Right now the result can mix:
- resolved root
session_id
with
- metadata originating from the child hit row
Regression tests needed
Please add a positive lineage retrieval test covering:
search_messages() returns a hit in child_sid
child_sid resolves to root_sid
- root transcript does not contain the query
- child transcript does contain the query
session_search must return a summary/preview that includes the child hit text
Also worth adding:
- a metadata consistency test for root-vs-child result fields
- a lineage-merged transcript test if that is the chosen fix
Related but not duplicate
Verification notes
This was confirmed locally against:
- a real multi-session Telegram lineage in
state.db
- the current implementation in
tools/session_search_tool.py
- the current single-session transcript loader in
hermes_state.py
- existing tests in
tests/tools/test_session_search.py, which currently do not cover this positive lineage-retrieval case
The suggested fix directions above are hypotheses based on the confirmed reproduction and code path. Their effectiveness has not yet been validated.
Summary
This issue report was prepared from a Hermes Agent investigation and local reproduction against a real
state.dblineage in a Telegram session chain.session_searchcan find a match in a child/continuation session, but then resolve that hit to the lineage root and summarize only the root session’s direct messages.This causes recall to miss the actual matched content whenever the query exists only in a descendant session and not in the root session itself.
What is confirmed
search_messages()returns the actual message-owningsession_idsession_searchthen resolves that childsession_idto the root parent sessionThe summarization/preparation step loads only the resolved root session transcript
If the query is absent from the root transcript, the summary/preview is generated from the wrong conversation slice
Root cause
The bug appears to be in the handoff between “find matching session/message” and “prepare transcript for summarization.”
Current flow:
search_messages()returns a hit with the real childsession_id_resolve_to_parent()walks that session up to the lineage rootsession_searchreplaces the hitsession_idwith the resolved rootsession_iddb.get_messages_as_conversation(session_id)is then called on the resolved root onlyThat conflates two different concepts:
Those are not equivalent once sessions are split across continuations/compression chains/delegation children.
Code references
tools/session_search_tool.py_resolve_to_parent()session_iddb.get_messages_as_conversation(session_id), wheresession_idis already the resolved roothermes_state.pyget_messages_as_conversation()loads messages for exactly onesession_idonlyAdditional contributing behavior
tools/session_search_tool.py_truncate_around_matches()If the query is not present in the selected transcript, truncation falls back to the beginning of that transcript. So when the wrong root transcript is loaded, the returned preview/summary can start at the beginning of the root session instead of near the actual child hit.
Expected behavior
If a search hit is found in a child/continuation session,
session_searchshould summarize a transcript that actually includes the matched content.That could be done in one of two valid ways:
Option A:
Option B:
But the current behavior of:
is incorrect.
Actual behavior
session_searchcan return a result for the lineage root even when the query exists only in a descendant child session, and the resulting summary/preview does not include the matched child-specific content.Impact
session_searcheven though the underlying FTS hit was found correctlyMinimal reproduction shape
This can be reproduced with a mocked DB setup where:
search_messages()returns one hit fromchild_sidfor queryQget_session(child_sid)resolves upward toroot_sidget_messages_as_conversation(root_sid)returns transcript text that does not containQget_messages_as_conversation(child_sid)would containQCurrent behavior:
session_searchreturns/summarizesroot_sidQSuggested fix directions (not yet validated)
These are candidate fix directions only. They are not claimed as confirmed fixes.
hit_session_id= raw match ownerdisplay_session_id= resolved lineage rootEither:
or
Right now the result can mix:
session_idwith
Regression tests needed
Please add a positive lineage retrieval test covering:
search_messages()returns a hit inchild_sidchild_sidresolves toroot_sidsession_searchmust return a summary/preview that includes the child hit textAlso worth adding:
Related but not duplicate
session_searchsummarization path, but about empty results on summarization failureVerification notes
This was confirmed locally against:
state.dbtools/session_search_tool.pyhermes_state.pytests/tools/test_session_search.py, which currently do not cover this positive lineage-retrieval caseThe suggested fix directions above are hypotheses based on the confirmed reproduction and code path. Their effectiveness has not yet been validated.