fix: improve session_search lineage recall #6536
- merge matched child/root hits within the same session lineage
- anchor summaries with matched snippets and better fallback previews
- add regression tests for lineage recall, metadata, and truncation
One small follow-up question / nice-to-have while I was validating this locally: I noticed
But the Telegram adapter itself appears to support MarkdownV2/fenced code blocks pretty explicitly ( That made me wonder whether the current Telegram “no markdown” hint is:
I didn’t include that change in this PR on purpose; this PR stays focused on But if the current Telegram hint is not intentional for security/robustness reasons, I think there may be a nice small follow-up improvement there: let Telegram shell commands / logs render as proper fenced code blocks instead of plain text blobs. Happy to open a tiny separate follow-up PR for that if useful, or ignore it if the current wording is deliberate.
Summary
Improve session_search so it can retain matched child/continuation content within a resolved root lineage
Problem
session_search could find a hit in a child/continuation session, resolve it to the lineage root, and then summarize only the root transcript. In practice that meant later continuation content could disappear from recall, especially for broader queries spanning both root and child sessions.

This PR is an attempted fix for #6507 based on a local reproduction and iterative validation.
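To make the failure mode concrete, here is a minimal sketch of grouping hits by lineage root while keeping every matched session id, rather than only the first hit. It is an illustration under stated assumptions, not the code in this PR: the `PARENTS` mapping, `resolve_root()`, and `group_hits_by_root()` are hypothetical names, and the real session store may represent lineage differently.

```python
from collections import defaultdict

# Hypothetical lineage data: session id -> parent session id (None for a root).
PARENTS = {
    "root-a": None,
    "child-a1": "root-a",
    "child-a2": "child-a1",
    "root-b": None,
}


def resolve_root(session_id, parents):
    """Follow parent links until a session without a parent is reached."""
    seen = set()
    # The `seen` set guards against accidental cycles in the parent links.
    while parents.get(session_id) is not None and session_id not in seen:
        seen.add(session_id)
        session_id = parents[session_id]
    return session_id


def group_hits_by_root(hit_session_ids, parents):
    """Group matched session ids under their root, keeping hit order, no duplicates."""
    grouped = defaultdict(list)
    for sid in hit_session_ids:
        root = resolve_root(sid, parents)
        if sid not in grouped[root]:
            grouped[root].append(sid)
    return dict(grouped)


# Hits as a search might return them: a child hit first, then roots, then a repeat.
hits = ["child-a2", "root-b", "root-a", "child-a2"]
grouped = group_hits_by_root(hits, PARENTS)
print(grouped)
```

With the child ids retained per root, a summarizer can pull in the matched continuation transcript alongside the root instead of dropping it.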
What changed
- hit_session_ids per resolved root instead of only the first hit
- _truncate_around_matches() so quoted phrases are preferred over common-token anchoring
Validation
Local validation performed:
- python -m pytest tests/tools/test_session_search.py tests/tools/test_llm_content_none_guard.py -q
- state.db repro for yesterday's Telegram session chain
Relevant regression coverage added for:
Notes
Closes #6507