refactor(gateway): stop writing JSONL transcripts by yoniebans · Pull Request #29211 · NousResearch/hermes-agent

yoniebans · 2026-05-20T09:08:40Z

What changed

Stop dual-writing gateway transcripts to JSONL files alongside state.db. SQLite is now the sole store for gateway session messages.

Drop the "use whichever source is longer" JSONL fallback in load_transcript()
Remove JSONL writes from append_to_transcript(), rewrite_transcript(), and mirror._append_to_jsonl()
Delete get_transcript_path() — no remaining callers
Migrate yuanbao recall from direct JSONL file reads to load_transcript()
Add regression tests pinning DB-only transcript loading and yuanbao content-match recall
Update sessions.md docs to reflect state.db as canonical

12 files changed, +134 / -369.

Why it changed

The gateway dual-wrote every message to state.db AND to {session_id}.jsonl transcript files. The JSONL path was labelled "legacy, kept during transition" but the transition finished releases ago. The load_transcript() fallback ("use JSONL if it has more messages than the DB") was provably dead on real user databases — verified on a corpus of 27 JSONL files / 364 DB sessions where zero files would have triggered the fallback.

The dual-write added code complexity, disk I/O, and a second persistence path that made session storage harder to reason about.

Yuanbao recall degradation (known, documented)

Yuanbao's message-recall feature previously read the JSONL directly to find messages by platform message_id, a field state.db doesn't preserve. With JSONL gone, recall falls through from exact-id match (branch A1) to content-match (branch A2) or system-note (branch B) — both of which work DB-only.

This is a graceful degradation, not a break. The content-match path catches the common case; the system-note path is the existing explicit fallback for "not found in transcript."

What's NOT changed

Existing *.jsonl files on disk are left untouched. No destructive cleanup. Safe to delete after verifying the corresponding session exists in state.db.
sessions.json — gateway routing index with live readers (mcp_serve.py, channel_directory.py, hermes_cli/status.py). Carries fields state.db doesn't have (origin, expiry_finalized, auto_reset). Separate effort.
request_dump_*.json — debug breadcrumb. Unchanged.
Per-session JSON snapshots (session_*.json) — addressed in companion PR refactor(session-log): stop writing per-session JSON snapshots #29182.
hermes_state.py delete_session() — still does opportunistic cleanup of legacy .json/.jsonl files. Left as-is since old files exist on disk.

Prior art

PR #10560 attempted a similar change with a migration step (importing legacy JSONL into SQLite). We deliberately do not migrate — state.db has been canonical for many releases and on observable user disks zero JSONL files have data the DB doesn't. Migration code adds bug surface for a phantom cohort.

How to test

Automated:

scripts/run_tests.sh

Manual smoke test (gateway):

hermes gateway stop                    # stop systemd unit
HERMES_HOME=$(mktemp -d) hermes gateway run &  # run against scratch DB
# send a message to the Telegram bot, wait for reply
kill %1
ls $HERMES_HOME/sessions/*.jsonl       # should be empty — no JSONL written
sqlite3 $HERMES_HOME/state.db "SELECT role, substr(content,1,40) FROM messages ORDER BY id DESC LIMIT 4"
hermes gateway start                   # restart daily gateway

Manual smoke test (yuanbao recall):
The recall codepath is exercised when a Yuanbao user deletes a message. With this PR, recall uses load_transcript() (DB-only) instead of reading the JSONL directly. Branch A2 (content-match) and B (system-note) remain functional.

Platform tested

Linux (Ubuntu, x86_64)

Validation

New tests: test_load_transcript_db_only.py, test_yuanbao_recall_db_only.py
Gateway test suite: 3,125+ passed
Broader suite (run_agent + cli + cron + gateway): 8,150+ passed
One xdist-flaky test (test_load_transcript_db_only) fails intermittently under parallel xdist but passes consistently in isolation — pre-existing xdist race, not a regression

state.db is canonical. The 'use whichever source is longer' branch was defensive code for the pre-DB migration; on every real DB it has not fired (verified on a session corpus with 27 jsonl files / 950 sessions — zero jsonl-bigger cases). Test changes: - TestLoadTranscriptCorruptLines: deleted (tested dead JSONL code path) - TestLoadTranscriptPreferLongerSource: deleted (tested removed fallback) - Replaced with TestLoadTranscriptDBOnly (DB-only reads) - TestSessionStoreRewriteTranscript: fixture now creates DB session - test_gateway_retry_replaces_last_user_turn: fixture uses real DB

Yuanbao's recall feature was reading the gateway JSONL directly to look up messages by platform message_id, which state.db does not preserve. Migrated to use load_transcript() which returns DB messages. Recall branch A1 (message_id match) now falls through to A2 (content match) or B (system note) for all sessions — a documented degradation. Follow-up issue: add platform_message_id column to state.db messages to restore exact-id matching.

…te_transcript state.db is canonical. JSONL transcripts were a transition fallback; the fallback was removed in the previous commit. Existing *.jsonl files on disk are left untouched.

Mirror messages are persisted via _append_to_sqlite. JSONL writer was a redundant dual-write. Updated test assertions from JSONL file checks to SQLite mock verification.

…db writes Fixtures that instantiate SessionStore() trigger SessionDB() with no args, which resolves to ~/.hermes/state.db via the DEFAULT_DB_PATH module constant (snapshot of get_hermes_home() at hermes_state import time). The autouse _hermetic_environment fixture in tests/conftest.py monkeypatches HERMES_HOME env, but DEFAULT_DB_PATH is already cached by then. Per-test monkeypatch.setattr(hermes_state, 'DEFAULT_DB_PATH', tmp_path/'state.db') forces the DB into tmp_path so the tests can't leak into the real profile. Verified by counting u1-prefixed sessions in real state.db before/after: delta=0.

github-actions · 2026-05-20T09:09:19Z

🔎 Lint report: `refactor/stop-writing-gateway-jsonl-transcripts` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8981 on HEAD, 8974 on base (🆕 +7)

🆕 New issues (6):

Rule	Count
`unresolved-attribute`	5
`unresolved-import`	1

First entries

tests/gateway/test_load_transcript_db_only.py:25: [unresolved-attribute] unresolved-attribute: Attribute `create_session` is not defined on `None` in union `None | SessionDB`
tests/gateway/test_session.py:565: [unresolved-attribute] unresolved-attribute: Attribute `create_session` is not defined on `None` in union `None | SessionDB`
tests/gateway/test_session.py:567: [unresolved-attribute] unresolved-attribute: Attribute `append_message` is not defined on `None` in union `None | SessionDB`
tests/gateway/test_load_transcript_db_only.py:4: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/gateway/platforms/test_yuanbao_recall_db_only.py:19: [unresolved-attribute] unresolved-attribute: Attribute `create_session` is not defined on `None` in union `None | SessionDB`
tests/gateway/test_retry_replacement.py:24: [unresolved-attribute] unresolved-attribute: Attribute `create_session` is not defined on `None` in union `None | SessionDB`

✅ Fixed issues: none

Unchanged: 4731 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

…fixture PR #29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.

@bot

… recall PR #29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)

teknium1 · 2026-05-20T20:01:10Z

Salvaged via PR #29449 (#29449). All eight of your commits were cherry-picked onto current main with authorship preserved (rebase-merge), so each commit still shows up under your name in git log.

The graceful-degradation note you flagged for yuanbao recall is now resolved: state.db got a platform_message_id column (schema v11 → v12, added via the declarative reconciler so no migration block needed), with a partial index for the recall point-lookup. Branch A1 (exact id match) is canonical again — A2 (content match) and B (system-note) remain as fallbacks for rows that lack a platform id (e.g. agent-processed @bot messages, pre-column legacy rows). Both A-branches log which one fired so it's visible in gateway.log.

Thanks for the cleanup — really clean PR, and the 5ea4cec6c follow-up that caught the DEFAULT_DB_PATH leak was a great find.

…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.

@bot

… recall PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)

…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.

@bot

… recall PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)

…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.

@bot

… recall PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)

…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness. #AI commit#

@bot

… recall PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through) #AI commit#

…fixture PR NousResearch#29211 review findings: 1. test_retry_replacement: pin DEFAULT_DB_PATH so SessionDB() doesn't write to the real ~/.hermes/state.db. Same fix as the other DB-only fixtures. 2. yuanbao recall branch A1 (message_id exact match) was structurally dead once load_transcript() became DB-only — state.db never preserves the platform message_id. Removed the dead loop, consolidated to a single content-match branch (renamed 'A: content match'). Branch B (system note) unchanged. Updated the test name + docstring to reflect this. Note: self._lock is no longer taken in append_to_transcript (was guarding the JSONL file append). SQLite append_message handles its own concurrency via WAL mode, so this is safe; flagging for awareness.

@bot

… recall PR NousResearch#29211 dropped JSONL gateway transcripts and noted that the platform's own `message_id` field (used by Yuanbao's recall guard to redact a message by exact platform id) was no longer preserved — falling back to content-match. That fallback works for the common case but redacts the wrong row when two messages share text (or fails to match when content is post-processed). Restore exact-id matching by giving state.db a column for it: - New `platform_message_id TEXT` column on the messages table (SCHEMA_VERSION bump 11 → 12; column added via declarative reconciler on existing DBs, no version-gated migration block needed) - Partial index `idx_messages_platform_msg_id` on (session_id, platform_message_id) to keep recall's point-lookup cheap even on large sessions - `append_message()` and `replace_messages()` accept the new value: the gateway-facing `append_to_transcript` in `gateway/session.py` forwards either `message["platform_message_id"]` or the legacy `message["message_id"]` key (yuanbao's existing convention) - `get_messages_as_conversation()` surfaces the column back on the message dict as `message_id` so platform code reads the same shape it used to read from JSONL - Yuanbao `_patch_transcript`: restore branch A1 (exact id match) ahead of A2 (content match) ahead of B (system-note). Both branches log which one fired so operators can tell from gateway.log whether recall hit the canonical path or had to fall back. Tests: - New low-level round-trip tests in `test_hermes_state.py` for both `append_message` and `replace_messages` paths - The PR's `test_yuanbao_recall_db_only.py` was rewritten to assert the new contract: branch A1 (id match) works against DB-only transcripts, and branch A2 (content match) still recovers rows that were observed without a platform id (e.g. agent-processed @bot messages where run.py doesn't carry msg_id through)

yoniebans added 7 commits May 20, 2026 09:13

test(gateway): pin SQLite-only load_transcript behaviour

68181bf

refactor(gateway): stop writing JSONL in append_to_transcript / rewri…

b80b03d

…te_transcript state.db is canonical. JSONL transcripts were a transition fallback; the fallback was removed in the previous commit. Existing *.jsonl files on disk are left untouched.

refactor(gateway): drop _append_to_jsonl from mirror

13ffc5d

Mirror messages are persisted via _append_to_sqlite. JSONL writer was a redundant dual-write. Updated test assertions from JSONL file checks to SQLite mock verification.

docs(sessions): state.db is canonical for gateway messages

f6de97f

yoniebans mentioned this pull request May 20, 2026

refactor(session-log): stop writing per-session JSON snapshots #29182

Closed

alt-glitch added type/refactor Code restructuring, no behavior change P2 Medium — degraded but workaround exists comp/gateway Gateway runner, session dispatch, delivery labels May 20, 2026

teknium1 mentioned this pull request May 20, 2026

refactor(gateway): salvage #29211 + persist platform_message_id for recall #29449

Merged

teknium1 closed this in #29449 May 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(gateway): stop writing JSONL transcripts#29211

refactor(gateway): stop writing JSONL transcripts#29211
yoniebans wants to merge 8 commits into
mainfrom
refactor/stop-writing-gateway-jsonl-transcripts

yoniebans commented May 20, 2026

Uh oh!

github-actions Bot commented May 20, 2026 •

edited

Loading

Uh oh!

teknium1 commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

yoniebans commented May 20, 2026

What changed

Why it changed

Yuanbao recall degradation (known, documented)

What's NOT changed

Prior art

How to test

Platform tested

Validation

Uh oh!

github-actions Bot commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔎 Lint report: refactor/stop-writing-gateway-jsonl-transcripts vs origin/main

ruff

ty (type checker)

Uh oh!

teknium1 commented May 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented May 20, 2026 •

edited

Loading

🔎 Lint report: `refactor/stop-writing-gateway-jsonl-transcripts` vs `origin/main`