refactor(session-log): stop writing per-session JSON snapshots #29182

Closed
yoniebans wants to merge 6 commits into main from refactor/stop-writing-session-json-snapshots

Conversation

@yoniebans yoniebans commented May 20, 2026

Collaborator

What changed

Stop writing per-session JSON snapshot files (~/.hermes/sessions/session_{sid}.json). state.db is the canonical store for all session messages and has been for several releases.

  • Delete _save_session_log() from run_agent.py and all 7 call-sites in conversation_loop.py / run_agent.py
  • Drop the session_log_file attribute from agent init (keep logs_dir — still used by request_dump_*.json debug breadcrumbs)
  • Drop the /branch and /compress re-point logic that redirected session_log_file to new session IDs
  • Remove ~13 test stubs that suppressed _save_session_log file I/O
  • Update CONTRIBUTING.md and the bundled hermes-agent skill to reflect state.db as canonical

14 files changed, +13 / -146 — almost entirely deletions.

Why it changed

The agent rewrote a per-session JSON snapshot on every turn boundary. With state.db canonical, that snapshot was a parallel store with no consumer outside its own self-protection guard ("don't overwrite a larger log with fewer messages"). On a typical profile, these snapshots accounted for ~950 files / ~500MB in ~/.hermes/sessions/.

Audit confirmed:

  • The only reader of the snapshot was the writer itself (the read-back guard)
  • Not consumed by trajectory.py, batch_runner.py, mcp_serve.py, the data pipeline, or any plugin
  • state.db messages table stores every field the snapshot stored, plus per-message timestamps and token counts the snapshot lacked
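
The read-back guard mentioned above ("don't overwrite a larger log with fewer messages") can be pictured roughly as follows. This is a minimal sketch, not the removed `_save_session_log` body: the function name `should_overwrite` and the assumption that the snapshot is a JSON object with a top-level `messages` list are both hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path


def should_overwrite(path: Path, new_messages: list[dict]) -> bool:
    """Return True when writing new_messages would not shrink the stored log.

    Hypothetical helper: assumes the snapshot is a JSON object holding a
    top-level "messages" list.
    """
    if not path.exists():
        return True
    try:
        existing = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # An unreadable snapshot carries nothing worth protecting.
        return True
    if not isinstance(existing, dict):
        return True
    old_count = len(existing.get("messages", []))
    # Refuse the write only when it would drop messages.
    return len(new_messages) >= old_count
```

Under these assumptions the guard has to read the file back before every write, which is why the writer ended up being the snapshot's only reader.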

What's NOT changed

  • Existing session_*.json files on disk are left untouched. No destructive cleanup. Users can purge at their discretion.
  • request_dump_*.json — debug breadcrumb written on API errors. Different artifact, different writer (agent_runtime_helpers.py), still uses agent.logs_dir. Unchanged.
  • sessions.json — gateway routing index with live readers. Separate effort.
  • Gateway *.jsonl transcripts — addressed in companion PR refactor(gateway): stop writing JSONL transcripts #29211.

How to test

Automated:

scripts/run_tests.sh

Manual smoke test:

python -c "from run_agent import AIAgent; a = AIAgent(quiet_mode=True); print(a.chat('reply with ok'))"
ls ~/.hermes/sessions/session_*.json  # no new file for the session that just ran
sqlite3 ~/.hermes/state.db "SELECT id, message_count FROM sessions ORDER BY started_at DESC LIMIT 1"  # session exists in DB
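
The same two checks can be scripted. The sketch below assumes only what the commands above show: snapshots match `session_*.json`, and `state.db` has a `sessions` table with `id`, `message_count`, and `started_at` columns. The helper names `snapshot_files` and `latest_session` are hypothetical.

```python
from __future__ import annotations

import sqlite3
from pathlib import Path


def snapshot_files(sessions_dir: Path) -> set[str]:
    """Names of per-session JSON snapshots currently on disk."""
    return {p.name for p in sessions_dir.glob("session_*.json")}


def latest_session(db_path: Path) -> tuple | None:
    """Most recently started session as (id, message_count), or None."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT id, message_count FROM sessions "
            "ORDER BY started_at DESC LIMIT 1"
        ).fetchone()
    finally:
        conn.close()
```

To use it, record `snapshot_files(...)` before the one-shot turn, run the turn, and confirm that the set difference afterwards is empty while `latest_session(...)` returns the new session.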

Platform tested

  • Linux (Ubuntu, x86_64)

Validation

  • Smoke test: one-shot CLI turn produces no new session_*.json
  • Full test suite: 24,571 passed, 2 failed (both pre-existing on main), 67 skipped

yoniebans added 4 commits May 20, 2026 09:18
state.db now stores every message field the JSON snapshot stored. Removed
the method, all 7 call-sites, and ~13 test stubs that suppressed its file I/O.
Body is in git history if it ever needs to come back.
The attribute no longer exists; nothing to re-point.

github-actions Bot commented May 20, 2026

Contributor

🔎 Lint report: refactor/stop-writing-session-json-snapshots vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8953 on HEAD, 8974 on base (✅ -21)

🆕 New issues: none

✅ Fixed issues (14):

| Rule | Count |
| --- | --- |
| unresolved-attribute | 8 |
| invalid-assignment | 4 |
| unsupported-operator | 1 |
| invalid-argument-type | 1 |

First entries
run_agent.py:1562: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `session_id`
run_agent.py:1582: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `verbose_logging`
tests/run_agent/test_context_token_tracking.py:55: [invalid-assignment] invalid-assignment: Object of type `(...) -> None` is not assignable to attribute `_save_session_log` of type `def _save_session_log(self, messages: list[dict[str, Any]] = None) -> Unknown`
run_agent.py:1565: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `platform`
tests/cron/test_codex_execution_paths.py:77: [invalid-assignment] invalid-assignment: Object of type `(messages) -> None` is not assignable to attribute `_save_session_log` of type `def _save_session_log(self, messages: list[dict[str, Any]] = None) -> Unknown`
cli.py:6507: [invalid-assignment] invalid-assignment: Object of type `Unknown` is not assignable to attribute `session_log_file` on type `AIAgent & <Protocol with members 'session_log_file'> & <Protocol with members 'logs_dir'> & ~AlwaysFalsy`
run_agent.py:1568: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `_cached_system_prompt`
cli.py:6508: [unsupported-operator] unsupported-operator: Operator `/` is not supported between objects of type `object` and `str`
run_agent.py:1575: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `session_log_file`
tests/run_agent/test_run_agent.py:559: [invalid-argument-type] invalid-argument-type: Argument to function `AIAgent._clean_session_content` is incorrect: Expected `str`, found `None`
run_agent.py:1566: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `session_start`
run_agent.py:1563: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `model`
tests/run_agent/test_run_agent_codex_responses.py:597: [invalid-assignment] invalid-assignment: Object of type `(messages) -> None` is not assignable to attribute `_save_session_log` of type `def _save_session_log(self, messages: list[dict[str, Any]] = None) -> Unknown`
run_agent.py:1569: [unresolved-attribute] unresolved-attribute: Object of type `Self@_save_session_log` has no attribute `tools`

Unchanged: 4717 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Only caller was the removed _save_session_log. Also removes the unused
convert_scratchpad_to_think and has_incomplete_scratchpad imports from
run_agent.py (both still used elsewhere via their own imports).
alt-glitch added labels May 20, 2026: type/refactor (Code restructuring, no behavior change), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder)
…tespace

Adds TestNoSessionJsonSnapshot to lock the contract that session_log_file
attribute, _save_session_log method, and the per-session JSON snapshot
writer are gone. logs_dir is retained for request_dump_*.json.

Also cleans up stray trailing whitespace in test_run_agent_codex_responses
introduced when the _save_session_log stub line was deleted.
teknium1 added a commit that referenced this pull request May 20, 2026
The email "jonny@nousresearch.com" belongs to @yoniebans (GitHub id
5584832, display name "jonny"), not to Jeffrey Quesnelle (@jquesnelle,
id 687076, who commits as emozilla@nousresearch.com).  Verified across
all 60 historical commits on the repo authored from this email — every
one of them was a yoniebans commit being mis-credited to jquesnelle in
the changelog.

Surfaced while salvaging PR #29182 (yoniebans's session-log refactor).
teknium1 added a commit that referenced this pull request May 20, 2026
PR #29182 deleted the per-session JSON snapshot writer outright because
state.db is canonical and the snapshots had no in-tree consumer.  Some
users have external tooling that reads `~/.hermes/sessions/session_{sid}.json`
directly, so reintroduce the writer behind a config flag that defaults
to off.

- Add `sessions.write_json_snapshots` (default False) to DEFAULT_CONFIG
- Restore `AIAgent._save_session_log` + `_clean_session_content` as
  gated methods.  When the flag is off the call is a fast no-op; when
  on, the writer behaves as before (atomic write, truncation guard
  preserved, REASONING_SCRATCHPAD → think tag normalization)
- Re-derive the target path from `agent.session_id` on each call so
  `/branch` and `/compress` re-points happen automatically — no need
  to restore the explicit re-point bookkeeping at call sites
- Wire the single call site in `_persist_session` (the cleanup-on-exit
  hook).  Did NOT restore the 7 intra-turn calls the original PR deleted
  — those were redundant writes within the same turn that doubled disk
  I/O without adding any persistence guarantee `_persist_session` does
  not already provide
- Read the flag once at agent init via `load_config()`, cache as
  `agent._session_json_enabled`
- Update `TestNoSessionJsonSnapshot` → `TestSessionJsonSnapshotOptIn`
  to pin behavior: default off (no file), opt-in true (file written),
  no-op method on default agents, logs_dir retained unconditionally
- Update CONTRIBUTING.md and the bundled `hermes-agent` skill to
  document the flag and its default
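
The gating described in this commit message can be sketched as below. This is an illustration under stated assumptions, not the code from the commit: `SnapshotWriter` is a hypothetical stand-in for `AIAgent`, the config is taken to be a nested dict, the snapshot payload shape is invented, and the truncation guard and scratchpad normalization are left out. Only the flag name, the cached attribute name, the method name, and the `session_{sid}.json` naming come from the text above.

```python
from __future__ import annotations

import json
import os
import tempfile
from pathlib import Path


class SnapshotWriter:
    """Hypothetical stand-in for the gated writer; not the real AIAgent."""

    def __init__(self, config: dict, sessions_dir: Path, session_id: str):
        # Read the flag once at init and cache it; the default is off.
        self._session_json_enabled = bool(
            config.get("sessions", {}).get("write_json_snapshots", False)
        )
        self.sessions_dir = sessions_dir
        self.session_id = session_id

    def _save_session_log(self, messages: list[dict]) -> Path | None:
        if not self._session_json_enabled:
            return None  # fast no-op when the flag is off
        # Derive the path on every call so a changed session_id is picked up.
        target = self.sessions_dir / f"session_{self.session_id}.json"
        self.sessions_dir.mkdir(parents=True, exist_ok=True)
        fd, tmp = tempfile.mkstemp(dir=self.sessions_dir, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump({"session_id": self.session_id, "messages": messages}, fh)
            os.replace(tmp, target)  # atomic rename within one filesystem
        except BaseException:
            # Do not leave a half-written temp file behind.
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
        return target
```

Because the target is recomputed from `session_id` on each call, a caller that swaps the session ID gets a new file with no extra bookkeeping, which is the property the commit relies on for `/branch` and `/compress`.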
@teknium1

Contributor

Salvaged via PR #29278. Your six commits were cherry-picked onto current main with authorship preserved (rebase-merge), so each commit still shows up under your name in git log.

Only adjustment: the snapshot writer is now opt-in instead of removed outright. sessions.write_json_snapshots (default False) gates a restored _save_session_log so anyone with external tooling that consumes ~/.hermes/sessions/session_{sid}.json can re-enable it. The default behavior matches your PR — no more 6GB+ sessions/ directories on heavy users.

Also fixed an AUTHOR_MAP entry while we were here: jonny@nousresearch.com was pointing at @jquesnelle, but that's your email — corrected to point at @yoniebans so changelogs credit you correctly going forward.

Thanks for the cleanup!
