Session File Leak: JSON sessions never deleted, causing unbounded disk growth and cost explosion #3015

@rustyorb

Summary

Hermes Agent creates session JSON files for every conversation (chat, cron, subagent) but never deletes them. Over time, this leads to:

  • Hundreds or thousands of accumulated session files (observed: 728 files, 94MB on a single machine)
  • Unbounded disk usage
  • Inflated token costs (large sessions with 500+ messages accumulate context forever)
  • Observed: 22M+ tokens burned in ~2 hours, attributed to unchecked session growth

Root Cause Analysis

1. Session Reset Creates New Files, Never Deletes Old

In gateway/session.py, the reset_session() function:

  1. Creates a new SessionEntry with a new session_id
  2. Updates the in-memory mapping (_entries) to point to the new session
  3. Saves to session store
  4. Calls end_session(), which marks the old session as ended in the SQLite database
  5. Does NOT delete the old JSON file

# From session.py:776-800
def reset_session(self, session_key: str) -> Optional[SessionEntry]:
    # ... creates new session_id ...
    new_entry = SessionEntry(
        session_key=session_key,
        session_id=session_id,  # NEW ID
        # ...
    )
    self._entries[session_key] = new_entry  # Updates mapping
    self._save()
    # Creates NEW session file, old one remains on disk forever

2. Cron Sessions Multiply Rapidly

Each cron execution creates a new session with an ID of the form cron_{job_id}_{timestamp}. Crons here run every 15-45 minutes. At the 15-minute end of that range:

  • 1 cron × 96 runs/day × 7 days = 672 session files per week per cron
  • Observed: 535 cron session files (73.5% of all sessions)
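
The arithmetic above can be checked for other intervals in the 15-45 minute range. This is a minimal sketch that assumes exactly one new session file per cron run and no cleanup; the function name is illustrative and not part of Hermes.

```python
def cron_session_files(interval_minutes: int, days: int = 7, crons: int = 1) -> int:
    """Session files created, assuming one new file per cron run and no cleanup."""
    runs_per_day = (24 * 60) // interval_minutes
    return crons * runs_per_day * days


# Files per week for a single cron at a few intervals from the 15-45 minute range
for interval in (15, 30, 45):
    print(f"every {interval} min: {cron_session_files(interval)} files/week")
```

Even at the slow end of the range (45 minutes, 32 runs/day), a single cron still leaves over 200 files per week.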

3. Session Files Are Never Cleaned Up

No deletion mechanism exists:

  • end_session() in hermes_state.py:252 only marks sessions as ended in SQLite (ended_at, end_reason)
  • No os.remove() call for session JSON files
  • No retention policy for filesystem cleanup
  • Sessions accumulate indefinitely

4. Large Sessions Retain Full Context

Session files contain complete message history. Observed:

  • Sessions with 500-600 messages
  • File sizes: 500KB-1MB each
  • All tool call results preserved forever
  • Context compression happens in-memory but original files remain

Impact

Quantified on SCANNINGPC (Deimos):

  • 728 session files accumulated
  • 94.33 MB total size
  • 535 cron sessions (73.5% of total)
  • 418 sessions aged 1-7 days
  • 245 sessions aged 6-24 hours

Token Cost Explosion

One session accumulated 602 messages in 107 seconds:

  • 209 tool call iterations
  • ~102K tokens
  • ~5.6 messages/second processing rate
  • User sent 105 messages (frustrated rapid-fire interaction)
  • Assistant responded with 274 messages + 223 tool results

With no session cleanup or size limit, context keeps growing with every message until the iteration limit is hit.
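
A rough model of why a long session gets expensive. It rests on two assumptions that the measurements above do not confirm: every tool-call iteration resends the full accumulated context, and the context grows by a roughly constant number of tokens per iteration. The function name and the per-iteration figure are illustrative only.

```python
def cumulative_input_tokens(iterations: int, tokens_added_per_iteration: int) -> int:
    """Total input tokens if iteration i resends i * tokens_added_per_iteration tokens.

    This is the arithmetic series 1 + 2 + ... + n scaled by the per-iteration growth.
    """
    return tokens_added_per_iteration * iterations * (iterations + 1) // 2


# Illustrative only: 209 iterations ending near ~102K tokens of context
per_iteration = 102_000 // 209
print(cumulative_input_tokens(209, per_iteration))
```

Under these assumptions the total input volume scales with the square of the iteration count, so doubling the number of iterations roughly quadruples the tokens sent.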

Reproduction Steps

  1. Run the Hermes gateway with any cron job (e.g., every 15 minutes)

  2. Let it run for 24-48 hours

  3. Check ~/.hermes/sessions/:

    ls -la ~/.hermes/sessions/session_*.json | wc -l
    # Returns: 100+ files

  4. Observe that files are never deleted; only new ones are created
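
To capture total size alongside the file count in step 3, a small script along these lines could be used. It is a sketch, not part of Hermes: the helper name is made up, and the session_*.json pattern is taken from the filenames shown in this report.

```python
from pathlib import Path


def session_file_stats(sessions_dir: Path, pattern: str = "session_*.json") -> tuple[int, int]:
    """Return (file_count, total_bytes) for session files directly under sessions_dir."""
    if not sessions_dir.is_dir():
        return 0, 0
    files = [p for p in sessions_dir.glob(pattern) if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)


if __name__ == "__main__":
    count, total = session_file_stats(Path.home() / ".hermes" / "sessions")
    print(f"{count} files, {total / (1024 * 1024):.2f} MB")
```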

Evidence

Session Age Distribution

<1h:     9 files
1-6h:    54 files  
6-24h:   245 files
1-7d:    418 files
>7d:     2 files

Cron Session Dominance

  • Cron sessions: 535 (73.5%)
  • Chat sessions: 193 (26.5%)

Large Session Examples

  • session_20260325_112518_6995d1.json: 725 KB, 602 messages
  • session_20260325_110422_70aca0.json: 722 KB, 570 messages

Proposed Solutions

Immediate (Hotfix)

  1. Add session file cleanup on reset:

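    # In reset_session(), before self._entries[session_key] is overwritten:
    old_entry = self._entries.get(session_key)
    if old_entry is not None:
        # Assumes files are named "<session_id>.json"; adjust if the on-disk
        # name carries a prefix such as "session_".
        old_session_file = self._sessions_dir / f"{old_entry.session_id}.json"
        old_session_file.unlink(missing_ok=True)  # missing_ok needs Python 3.8+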
  2. Add retention policy:

    • Delete session files older than 7 days
    • Keep SQLite records for history
    • Configurable retention period

Short-term

  1. Session size limits:

    • Max 1000 messages per session before forced reset
    • Max 10MB per session file
    • Alert when sessions grow too large
  2. Cron session optimization:

    • Option for cron sessions to not persist to JSON (SQLite only)
    • Automatic cleanup of cron sessions after delivery
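
The size limits in item 1 amount to a simple threshold check before each save. A minimal sketch, with hypothetical names and the proposed limits hard-coded (a real version would read them from configuration):

```python
from typing import Optional

MAX_MESSAGES = 1000                 # proposed limit from this issue
MAX_FILE_BYTES = 10 * 1024 * 1024   # proposed 10MB limit from this issue


def reset_reason(message_count: int, file_bytes: int) -> Optional[str]:
    """Return why a session should be force-reset, or None if it is within limits."""
    if message_count > MAX_MESSAGES:
        return "max_messages"
    if file_bytes > MAX_FILE_BYTES:
        return "max_file_size"
    return None
```

Note that the largest session observed in this report (602 messages, 725 KB) stays under both proposed limits, so lower thresholds or a token budget may be needed to catch runaway sessions like it.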

Long-term

  1. Session archival:

    • Compress old sessions to .gz
    • Archive to separate directory
    • Background cleanup job
  2. Cost controls:

    • Per-session token budgets
    • Max iterations enforced strictly
    • Automatic session kill on threshold breach
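
For the archival idea in item 1, compressing a session file and moving it to a separate directory takes only the standard library. This is a sketch under assumptions: the function name and the archive directory are hypothetical, and nothing here reflects existing Hermes behavior.

```python
import gzip
import shutil
from pathlib import Path


def archive_session_file(path: Path, archive_dir: Path) -> Path:
    """Gzip a session file into archive_dir, then remove the original.

    Returns the path of the compressed copy.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / (path.name + ".gz")
    with path.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()  # only reached if the compressed copy was written without error
    return target
```

A background cleanup job could call this for files past a first age threshold and delete the .gz copies after a second, longer one.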

Files Affected

  • gateway/session.py - Session lifecycle management
  • hermes_state.py - SQLite session tracking
  • run_agent.py - Session creation
  • cron/scheduler.py - Cron session creation

Environment

  • Hermes Agent version: Latest (as of 2026-03-25)
  • Installation: Source (git clone)
  • Platforms: Telegram, Discord, CLI, Cron
  • OS: Ubuntu (WSL2)

Labels

bug, performance, cost, storage, sessions, priority-critical

