session.maintenance has no size cap for transcript .jsonl files — unbounded growth causes gateway CPU 100% #66360
Open
Labels
- P1: High-priority user-facing bug, regression, or broken workflow.
- clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
- clawsweeper:needs-maintainer-review: ClawSweeper marked this issue as needing maintainer review before automation.
- clawsweeper:needs-product-decision: ClawSweeper marked this issue as needing a product or behavior decision.
- clawsweeper:no-new-fix-pr: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
- impact:crash-loop: Crash, hang, restart loop, or process-level availability failure.
- issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
Summary
session.maintenance controls (rotateBytes, maxDiskBytes, maxEntries) apply to sessions.json (the session index file) but have no effect on individual transcript .jsonl files. These files can grow without bound, eventually causing gateway CPU 100% and unresponsiveness.
Environment
Current config (session.maintenance)
{
  "pruneAfter": "30d",
  "maxEntries": 500,
  "rotateBytes": "10mb",
  "maxDiskBytes": "500mb",
  "highWaterBytes": "400mb",
  "mode": "enforce"
}
Observed behavior
With the above config, individual .jsonl transcript files grew to:
- daily-devops/topic-2.jsonl → 113 MB (25,927 lines)
- dev-tlsession.jsonl → 266 MB
- dev-tlsession.jsonl → 56 MB
- daily-collector/topic-8.jsonl → 36 MB
The rotateBytes: "10mb" setting had zero effect on any of these files.
Impact
When daily-devops/topic-2.jsonl reached 113 MB, the gateway hit 100% CPU and became completely unresponsive. Manual intervention was required: archiving the file and restarting the gateway.
Root cause
rotateBytes rotates sessions.json (the index), not transcript .jsonl files. There is no mechanism to cap or rotate individual transcript files. For long-running topic-bound sessions (group chats, forum topics), these files grow indefinitely because:
- […] (sessions_list, file reads) are not truncated before write
This is distinct from issue #18572 (which is about a sessions.json rotation race condition).
Expected behavior
session.maintenance should include controls for transcript .jsonl files, such as:
- transcriptRotateBytes: rotate (archive) a transcript when it exceeds a size threshold
- transcriptMaxLines: cap lines per transcript file
- Apply rotateBytes to transcripts as well, not just sessions.json
Workaround
Running a nightly cron job that scans all agent session directories and archives .jsonl files exceeding 10 MB.
Related
- #18572: sessions.json rotation race condition (different issue, same config surface)
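For anyone who needs the nightly workaround described above before a built-in cap exists, here is a minimal sketch of a size-based archive pass. It is not the script actually in use and not part of the gateway: the function name, the directory arguments, and the timestamped archive naming are all assumptions made for illustration. It also does not coordinate with a running gateway, so it may be safest to run it while the gateway is stopped or about to be restarted.

```python
import shutil
import time
from pathlib import Path

# 10 MB threshold, matching the workaround; adjust as needed.
MAX_BYTES = 10 * 1024 * 1024


def archive_oversized_transcripts(sessions_root: Path, archive_root: Path,
                                  max_bytes: int = MAX_BYTES) -> list[Path]:
    """Move every .jsonl file under sessions_root larger than max_bytes
    into archive_root, keeping its relative directory layout.

    Hypothetical helper: names and layout are illustrative assumptions.
    Returns the new paths of the archived files.
    """
    stamp = time.strftime("%Y%m%dT%H%M%S")
    archived = []
    # Materialize the listing first so moving files does not disturb the walk.
    for path in sorted(sessions_root.rglob("*.jsonl")):
        if archive_root in path.parents:
            continue  # skip already-archived files if archive_root is nested
        if not path.is_file() or path.stat().st_size <= max_bytes:
            continue
        rel = path.relative_to(sessions_root)
        # e.g. daily-devops/topic-2.jsonl -> daily-devops/topic-2.<stamp>.jsonl
        target = archive_root / rel.parent / f"{rel.stem}.{stamp}.jsonl"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        archived.append(target)
    return archived
```

A cron entry could call `archive_oversized_transcripts(Path("/path/to/agent/sessions"), Path("/path/to/archive"))` once per night; both paths are placeholders. Moving the whole file rather than truncating it keeps the full transcript recoverable, which is roughly the behavior the proposed transcriptRotateBytes setting asks for.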