Problem
In a long session (200+ turns, lots of tool calls), three Vecs grow unbounded:
app.history (HistoryCell): every user / assistant message plus every tool result rendered in the transcript.
app.api_messages (Message): every message that gets sent to the LLM on the next request.
app.tool_log (Vec): every tool invocation logged for debugging.
Each grows linearly with turn count. On a 4-hour coding session with 500 tool calls, this can climb past 1 GB of resident memory before the user notices.
The compaction flow trims api_messages for LLM-bound context, but history and tool_log aren't bounded.
Proposed solution
A few independent measures:
Bound history. Cap at N entries (default 5000). When exceeded, drop the oldest 500 entries and replace them with a synthetic "[N earlier messages truncated — use /sessions to load full history]" cell.
Bound tool_log. Ring buffer at 500 entries. Older entries are archived to ~/.deepseek/sessions/<id>/tool-log.jsonl in case the user wants to grep them later.
Working-set summary should be persisted, not held in memory. The working_set summary computation already exists (#287, "fix(cache): drop volatile fields from working_set summary block (#280)") but the input data is held forever. After cutoff, drop the input and keep only the rendered summary.
Bench. Add a stress test: simulate 1000 turns and measure RSS. Pin it as a regression test.
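The first two measures could be sketched roughly as follows. This is only an illustration under stated assumptions: HistoryCell here is a hypothetical stand-in for the real type, BoundedHistory and ToolLog are made-up names, and tool-log entries are assumed to be already serialized as single JSON lines. The truncation marker counts toward the cap, and only one marker is kept at the front.

```rust
use std::collections::VecDeque;
use std::io::{self, Write};

/// Hypothetical stand-in for the real HistoryCell type.
#[derive(Debug, Clone, PartialEq)]
enum HistoryCell {
    Message(String),
    /// Synthetic marker holding the total number of cells dropped so far.
    Truncated(usize),
}

/// Transcript history capped at `cap` cells, evicting `batch` cells at a time.
struct BoundedHistory {
    cells: VecDeque<HistoryCell>,
    cap: usize,
    batch: usize,
    dropped: usize,
}

impl BoundedHistory {
    fn new(cap: usize, batch: usize) -> Self {
        assert!(batch >= 1 && batch < cap, "batch must be in 1..cap");
        Self { cells: VecDeque::new(), cap, batch, dropped: 0 }
    }

    fn push(&mut self, cell: HistoryCell) {
        self.cells.push_back(cell);
        if self.cells.len() <= self.cap {
            return;
        }
        // Remove the previous marker so that only real cells are counted
        // and only one marker ever exists.
        if matches!(self.cells.front(), Some(HistoryCell::Truncated(_))) {
            self.cells.pop_front();
        }
        // Drop the oldest `batch` cells and record how many are gone in total.
        let n = self.batch.min(self.cells.len());
        self.cells.drain(..n);
        self.dropped += n;
        self.cells.push_front(HistoryCell::Truncated(self.dropped));
    }
}

/// Ring buffer holding the newest `cap` entries. Each evicted entry is
/// written as one line to `archive` (assumed to be a JSONL sink).
struct ToolLog<W: Write> {
    entries: VecDeque<String>,
    cap: usize,
    archive: W,
}

impl<W: Write> ToolLog<W> {
    fn new(cap: usize, archive: W) -> Self {
        assert!(cap > 0, "cap must be positive");
        Self { entries: VecDeque::with_capacity(cap), cap, archive }
    }

    fn push(&mut self, entry: String) -> io::Result<()> {
        if self.entries.len() == self.cap {
            if let Some(old) = self.entries.pop_front() {
                writeln!(self.archive, "{}", old)?;
            }
        }
        self.entries.push_back(entry);
        Ok(())
    }
}
```

With the defaults proposed above, this would correspond to BoundedHistory::new(5000, 500) and ToolLog::new(500, file), where file might be opened in append mode via std::fs::OpenOptions at the tool-log.jsonl path. Evicting in batches rather than one cell at a time keeps the marker count from changing on every push.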
Acceptance criteria
history, tool_log cap behavior implemented and tested.
~/.deepseek/sessions/<id>/tool-log.jsonl written on cap-evict.
A test under crates/tui/tests/ simulates 1000 turns and asserts RSS stays below a configurable ceiling (e.g., 256 MB).
Related
crates/tui/src/compaction.rs: existing context-compaction mechanism.
crates/tui/src/working_set.rs: working-set summarization.
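For the RSS ceiling in the acceptance criteria, one possible way to read resident memory from inside a test is sketched below. It assumes Linux, where /proc/self/status reports a VmRSS line in kB; the helper names are hypothetical and other platforms would need a different source for the number.

```rust
use std::fs;

/// Extract the VmRSS value (in KiB) from the text of /proc/<pid>/status.
fn parse_vm_rss_kib(status: &str) -> Option<u64> {
    status
        .lines()
        .find_map(|line| line.strip_prefix("VmRSS:"))
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|n| n.parse().ok())
}

/// Resident set size of the current process in KiB.
/// Returns None where /proc/self/status is unavailable (non-Linux).
fn current_rss_kib() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_rss_kib(&status)
}

/// True when `rss_kib` is at or below a ceiling given in MiB.
fn within_ceiling(rss_kib: u64, ceiling_mib: u64) -> bool {
    rss_kib <= ceiling_mib * 1024
}
```

A stress test could push 1000 simulated turns through the app state, call current_rss_kib(), and assert within_ceiling(rss, 256), skipping the assertion when the reading is None.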