Summary
A Discord /goal task completed a narrowly scoped file-editing request and produced a final answer saying the requested scope was done. Immediately afterward, the goal judge returned continue. That synthetic continuation raced with automatic skill-library maintenance and preflight context compression/session split.
The agent then resumed work beyond the explicit scope, edited additional content, and compression preserved/reintroduced a stale active task list. The user had to interrupt and ask why the agent was still working.
This appears to be a lifecycle/compression amplification of a /goal judge false continuation: the judge made a bad continue decision, and compression/session split made the stale continuation state durable.
Expected behavior
- If the assistant has completed the scoped user request and the todo list is complete, /goal should not synthesize a new continuation unless there is an explicit remaining requirement.
- A git file being untracked should not imply the goal is incomplete unless the user asked for staging, commit, push, or a clean working tree.
- Preflight compression/session split should not preserve a synthetic continuation or active task list that contradicts the latest completed final answer.
- Background maintenance turns should not race with goal continuation/compression against the same session lineage.
Actual behavior
- User asked the agent to:
  - modify a documentation file in the repo;
  - add a new empty column to story tables;
  - inspect and annotate only one specified section;
  - not modify the outbox copy;
  - not fix implementation, only annotate.
- The agent completed exactly that scope and sent a final answer:
  - repo file modified;
  - outbox not modified;
  - only the requested section annotated;
  - checks passed;
  - todo list fully completed.
- Immediately after the final response, Hermes started an automatic background skill-library maintenance turn.
- That maintenance turn triggered preflight compression.
- The goal judge decided continue, apparently because the repo file was still untracked, although commit/staging was not part of the user request.
- Gateway injected a synthetic [Continuing toward your standing goal] message.
- A second preflight compression started for this continuation while the previous compression was already in flight.
- The compressed handoff summary said active task was None, but the synthetic continuation still caused the agent to keep working.
- The agent annotated additional sections beyond the requested one.
- Later compression/session split preserved an active task list for this extra out-of-scope work.
- The user interrupted and the assistant acknowledged it had continued beyond scope.
Sanitized timeline
1. Scoped request completed
20:31:57 [S1] Turn ended: reason=text_response(finish_reason=stop)
api_calls=18/60 budget=14/60 tool_turns=29
last_msg_role=assistant response_len=1504 session=S1
Final answer shape:
Done in the repo file.
The outbox file was not changed.
What was modified:
- Added the new column to the story tables.
- Kept the column empty in other sections.
- Filled/annotated only the requested section.
Checks:
- All rows have the expected columns.
- Only the requested rows have annotations.
- Remaining rows are empty.
- git diff --check passed.
- Git status: file is untracked.
Todo state immediately before final answer:
{
"todos": [
{"id": "edit-columns", "status": "completed"},
{"id": "inspect-requested-section", "status": "completed"},
{"id": "annotate-requested-section", "status": "completed"},
{"id": "verify", "status": "completed"}
],
"summary": {"pending": 0, "in_progress": 0, "completed": 4}
}
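The todo snapshot above is enough to decide mechanically that no work is left. The following is a minimal sketch, assuming the state is available as JSON in the shape shown; `todos_all_completed` is a hypothetical helper, not an existing Hermes function.

```python
import json


def todos_all_completed(todo_state_json: str) -> bool:
    """Return True when the todo list is non-empty and every item is completed.

    An empty list is treated as "unknown" rather than "done" (an assumption).
    """
    state = json.loads(todo_state_json)
    todos = state.get("todos", [])
    return bool(todos) and all(t.get("status") == "completed" for t in todos)


snapshot = (
    '{"todos": ['
    '{"id": "edit-columns", "status": "completed"},'
    '{"id": "verify", "status": "completed"}'
    "]}"
)
print(todos_all_completed(snapshot))
```

A check like this could feed the goal judge as a structured signal instead of leaving completion to be inferred from the final answer text.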
2. Background maintenance + compression begins
20:31:57 [S1] conversation turn: msg='Review the conversation above and update the skill library. Be ACTIVE ...'
20:31:57 [S1] Preflight compression: ~128,217 tokens >= 122,400 threshold
20:31:57 [S1] context compression started: session=S1 messages=73 tokens=~128,217 model=gpt-5.5 focus=None
20:31:57 [S1] Auxiliary compression: using anthropic (claude-haiku-4-5-20251001)
3. Goal judge incorrectly continues
20:31:58 hermes_cli.goals: goal judge: verdict=continue reason=The response indicates that the review of the requested section was done, but the repo file itself is still untracked, so there is no… [truncated]
The untracked reasoning is wrong for this task: the user did not ask for git add, commit, or push.
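One way to express this rule is to let git state count against completion only when the goal text itself asks for it. The sketch below is a keyword heuristic under stated assumptions: the phrase list and both function names are hypothetical, and a real judge would more likely encode this in its prompt or rubric.

```python
import re

# Hypothetical phrase list; it would need tuning against real goal texts.
_GIT_REQUIREMENT = re.compile(
    r"\b(git add|stag(?:e|ed|ing)|commit\w*|push\w*|clean working tree)\b",
    re.IGNORECASE,
)


def git_state_is_relevant(goal_text: str) -> bool:
    """True only when the goal explicitly mentions staging, commit, push, or a clean tree."""
    return _GIT_REQUIREMENT.search(goal_text) is not None


def untracked_blocks_completion(goal_text: str, git_status: str) -> bool:
    """An untracked file counts against completion only if the goal asked for git work."""
    return git_status == "untracked" and git_state_is_relevant(goal_text)


scoped_goal = "Add an empty column to the story tables and annotate one section"
print(untracked_blocks_completion(scoped_goal, "untracked"))
```

For the request in this report, the goal mentions no git operation, so the untracked status would not justify a continue verdict.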
4. Synthetic continuation races with compression
20:31:58 gateway.run: inbound message: platform=discord user=<user> chat=<redacted> msg='[Continuing toward your standing goal] Goal: ...'
20:31:58 [S1] conversation turn: msg='[User] [Continuing toward your standing goal] Goal: ... Continue working toward this goal. Take the next concrete step...'
20:31:58 [S1] Preflight compression: ~125,517 tokens >= 122,400 threshold
20:31:58 [S1] context compression started: session=S1 messages=73 tokens=~125,517 model=gpt-5.5 focus=None
5. Compression creates contradictory state
20:32:24 [S1] context compression done: session=S1 messages=73->8 tokens=~31,451
20:32:30 [S1] context compression done: session=S2 messages=73->8 tokens=~29,634
Compressed handoff summary later visible in the new session:
[CONTEXT COMPACTION — REFERENCE ONLY]
## Active Task
None. User completed the requested section validation and paused; no new task has been assigned.
## In Progress
None.
But the synthetic continuation was also present immediately after that summary.
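The contradiction described here can be detected from the two pieces of text alone. The following is a rough sketch, assuming the handoff summary keeps the `## Active Task` heading shown above and the continuation keeps its bracketed prefix; the function names are hypothetical.

```python
import re

CONTINUATION_PREFIX = "[Continuing toward your standing goal]"


def active_task_is_none(summary: str) -> bool:
    """True when the first non-blank line under '## Active Task' starts with 'None'."""
    match = re.search(r"^## Active Task\s*\n(.*)$", summary, re.MULTILINE)
    return bool(match) and match.group(1).strip().lower().startswith("none")


def continuation_contradicts_summary(summary: str, next_message: str) -> bool:
    """Flag a synthetic continuation that follows a summary reporting no active task."""
    return active_task_is_none(summary) and CONTINUATION_PREFIX in next_message


summary = (
    "## Active Task\n"
    "None. User completed the requested section validation and paused; "
    "no new task has been assigned.\n"
    "## In Progress\n"
    "None.\n"
)
message = "[User] [Continuing toward your standing goal] Goal: ..."
print(continuation_contradicts_summary(summary, message))
```

When the check fires, the continuation could be dropped or sent back to the judge rather than handed to the agent as a fresh instruction.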
6. Agent resumes out-of-scope work
Instead of stopping, the agent continued into other sections and replaced the todo list with new out-of-scope work:
{
"todos": [
{"id": "inspect-later-sections", "status": "in_progress"},
{"id": "annotate-later-sections", "status": "pending"},
{"id": "verify-later-sections", "status": "pending"}
]
}
7. Later compression preserves the wrong active task list
20:35:37 [S2] context compression started: session=S2 messages=76 tokens=~122,917 model=gpt-5.5 focus=None
20:36:04 [S2] context compression done: session=S3 messages=76->77 tokens=~75,345
New session contained:
[Your active task list was preserved across context compression]
- [>] inspect-later-sections ... (in_progress)
- [ ] annotate-later-sections ... (pending)
- [ ] verify-later-sections ... (pending)
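Preserving this list is only harmful because nothing records where the items came from. The sketch below assumes each todo carries a hypothetical `origin` field (this field does not appear in the logs above) and drops items created by a synthetic continuation when the last final answer reported completion.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Todo:
    id: str
    status: str
    origin: str  # hypothetical provenance tag: "user" or "synthetic_continuation"


def todos_to_preserve(todos: List[Todo], last_answer_reported_completion: bool) -> List[Todo]:
    """Pick the todos that should survive compression.

    If the latest final answer reported completion, items that exist only
    because of a synthetic continuation are treated as stale and dropped.
    """
    if not last_answer_reported_completion:
        return list(todos)
    return [t for t in todos if t.origin != "synthetic_continuation"]


stale = [
    Todo("inspect-later-sections", "in_progress", "synthetic_continuation"),
    Todo("annotate-later-sections", "pending", "synthetic_continuation"),
    Todo("verify-later-sections", "pending", "synthetic_continuation"),
]
print(todos_to_preserve(stale, last_answer_reported_completion=True))
```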
8. User interruption
20:36:26 [S3] Turn ended: reason=interrupted_during_api_call
20:36:26 gateway.run: Session split detected: S1 → S3 (compression)
The assistant then acknowledged:
The original request was already completed.
I continued beyond the scope and started inspecting/annotating later sections. That was my error.
Impact
- Assistant continued after a completed final response.
- Extra file edits were made outside the requested scope.
- A synthetic continuation contradicted the compressed summary saying there was no active task.
- Compression/session split preserved a wrong active task list.
- User had to interrupt to stop out-of-scope work.
Suspected contributing causes
- Goal judge treated git status: untracked as evidence of incompletion even though staging/commit was not required.
- Goal continuation and background skill-library maintenance can both start turns against the same session lineage.
- Preflight compression can serialize synthetic continuation state while the latest assistant answer says the task is complete.
- Active todo state created after an invalid synthetic continuation can be preserved across compression.
- There may be missing consumed/ack state for synthetic continuation/recovery messages.
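The race between the maintenance turn and the goal continuation could be avoided by holding one lock per session lineage for the whole turn, compression included. The following is a minimal asyncio sketch; `LineageLocks`, `run_turn`, and the lineage id are illustrative assumptions, not Hermes internals.

```python
import asyncio


class LineageLocks:
    """Hands out one asyncio.Lock per session lineage (hypothetical helper)."""

    def __init__(self):
        self._locks = {}

    def for_lineage(self, lineage_id: str) -> asyncio.Lock:
        if lineage_id not in self._locks:
            self._locks[lineage_id] = asyncio.Lock()
        return self._locks[lineage_id]


async def run_turn(locks: LineageLocks, lineage_id: str, name: str, log: list) -> None:
    # The lock covers preflight compression and the model call together.
    async with locks.for_lineage(lineage_id):
        log.append(f"{name}:start")
        await asyncio.sleep(0)  # stands in for compression and the API call
        log.append(f"{name}:end")


async def main() -> list:
    locks = LineageLocks()
    log = []
    await asyncio.gather(
        run_turn(locks, "S1", "maintenance", log),
        run_turn(locks, "S1", "goal_continuation", log),
    )
    return log


print(asyncio.run(main()))
```

With the lock in place the two turns run back to back instead of interleaving, so the second turn sees the compressed child history rather than the same 73-message parent.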
Proposed fixes / invariants
- Do not infer “goal incomplete” from untracked/modified files unless the goal explicitly required staging/commit/push or clean working tree.
- If the latest assistant answer reports completion and todo state has no pending/in-progress items, require a stronger reason before queuing /goal continuation.
- Serialize/lock session turns during preflight compression so background maintenance and goal continuation cannot both compress/use the same parent history concurrently.
- When compression creates a child session, reconcile synthetic /goal continuations against the compressed Active Task summary and latest final answer.
- Do not preserve active todo state created by a synthetic continuation if the premise conflicts with a completed final answer.
- Add consumed-state markers for synthetic continuation/recovery messages so stale inferred work is not replayed as fresh work.
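The consumed-state marker in the last item can be as small as a set of continuation ids that travels with the session lineage. A minimal sketch follows, assuming each synthetic continuation is given a stable id when it is queued; the class name and the id format are hypothetical.

```python
class ContinuationLedger:
    """Tracks which synthetic continuations have already been acted on."""

    def __init__(self, consumed=None):
        self._consumed = set(consumed or [])

    def try_consume(self, continuation_id: str) -> bool:
        """Return True the first time an id is seen, False on any replay."""
        if continuation_id in self._consumed:
            return False
        self._consumed.add(continuation_id)
        return True

    def to_list(self) -> list:
        """Serializable form, so the ledger can be carried across a session split."""
        return sorted(self._consumed)


ledger = ContinuationLedger()
first = ledger.try_consume("goal-cont-0001")
replay = ledger.try_consume("goal-cont-0001")

# Rebuilding the ledger in a child session keeps the marker.
child = ContinuationLedger(ledger.to_list())
after_split = child.try_consume("goal-cont-0001")
print(first, replay, after_split)
```

Carrying the serialized ledger into the child session is what prevents a continuation that was already handled in S1 from being replayed as fresh work in S2 or S3.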
Related issues
- /goal state and session-id migration across compression. Related /goal + compression lifecycle.
- /goal can continue after terminal-ish answers when judge handling goes wrong.