fix(agent): close complete_step cross-turn evidence + loop gaps by esengine · Pull Request #4014 · esengine/DeepSeek-Reasonix

esengine · 2026-06-11T10:09:56Z

Why

complete_step's enforcement is bound to a per-turn evidence ledger (a.evidence.Reset() runs at the top of every Run). Two consequences users have hit:

Cross-turn evidence is rejected. Command evidence already falls back to scanning the full session (fix(agent): reduce evidence verification false signals #3587), but diff/files (path) evidence did not — so citing a file you edited in an earlier turn (or before compaction) failed with "no matching writer receipt in this turn".
The loop doesn't close across turns/compaction. The final-answer gate reads only this turn's todo_write, so once the structured list is gone (a later turn, or compacted into a summary) a premature "all done" slips through. This is the unclosed loop in [Bug]: After todo_write / complete_step fails, the Agent still allows a final answer of "all done" #2917.
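The first gap can be sketched in a few lines of Go. This is a minimal illustration, not the repository's code: the `ToolCall` and `Ledger` types, the tool names, and the lowercase helpers `pathsProvenInSession` and `verifyPathEvidence` are hypothetical stand-ins for the real `PathsProvenInSession` / `verifyStepEvidence`, whose signatures are not shown in this PR. It assumes the ledger is a set of paths that is emptied at the start of each turn, and that the session fallback accepts a path only if a non-errored tool call touched it.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// ToolCall is a hypothetical transcript record of one tool invocation.
type ToolCall struct {
	Tool    string // illustrative names such as "write_file" or "read_file"
	Path    string
	Errored bool
}

// Ledger is a hypothetical per-turn evidence ledger keyed by cleaned path.
type Ledger struct{ paths map[string]bool }

func (l *Ledger) Reset()            { l.paths = map[string]bool{} }
func (l *Ledger) Record(p string)   { l.paths[filepath.Clean(p)] = true }
func (l *Ledger) Has(p string) bool { return l.paths[filepath.Clean(p)] }

// pathsProvenInSession reports whether every cited path matches a
// successful (non-errored) tool call anywhere in the session.
func pathsProvenInSession(session []ToolCall, cited []string) bool {
	proven := map[string]bool{}
	for _, c := range session {
		if !c.Errored && c.Path != "" {
			proven[filepath.Clean(c.Path)] = true
		}
	}
	for _, p := range cited {
		if !proven[filepath.Clean(p)] {
			return false
		}
	}
	return true
}

// verifyPathEvidence checks the per-turn ledger first and consults the
// session only when the ledger misses.
func verifyPathEvidence(l *Ledger, session []ToolCall, cited []string) bool {
	if len(cited) == 0 {
		return false
	}
	inTurn := true
	for _, p := range cited {
		if !l.Has(p) {
			inTurn = false
			break
		}
	}
	return inTurn || pathsProvenInSession(session, cited)
}

func main() {
	session := []ToolCall{
		{Tool: "write_file", Path: "internal/agent/gate.go"},
		{Tool: "write_file", Path: "internal/agent/broken.go", Errored: true},
	}
	ledger := &Ledger{}
	ledger.Reset() // a new turn starts: earlier receipts are gone from the ledger

	fmt.Println(verifyPathEvidence(ledger, session, []string{"./internal/agent/gate.go"})) // true
	fmt.Println(verifyPathEvidence(ledger, session, []string{"internal/agent/broken.go"})) // false
	fmt.Println(verifyPathEvidence(ledger, session, []string{"internal/agent/made_up.go"})) // false
}
```

Without the fallback branch, the first call would also return false, which corresponds to the "no matching writer receipt in this turn" rejection described above.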

What

diff/files session fallback (mirrors fix(agent): reduce evidence verification false signals #3587). PathsProvenInSession resolves a path against any successful (non-errored) write/read tool call in the transcript; verifyStepEvidence consults it when the per-turn ledger misses. Fabricated paths are still rejected.
Host-side canonical todo list. Agent.todoState is the latest todo_write with completions applied, rebuilt from the session on load/rewind (latest todo_write + replayed complete_steps). It never rides in the prompt, so it survives turn boundaries and compaction. The final gate falls back to it when a turn did work without a todo_write, so a premature "all done" is blocked until the plan is genuinely finished — bounded by the existing maxFinalReadinessBlocks.
complete_step advances the list. A successful sign-off marks the matching step completed, promotes the next, records a synthetic todo_write for the in-turn gate, and emits an event so the panel updates. The model is told it no longer needs a todo_write to mark completions, which removes the manual batch-completion step that [Bug] Agent batch-submitting complete_step causes todo_write validation to fail, violating the serial work mode #3909 tripped over (planApprovedMessage / executor handoff / tool copy updated to match).
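The canonical list and its rebuild can be illustrated with a short Go sketch. Everything here is an assumption made for illustration: the `Todo`, `Status`, and `Event` types, the status strings, and the `advance` / `rebuild` helpers are hypothetical and do not reproduce `Agent.todoState`. The sketch assumes that "promote the next" means moving the first pending step to in-progress when nothing else is in progress, and that rebuilding replays only the complete_steps recorded after the latest todo_write.

```go
package main

import "fmt"

type Status string

const (
	Pending    Status = "pending"
	InProgress Status = "in_progress"
	Completed  Status = "completed"
)

type Todo struct {
	Step   string
	Status Status
}

// Event is a hypothetical session record: a todo_write or a complete_step.
type Event struct {
	Kind   string // "todo_write" or "complete_step"
	Todos  []Todo // set for todo_write
	Step   string // set for complete_step
	Failed bool   // true when the complete_step was rejected
}

// advance marks the matching step completed and, if nothing is in
// progress afterwards, promotes the first pending step.
func advance(todos []Todo, step string) bool {
	found := false
	for i := range todos {
		if todos[i].Step == step && todos[i].Status != Completed {
			todos[i].Status = Completed
			found = true
			break
		}
	}
	if !found {
		return false
	}
	for _, t := range todos {
		if t.Status == InProgress {
			return true
		}
	}
	for i := range todos {
		if todos[i].Status == Pending {
			todos[i].Status = InProgress
			break
		}
	}
	return true
}

// rebuild takes the latest todo_write and replays the successful
// complete_steps that follow it; failed ones are skipped.
func rebuild(session []Event) []Todo {
	last := -1
	for i, e := range session {
		if e.Kind == "todo_write" {
			last = i
		}
	}
	if last < 0 {
		return nil
	}
	todos := append([]Todo(nil), session[last].Todos...)
	for _, e := range session[last+1:] {
		if e.Kind == "complete_step" && !e.Failed {
			advance(todos, e.Step)
		}
	}
	return todos
}

func main() {
	session := []Event{
		{Kind: "todo_write", Todos: []Todo{
			{"parse", InProgress}, {"lint", Pending}, {"test", Pending},
		}},
		{Kind: "complete_step", Step: "parse"},
		{Kind: "complete_step", Step: "lint", Failed: true},
	}
	fmt.Println(rebuild(session)) // [{parse completed} {lint in_progress} {test pending}]
}
```

Because the list is derived from session records rather than from prompt text, a sketch like this needs no input from the model, which is consistent with the "host-side, zero added tokens" framing of the PR.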

All host-side, zero added tokens.

Tests

Unit: cross-turn diff/files fallback, canonical gate fallback, advance + promote, rebuild (incl. skipping failed complete_steps).

End-to-end (real complete_step/todo_write builtins driven through Agent.Run across turns):

serial plan, host auto-advances, no batch todo_write, final answer allowed
cd-prefixed command drift accepted in-turn
cross-turn diff evidence accepted via session fallback (and an unbacked path still rejected — asserted at the tool-result level, not just Run exit)
cross-turn canonical gate blocks a premature "all done", then clears once the steps are actually signed off
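The block-then-clear behaviour in the last scenario can be sketched as follows. This is a hedged illustration only: the `Gate` type, its `AllowFinal` method, and the `Todo` shape are hypothetical, and the value given to `maxFinalReadinessBlocks` is made up (the PR names the constant but not its value). The sketch also assumes that reaching the bound lets the answer through rather than blocking forever.

```go
package main

import "fmt"

// Todo is a hypothetical plan step with a signed-off flag.
type Todo struct {
	Step string
	Done bool
}

// Assumed value for illustration; the real constant's value is not given in the PR.
const maxFinalReadinessBlocks = 2

// Gate is a hypothetical final-answer gate that counts how often it has blocked.
type Gate struct{ blocks int }

// AllowFinal prefers this turn's todo_write list and falls back to the
// host-side canonical list when the turn did work without one.
func (g *Gate) AllowFinal(turnTodos, canonical []Todo, didWork bool) (bool, string) {
	list := turnTodos
	if list == nil && didWork {
		list = canonical
	}
	open := -1
	for i, t := range list {
		if !t.Done {
			open = i
			break
		}
	}
	if open < 0 {
		return true, ""
	}
	if g.blocks >= maxFinalReadinessBlocks {
		return true, "" // bound reached: stop blocking (assumed behaviour)
	}
	g.blocks++
	return false, fmt.Sprintf("step %q is not signed off", list[open].Step)
}

func main() {
	canonical := []Todo{{"parse", true}, {"lint", false}}
	g := &Gate{}

	ok, why := g.AllowFinal(nil, canonical, true)
	fmt.Println(ok, why) // false step "lint" is not signed off

	canonical[1].Done = true // complete_step signs off the remaining step
	ok, _ = g.AllowFinal(nil, canonical, true)
	fmt.Println(ok) // true
}
```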

go test ./... green; gofmt/go vet clean.

Closes #2917

The evidence ledger resets every turn, so a complete_step citing work from an earlier turn (or after compaction) was rejected, and the final-answer gate, reading only this turn's todo_write, let a stale plan slip past with an "all done".

- diff/files evidence now falls back to the full session like command evidence already did (#3587): a cross-turn citation of a written or read file is honored, fabrication still rejected.
- the host keeps a canonical todo list (latest todo_write + replayed complete_steps, rebuilt on session load/rewind) that survives turn boundaries and compaction; the final gate consults it when the turn did work without a todo_write, so a premature "all done" is blocked until the plan is actually finished.
- a successful complete_step advances that list (marks the step done, promotes the next, records it for the in-turn gate) and the model is told it no longer needs a todo_write to mark completions, removing the batch-completion step #3909 hit.

Adds unit + end-to-end coverage (serial host-advance, command drift, cross-turn diff via session, cross-turn gate block-then-clear, rejection).

Closes #2917

esengine requested a review from SivanCola as a code owner June 11, 2026 10:09

github-actions Bot added labels Jun 11, 2026: v2 (Go rewrite (1.x), main-v2 branch, active development), skills (Skill system: internal/skill, internal/tool), agent (Core agent loop: internal/agent, internal/control)

esengine merged commit 6dee96c into main-v2 Jun 11, 2026
14 checks passed

esengine deleted the fix/complete-step-cross-turn-loop branch June 11, 2026 10:27

This was referenced Jun 12, 2026

fix(agent): use unique IDs for host-advance todo_write events to prevent frontend panel merge #4132 (Merged)

fix(agent): persist canonical todo state across tab switches and restarts #4159 (Closed)
