[RFC][v2] Host-verified todo completion transitions #2572

@GTC2080

Problem

After #2540 and #2569, complete_step can verify evidence receipts and match the claimed step against the current todo list. The remaining gap is the reverse direction: todo_write can still mark an item completed without the host confirming that a matching complete_step receipt was recorded in the same turn.

Proposal

Add a runtime-only guard on todo completion transitions. When a successful complete_step receipt exists in the current turn, a later todo_write in that turn that changes a matching todo item to completed may be accepted. A todo_write that newly marks an item completed without a matching successful complete_step receipt in the current turn should be rejected.

Matching should follow the Phase 2 behavior: a receipt matches a todo item by the item's content, its activeForm, or its 1-based item number. The guard state should live in memory for the current turn only and should not write receipt data into prompts or persistent session state.
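
To make the proposed rule concrete, here is a minimal Python sketch of the guard. It is illustrative only and not the project's implementation: `Todo`, `TurnLedger`, `claim_matches`, and `check_todo_write` are hypothetical names, the status strings are assumed, and the sketch assumes the 1-based item number refers to a position in the list submitted by todo_write.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Todo:
    # Hypothetical shape of a todo item; field names are assumptions.
    content: str
    active_form: str
    status: str  # assumed values: "pending", "in_progress", "completed"


@dataclass
class TurnLedger:
    """Hypothetical in-memory record of successful complete_step claims."""

    claims: list[str] = field(default_factory=list)

    def record_success(self, claimed_step: str) -> None:
        self.claims.append(claimed_step.strip())

    def reset(self) -> None:
        # Called at turn boundaries; nothing is persisted or sent to prompts.
        self.claims.clear()


def claim_matches(claim: str, todo: Todo, number: int) -> bool:
    """Match by todo content, activeForm, or 1-based item number."""
    return claim in (todo.content, todo.active_form, str(number))


def check_todo_write(
    old: list[Todo], new: list[Todo], ledger: TurnLedger
) -> list[str]:
    """Return rejection reasons; an empty list means the write is accepted."""
    already_completed = {t.content for t in old if t.status == "completed"}
    errors: list[str] = []
    for number, todo in enumerate(new, start=1):
        if todo.status != "completed" or todo.content in already_completed:
            continue  # not a new completion transition, so nothing to verify
        if not any(claim_matches(c, todo, number) for c in ledger.claims):
            errors.append(
                f"item {number} ({todo.content!r}) has no matching "
                "successful complete_step receipt in this turn"
            )
    return errors


# Usage: the same write is rejected before the receipt and accepted after it.
old = [Todo("Write tests", "Writing tests", "in_progress")]
new = [Todo("Write tests", "Writing tests", "completed")]
ledger = TurnLedger()
rejected = check_todo_write(old, new, ledger)  # one rejection reason
ledger.record_success("Writing tests")         # matched via activeForm
accepted = check_todo_write(old, new, ledger)  # empty list
```

Only items that newly become completed are checked, so rewriting a list whose items were already completed in an earlier turn does not require a fresh receipt. Resetting the ledger at each turn boundary is what keeps the guard runtime-only.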

Non-goals

  • No UI changes.
  • No performance claims or optimization.
  • No auto-plan, multi-agent, goal, cache, MCP, or hook changes.
  • No prompt, tool schema, or tool list changes.
  • No persistence of receipt data.

Conflict check

This should avoid the current desktop UI, MCP, multi-agent, goal, cache, and PostLLMCall hook PR areas. The implementation should stay near the runtime evidence ledger, todo_write, complete_step, and focused agent flow tests.

Review evidence plan

PRs will link this RFC. Because this phase has no UI changes, screenshots are not applicable. Because it makes no performance claim and avoids prompt/tool-schema changes, cache/token/runtime metrics are not expected. If scope changes, the PR must include the required screenshots or performance data.
