Bug Description
_process_batch_worker() intentionally discards prompts whose trajectories have has_any_reasoning == False, but those prompt indices are never added to completed_prompts. On --resume, they are retried forever.
Affected files / lines
batch_runner.py:441-447
batch_runner.py:493-497
Why this is a bug
These prompts are not failures; they are explicit discard decisions. Because they are omitted from completed_in_batch, resume treats them as unfinished work and re-spends tokens on them indefinitely.
Minimal Reproduction
Monkeypatch _process_single_prompt() to return:
{
    "success": True,
    "trajectory": [{"role": "assistant", "content": "x"}],
    "reasoning_stats": {"has_any_reasoning": False},
    "tool_stats": {},
    "metadata": {},
    "completed": True,
    "api_calls": 1,
    "toolsets_used": [],
}
Then run, with tmpdir pointing at a writable temporary directory:
out = _process_batch_worker((1, [(0, {"prompt": "hi"})], tmpdir, set(), {"verbose": False}))
print(out["discarded_no_reasoning"], out["completed_prompts"])
Observed output:
discarded_no_reasoning == 1
completed_prompts == []
Expected Behavior
Once a prompt is explicitly discarded for no reasoning, resume should treat that prompt as completed/dispositioned and skip it on later runs.
Actual Behavior
Discarded prompts never enter the completed set, so --resume retries them on every run.
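The retry loop can be illustrated with a toy model. This is only a sketch of the behaviour described in this report, not the real batch_runner.py logic: pending_indices() and run_once() are hypothetical helpers, and the sketch assumes resume selects every index that is missing from the completed set.

```python
def pending_indices(all_indices, completed):
    """Return the indices a resumed run would still process."""
    return [i for i in all_indices if i not in completed]


def run_once(all_indices, completed, discards):
    """Toy stand-in for one batch run (hypothetical, not the real worker).

    Indices in `discards` model the has_any_reasoning == False case:
    they are processed, but never added to `completed`.
    """
    processed = pending_indices(all_indices, completed)
    for i in processed:
        if i not in discards:
            completed.add(i)
    return processed


completed = set()
# Three consecutive runs over prompts 0..2, where prompt 1 is always discarded.
history = [run_once([0, 1, 2], completed, discards={1}) for _ in range(3)]
print(history)  # [[0, 1, 2], [1], [1]]
```

In this model, prompt 1 is reprocessed on every run after the first, which matches the observed behaviour with --resume.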
Suggested Investigation Direction
Record discarded prompt indices separately (or include them in completed_prompts) so the resume checkpoint represents all terminal outcomes, not just saved trajectories.
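One possible shape for the first option is sketched below. Everything here is an assumption made for illustration: the JSON checkpoint layout, the discarded_prompts key, and the helper names load_checkpoint(), save_checkpoint(), record_outcome() and terminal_indices() are hypothetical and are not taken from batch_runner.py.

```python
import json
import tempfile
from pathlib import Path


def load_checkpoint(path):
    """Load a checkpoint, tolerating files written before discards were tracked."""
    p = Path(path)
    data = json.loads(p.read_text()) if p.exists() else {}
    data.setdefault("completed_prompts", [])
    data.setdefault("discarded_prompts", [])  # hypothetical new key
    return data


def save_checkpoint(path, checkpoint):
    Path(path).write_text(json.dumps(checkpoint))


def record_outcome(checkpoint, prompt_index, discarded):
    """Record a terminal outcome: a saved trajectory or an explicit discard."""
    key = "discarded_prompts" if discarded else "completed_prompts"
    if prompt_index not in checkpoint[key]:
        checkpoint[key].append(prompt_index)


def terminal_indices(checkpoint):
    """Every index that resume should skip, whatever the reason."""
    return set(checkpoint["completed_prompts"]) | set(checkpoint["discarded_prompts"])


with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = Path(tmp) / "checkpoint.json"
    ckpt = load_checkpoint(ckpt_path)
    record_outcome(ckpt, 0, discarded=False)  # trajectory saved
    record_outcome(ckpt, 1, discarded=True)   # discarded for no reasoning
    save_checkpoint(ckpt_path, ckpt)

    reloaded = load_checkpoint(ckpt_path)
    remaining = [i for i in range(3) if i not in terminal_indices(reloaded)]

print(remaining)  # [2]
```

Keeping discards in their own list preserves the distinction between saved trajectories and discard decisions, so statistics based on completed_prompts would not change, while resume still skips both.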