
fix: mark discarded no-reasoning prompts as completed in batch_runner #10042

Closed
nightq wants to merge 3 commits into NousResearch:main from nightq:fix/issue-9950-batch-runner-discard

Conversation


@nightq nightq commented Apr 15, 2026


Summary

Fixes batch_runner --resume infinitely retrying prompts that were intentionally discarded for having no reasoning.

Root Cause

_process_batch_worker() discards prompts with has_any_reasoning == False by incrementing discarded_no_reasoning and skipping to the next prompt with continue. However, these prompt indices were never added to completed_in_batch, so the checkpoint did not record them as finished. On --resume, they were treated as unfinished work and retried on every run.

Fix

Append the prompt index to completed_in_batch before the continue in the discard branch, so the checkpoint reflects the terminal outcome.
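
A minimal sketch of the discard branch with the fix applied. Only has_any_reasoning, discarded_no_reasoning, completed_in_batch and the prompt index come from this PR; process_batch, run_prompt and the shape of the result dict are hypothetical stand-ins for the real _process_batch_worker() internals.

```python
from typing import Callable, Dict, List, Tuple


def process_batch(
    prompts: List[Tuple[int, str]],
    run_prompt: Callable[[str], Dict],
) -> Dict:
    """Hypothetical, simplified worker loop (not the real _process_batch_worker)."""
    completed_in_batch: List[int] = []
    discarded_no_reasoning = 0
    kept: List[Dict] = []

    for prompt_index, prompt in prompts:
        # Assumption: run_prompt returns a dict with a has_any_reasoning flag.
        result = run_prompt(prompt)

        if not result["has_any_reasoning"]:
            discarded_no_reasoning += 1
            # The fix: a discard is a terminal outcome, so record the index
            # before skipping to the next prompt.
            completed_in_batch.append(prompt_index)
            continue

        kept.append(result)
        completed_in_batch.append(prompt_index)

    return {
        "completed_in_batch": completed_in_batch,
        "discarded_no_reasoning": discarded_no_reasoning,
        "kept": kept,
    }
```

Without the append inside the discard branch, index 1 in a batch such as [(0, "a"), (1, "empty"), (2, "b")] would never reach completed_in_batch, which is the behavior described under Root Cause.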

Test Plan

  • New regression test verifies discarded prompts appear in completed_prompts
  • Existing checkpoint tests still pass
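
The regression test could look roughly like the sketch below. It does not use the real batch_runner API: run_once and pending_on_resume are hypothetical helpers that model one run and the resume filter, and the record_discards flag exists only to contrast the old and new behavior.

```python
from typing import Dict, List


def run_once(
    pending: List[int],
    has_reasoning: Dict[int, bool],
    record_discards: bool = True,
) -> List[int]:
    """Hypothetical single run: returns the indices written to the checkpoint."""
    completed_prompts: List[int] = []
    for prompt_index in pending:
        if not has_reasoning[prompt_index]:
            if record_discards:  # behavior with the fix
                completed_prompts.append(prompt_index)
            continue
        completed_prompts.append(prompt_index)
    return completed_prompts


def pending_on_resume(all_indices: List[int], completed_prompts: List[int]) -> List[int]:
    """Indices a resumed run would still consider unfinished."""
    done = set(completed_prompts)
    return [i for i in all_indices if i not in done]


def test_discarded_prompts_are_not_retried_on_resume():
    all_indices = [0, 1, 2]
    has_reasoning = {0: True, 1: False, 2: True}
    completed_prompts = run_once(all_indices, has_reasoning)
    assert 1 in completed_prompts
    assert pending_on_resume(all_indices, completed_prompts) == []


def test_old_behavior_retries_discarded_prompt():
    all_indices = [0, 1, 2]
    has_reasoning = {0: True, 1: False, 2: True}
    completed_prompts = run_once(all_indices, has_reasoning, record_discards=False)
    # The discarded prompt stays pending, so every resume would retry it.
    assert pending_on_resume(all_indices, completed_prompts) == [1]
```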

Closes #9950

nightq added 3 commits April 15, 2026 11:41
Fixes NousResearch#9999

Root cause: _sanitize_api_messages compared raw tool_call_id strings
without stripping whitespace, causing valid tool results to be treated
as orphaned when IDs had leading/trailing spaces.
Fix: strip whitespace in _get_tool_call_id_static and when collecting
result_call_ids from tool messages.
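
A rough illustration of the whitespace fix described in this commit. The helper names normalize_tool_call_id and find_orphaned_tool_results are hypothetical, and the message shape (assistant messages carrying tool_calls with an id, tool messages carrying tool_call_id) is an assumption, not the actual _sanitize_api_messages code.

```python
from typing import Any, Dict, List


def normalize_tool_call_id(raw: Any) -> str:
    """Strip surrounding whitespace so ' call_1 ' and 'call_1' compare equal."""
    return raw.strip() if isinstance(raw, str) else ""


def find_orphaned_tool_results(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return tool messages whose normalized ID matches no assistant tool call."""
    call_ids = set()
    for message in messages:
        if message.get("role") == "assistant":
            for tool_call in message.get("tool_calls") or []:
                call_ids.add(normalize_tool_call_id(tool_call.get("id")))

    return [
        message
        for message in messages
        if message.get("role") == "tool"
        and normalize_tool_call_id(message.get("tool_call_id")) not in call_ids
    ]
```

Comparing the raw strings instead would flag a result with tool_call_id "call_1" as orphaned when the assistant message recorded " call_1 ".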

Fixes NousResearch#9980

Root cause: _send_raw_message hardcoded 'chat_id' as receive_id_type,
causing [230001] invalid receive_id errors when sending to user open_ids
(prefix 'ou_') or union_ids (prefix 'on_').
Fix: Add _detect_receive_id_type() that checks ID prefix (oc_→chat_id,
ou_→open_id, on_→union_id) and use it in _send_raw_message.
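
The prefix mapping in this commit can be sketched as follows. This is an illustration, not the code from the PR: the function name drops the leading underscore, and falling back to chat_id for unrecognized prefixes is an assumption chosen to match the previously hardcoded value.

```python
# Prefix-to-type mapping as listed in the commit message.
_RECEIVE_ID_PREFIXES = {
    "oc_": "chat_id",
    "ou_": "open_id",
    "on_": "union_id",
}


def detect_receive_id_type(receive_id: str) -> str:
    """Guess the receive_id_type from the ID prefix."""
    for prefix, id_type in _RECEIVE_ID_PREFIXES.items():
        if receive_id.startswith(prefix):
            return id_type
    # Assumption: keep the old hardcoded value when the prefix is unknown.
    return "chat_id"
```

With this mapping, an ID such as "ou_abc" resolves to open_id instead of being sent as a chat_id.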

Fixes NousResearch#9950

Root cause: Prompts discarded for having no reasoning were not added to
completed_in_batch, causing --resume to retry them indefinitely.
Fix: Append prompt_index to completed_in_batch before continuing past
the discard branch.
@teknium1

Contributor

Closed in favor of PR #12997, which fixes the same issue. The Feishu and tool_call_id changes in your PR are separate concerns; consider submitting them as individual PRs. Thanks @nightq!

@teknium1 teknium1 closed this Apr 20, 2026

Development

Successfully merging this pull request may close these issues.

batch_runner retries intentionally discarded no-reasoning prompts on every --resume

2 participants