Context
The sync _run_batch() path produces detailed per-column progress logs (worker count, records/sec, ETA, emoji progression). The async AsyncTaskScheduler path logs only start/end and salvage rounds, with no per-column progress during the main generation phase.
Example sync output:
⚡️ Processing llm-text column 'recipe_idea' with 4 concurrent workers
|-- 🐴 llm-text column 'recipe_idea' progress: 1/3 (33%) complete, 1 ok, 0 failed, 1.54 rec/s, eta 1.3s
|-- 🚗 llm-text column 'recipe_idea' progress: 2/3 (67%) complete, 2 ok, 0 failed, 3.05 rec/s, eta 0.3s
Async output during generation is silent.
Proposed approach
Reuse the existing ProgressTracker class:
- Initialize dict[str, ProgressTracker] in AsyncTaskScheduler.__init__()
- Wire record_success()/record_failure() in _execute_task_inner() after task completion
- Emit log_final() when columns complete
- ~100 lines, can be done incrementally (basic counts first, emoji/ETA polish later)
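A rough sketch of the wiring described above, for discussion only. SketchProgressTracker and SketchScheduler are hypothetical stand-ins: the real ProgressTracker constructor, the real AsyncTaskScheduler fields, and the real signature of _execute_task_inner() are not shown in this issue and will likely differ. Only the method names record_success(), record_failure(), and log_final() come from the proposal.

```python
import asyncio
import time
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class SketchProgressTracker:
    """Hypothetical stand-in for ProgressTracker (real API may differ)."""

    column: str
    total: int
    ok: int = 0
    failed: int = 0
    started: float = field(default_factory=time.monotonic)

    def record_success(self) -> None:
        self.ok += 1

    def record_failure(self) -> None:
        self.failed += 1

    @property
    def done(self) -> int:
        return self.ok + self.failed

    def log_final(self) -> str:
        # Basic counts and throughput only; emoji and ETA polish can come later.
        elapsed = max(time.monotonic() - self.started, 1e-9)
        pct = 100 * self.done // self.total if self.total else 100
        line = (
            f"column '{self.column}' progress: {self.done}/{self.total} "
            f"({pct}%) complete, {self.ok} ok, {self.failed} failed, "
            f"{self.done / elapsed:.2f} rec/s"
        )
        print(line)
        return line


class SketchScheduler:
    """Hypothetical stand-in for AsyncTaskScheduler, reduced to the tracking logic."""

    def __init__(self, column_totals: dict[str, int]) -> None:
        # One tracker per column, created up front in __init__().
        self.trackers: dict[str, SketchProgressTracker] = {
            name: SketchProgressTracker(name, total)
            for name, total in column_totals.items()
        }

    async def _execute_task_inner(
        self, column: str, task: Callable[[], Awaitable[None]]
    ) -> None:
        tracker = self.trackers[column]
        try:
            await task()
        except Exception:
            tracker.record_failure()
        else:
            tracker.record_success()
        # Emit the final line once every record for the column is accounted for.
        if tracker.done == tracker.total:
            tracker.log_final()
```

Because asyncio tasks run on a single thread and the counters are updated without an intervening await, this sketch needs no lock. That assumption would have to be revisited if the real scheduler updates trackers from worker threads.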
Related