feat: add per-column progress logging to AsyncTaskScheduler #443

@andreatgretel

Context

The sync _run_batch() path produces detailed per-column progress logs (worker count, records/sec, ETA, emoji progression). The async AsyncTaskScheduler path logs only start/end and salvage rounds, with no per-column progress during the main generation phase.

Example sync output:

⚡️ Processing llm-text column 'recipe_idea' with 4 concurrent workers
  |-- 🐴 llm-text column 'recipe_idea' progress: 1/3 (33%) complete, 1 ok, 0 failed, 1.54 rec/s, eta 1.3s
  |-- 🚗 llm-text column 'recipe_idea' progress: 2/3 (67%) complete, 2 ok, 0 failed, 3.05 rec/s, eta 0.3s

Async output during generation is silent.

Proposed approach

Reuse the existing ProgressTracker class:

  • Initialize dict[str, ProgressTracker] in AsyncTaskScheduler.__init__()
  • Wire record_success()/record_failure() in _execute_task_inner() after task completion
  • Emit log_final() when columns complete
  • ~100 lines, can be done incrementally (basic counts first, emoji/ETA polish later)
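
A minimal sketch of the wiring described in the bullets above, under stated assumptions. The `ProgressTracker` here is a hypothetical stand-in: only the method names `record_success()`, `record_failure()`, and `log_final()` come from this issue, while its constructor, the `completed` property, and the log format are illustrative guesses. `SketchScheduler` is a toy, not the real `AsyncTaskScheduler`, and `run()` is an invented entry point. The ETA is computed as remaining records divided by the observed rate, which is consistent with the sync sample output (2 remaining at 1.54 rec/s gives about 1.3s).

```python
from __future__ import annotations

import asyncio
import logging
import time
from typing import Awaitable, Callable

logger = logging.getLogger("async_scheduler_sketch")


class ProgressTracker:
    """Hypothetical stand-in; the real ProgressTracker may differ in signature and output."""

    def __init__(self, column: str, total: int) -> None:
        self.column = column
        self.total = total
        self.ok = 0
        self.failed = 0
        self._start = time.monotonic()

    @property
    def completed(self) -> int:
        return self.ok + self.failed

    def record_success(self) -> None:
        self.ok += 1
        self._log_progress()

    def record_failure(self) -> None:
        self.failed += 1
        self._log_progress()

    def _line(self) -> str:
        elapsed = max(time.monotonic() - self._start, 1e-9)
        rate = self.completed / elapsed
        # ETA = remaining records / observed rate (assumed formula).
        eta = (self.total - self.completed) / rate if rate > 0 else float("inf")
        pct = 100 * self.completed / self.total if self.total else 100
        return (
            f"column '{self.column}' progress: {self.completed}/{self.total} "
            f"({pct:.0f}%) complete, {self.ok} ok, {self.failed} failed, "
            f"{rate:.2f} rec/s, eta {eta:.1f}s"
        )

    def _log_progress(self) -> None:
        # Intermediate lines only; the last record is reported by log_final().
        if self.completed < self.total:
            logger.info("  |-- %s", self._line())

    def log_final(self) -> None:
        logger.info("  |-- %s", self._line())


class SketchScheduler:
    """Toy scheduler illustrating the proposed wiring (not the real AsyncTaskScheduler)."""

    def __init__(self, column_totals: dict[str, int], max_workers: int = 4) -> None:
        # One tracker per column, created up front as the issue proposes.
        self._trackers = {c: ProgressTracker(c, n) for c, n in column_totals.items()}
        self._max_workers = max_workers

    async def _execute_task_inner(
        self, column: str, task: Callable[[], Awaitable[None]], sem: asyncio.Semaphore
    ) -> None:
        tracker = self._trackers[column]
        async with sem:
            try:
                await task()
            except Exception:
                tracker.record_failure()
            else:
                tracker.record_success()
            # No await between recording and this check, so exactly one task sees it.
            if tracker.completed == tracker.total:
                tracker.log_final()

    async def run(
        self, tasks: dict[str, list[Callable[[], Awaitable[None]]]]
    ) -> dict[str, ProgressTracker]:
        sem = asyncio.Semaphore(self._max_workers)  # created inside the running loop
        await asyncio.gather(
            *(self._execute_task_inner(c, t, sem) for c, ts in tasks.items() for t in ts)
        )
        return self._trackers


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")

    async def fake_generation() -> None:
        await asyncio.sleep(0.01)  # placeholder for a model call

    asyncio.run(SketchScheduler({"recipe_idea": 3}).run({"recipe_idea": [fake_generation] * 3}))
```

Because asyncio tasks on one event loop only interleave at `await` points, the counter updates in this sketch need no lock. If the real scheduler records completions from multiple threads, the tracker would need its own synchronization.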

Related
