perf: parallelize _get_base_sources git show calls with TaskGroup #857

@Aureliolo

Description

Summary

PlannerWorktreeStrategy._get_base_sources (git_worktree.py) runs one git show call per changed file, sequentially. With max_files set to 50 or more, the per-call subprocess overhead adds up to measurable latency.

Finding source

Pre-PR review agents (python-reviewer, async-concurrency-reviewer) flagged this during the review of #611.

Proposed fix

Use asyncio.TaskGroup with a bounded semaphore to parallelize the git show calls. The method runs under self._lock, so running the calls concurrently introduces no race conditions with other operations.

async with asyncio.TaskGroup() as tg:
    for file_path in files:
        tg.create_task(_fetch(file_path))

Context

  • CLAUDE.md: "prefer asyncio.TaskGroup for fan-out/fan-in parallel operations in new code"
  • The same optimization applies to CompositeSemanticAnalyzer, which currently runs its analyzers sequentially

Metadata

Labels

  • scope:small (less than 1 day of work)
  • type:perf (performance optimization)
  • v0.5 (minor version v0.5)
