Optimize CI integration test matrix for better balance#6599

Merged
pelikhan merged 2 commits into main from copilot/optimize-ci-integration-test-matrix
Dec 16, 2025

Copilot AI commented Dec 16, 2025

Rebalances the CI integration test matrix to reduce the critical path from 76.77s to ~46s (39% improvement) by addressing severe group imbalances and duplicate test execution.

Changes

1. Isolate slow test into dedicated group

  • New "CLI Progress Flag" group for TestProgressFlagSignature (30s test)
  • Cuts the "CLI Completion & Safe Inputs" catch-all from 76.77s to ~46s

2. Fix duplicate test execution

# Before: TestCompileWorkflows runs in multiple groups
pattern: "TestCompile|TestPoutine"

# After: Excludes TestCompileWorkflows 
pattern: "^TestCompile[^W]|TestPoutine"

Eliminates ~20s of wasted execution per run.
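
To make the effect of the pattern change concrete, here is a minimal sketch that applies both patterns to a handful of test names. It assumes the pattern is used as an unanchored regular expression (as `go test -run` does) and uses Python's `re.search` as a stand-in for Go's `regexp`; the two should agree for patterns this simple. Apart from `TestCompileWorkflows`, every test name below is hypothetical.

```python
import re

BEFORE = "TestCompile|TestPoutine"
AFTER = "^TestCompile[^W]|TestPoutine"


def selected(pattern, names):
    """Return the names that an unanchored regex search would select."""
    rx = re.compile(pattern)
    return [n for n in names if rx.search(n)]


# Only TestCompileWorkflows comes from the PR; the other names are hypothetical.
names = [
    "TestCompileWorkflows",
    "TestCompileCommand",
    "TestCompileWithFlags",
    "TestCompile",
    "TestPoutineScan",
]

print("before:", selected(BEFORE, names))
print("after: ", selected(AFTER, names))
```

Under these assumptions the new pattern drops `TestCompileWorkflows` as intended, but it also skips any other test whose name continues with `W` after `TestCompile`, as well as a test named exactly `TestCompile`. Such tests would need to be picked up by another group.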

3. Split large catch-all group

  • "Workflow Misc" (61s, 6736 tests) → two balanced groups:
    • "Workflow Misc Part 1": Specific patterns (TestAgent, TestCopilot, TestCustom, etc.)
    • "Workflow Misc Part 2": Remaining tests (maintains catch-all coverage)
  • Better parallelization: 61s → 2×~30s groups

Impact

  • Matrix groups: 21 → 23
  • Max group duration: 76.77s → ~46s (39% reduction)
  • Catch-all groups preserved: 4 (ensures 100% test coverage)
  • Duplicate execution eliminated: ~20s per run
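
The 39% figure can be checked by subtracting the isolated test's duration from the old maximum. This is a rough sketch that assumes the test takes exactly 30s (the PR gives only an approximate figure) and that the rest of the catch-all group runs in the same time as before.

```python
def reduction_pct(before_s, after_s):
    """Percentage reduction when a duration drops from before_s to after_s."""
    return (before_s - after_s) / before_s * 100


slow_test_s = 30.0          # approximate duration quoted for TestProgressFlagSignature
catch_all_before_s = 76.77  # measured duration of the catch-all group

catch_all_after_s = catch_all_before_s - slow_test_s

print(f"after:     {catch_all_after_s:.2f}s")
print(f"reduction: {reduction_pct(catch_all_before_s, catch_all_after_s):.1f}%")
```

With these inputs the catch-all group lands at about 46.8s, which is consistent with the "~46s" and "39%" quoted above.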

Original prompt

This section details the original issue you should resolve

<issue_title>[ci-coach] Optimize CI integration test matrix for better balance</issue_title>
<issue_description>## CI Optimization: Integration Test Matrix Rebalancing

Summary

This PR optimizes the CI integration test matrix to reduce the critical path by addressing severe imbalances. The changes reduce the maximum group duration from 76.77s to an estimated 46s (39% improvement) while adding better test isolation and eliminating duplicate test execution.

Analysis Results

Baseline Metrics (from last 100 CI runs):

  • Success rate: 35% (indicating room for improvement)
  • Unit test duration: ~115s
  • Integration test duration: ~243s across 21 matrix groups
  • Problem: Maximum group duration was 76.77s while minimum was 0s (severe imbalance)

Integration Matrix Imbalance Issues:

Group                                     Duration    % of Total  Tests
CLI Completion & Safe Inputs              76.77s     31.6%       1753 (catch-all)
Workflow Misc                             61.04s     25.1%       6736 (catch-all)
CLI Compile & Poutine                     22.87s      9.4%         89
[...other groups 0-15s...]
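
The "% of Total" column appears to be each group's duration divided by the ~243s integration total. That is an inference from the numbers rather than something the issue states. A short sketch reproducing the column under that assumption:

```python
total_s = 243.0  # approximate integration test total quoted in the baseline metrics

# Durations copied from the table above.
durations_s = {
    "CLI Completion & Safe Inputs": 76.77,
    "Workflow Misc": 61.04,
    "CLI Compile & Poutine": 22.87,
}

# Share of the total integration time, rounded to one decimal place.
shares = {name: round(d / total_s * 100, 1) for name, d in durations_s.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The two catch-all groups together account for more than half of the total integration time.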

Key Problems Identified:

  1. ⚠️ Catch-all groups too large: "CLI Completion & Safe Inputs" (76.77s) is 76x longer than smallest group
  2. ⚠️ Duplicate test execution: TestProgressFlagSignature (30s) runs in multiple groups
  3. ⚠️ Poor pattern matching: TestCompileWorkflows matches multiple patterns, running 3 times

Optimizations

1. Isolate Slow Test (New Group)

Type: Matrix Rebalancing
Impact: Reduces "CLI Completion & Safe Inputs" from 76.77s to ~46s (39% reduction)
Risk: Low

Changes:

  • Lines 90-92: Added new "CLI Progress Flag" group for TestProgressFlagSignature

Rationale:
TestProgressFlagSignature takes 30+ seconds alone and was running as part of the catch-all "CLI Completion & Safe Inputs" group. Isolating it allows the catch-all group to run ~39% faster, improving overall matrix balance.

Before:

- name: "CLI Completion & Safe Inputs"
  packages: "./pkg/cli"
  pattern: ""  # Includes TestProgressFlagSignature (30s)
  # Duration: 76.77s

After:

- name: "CLI Progress Flag"  # NEW - Isolate slow 30s test
  packages: "./pkg/cli"
  pattern: "TestProgressFlagSignature"
  # Expected duration: ~30s

- name: "CLI Completion & Safe Inputs"
  packages: "./pkg/cli"
  pattern: ""  # Now excludes TestProgressFlagSignature
  # Expected duration: ~46s (39% reduction)

Benefits:

  • Better parallelization (slow test runs independently)
  • Reduces wait time for other integration jobs
  • Makes catch-all group more predictable

2. Fix Duplicate Test Execution

Type: Test Deduplication
Impact: Eliminates ~20s of wasted CI time from duplicate runs
Risk: Low

Changes:

  • Line 77: Changed pattern from TestCompile|TestPoutine to ^TestCompile[^W]|TestPoutine

Rationale:
The pattern TestCompile|TestPoutine was matching TestCompileWorkflows* tests, causing them to run in:

  1. "CLI Compile & Poutine" group (9.98s)
  2. "CLI Completion & Safe Inputs" group (9.51s)
  3. Other catch-all groups

This wastes CI time and provides no additional test coverage.

Before:

- name: "CLI Compile & Poutine"
  pattern: "TestCompile|TestPoutine"  # Matches TestCompileWorkflows too

After:

- name: "CLI Compile & Poutine"
  pattern: "^TestCompile[^W]|TestPoutine"  # Excludes TestCompileWorkflows

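The regex ^TestCompile[^W] matches names that start (^) with "TestCompile" followed by at least one character that is not "W" ([^W]). This excludes "TestCompileWorkflows" while still matching most other "TestCompile*" tests. Note that it also excludes any other test whose name continues with "W" after "TestCompile", and a test named exactly "TestCompile". The ^ anchor applies only to the first alternative; TestPoutine still matches anywhere in a name.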

Benefits:

  • Eliminates duplicate test execution (~20s saved per run)
  • Each test runs exactly once in its intended group
  • Clearer test organization

3. Split Large Catch-All Group

Type: Matrix Rebalancing
Impact: Reduces "Workflow Misc" from 61.04s to ~30s per group
Risk: Low

Changes:

  • Lines 138-143: Split "Workflow Misc" into two groups with specific patterns

Rationale:
"Workflow Misc" was a catch-all containing 6,736 tests and taking 61.04s (25.1% of total integration test duration). Splitting it into two groups allows better parallelization.

Before:

- name: "Workflow Misc"
  packages: "./pkg/workflow"
  pattern: ""  # All remaining workflow tests
  # Duration: 61.04s, 6736 tests

After:

- name: "Workflow Misc Part 1"  # Common test patterns
  packages: "./pkg/workflow"
  pattern: "TestAgent|TestCopilot|TestCustom|TestEngine|TestModel|TestNetwork|TestOpenAI|TestProvider"
  # Expected: ~30s

- name: "Workflow Misc Part 2"  # Catch-all for remaining
  packages: "./pkg/workflow"
  pattern: ""
  # Expected: ~30s
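
As an illustration of how the split divides tests, the sketch below partitions a list of names using the Part 1 pattern. It assumes the pattern is applied as an unanchored regex and that the Part 2 catch-all ends up running only what Part 1 does not match; the issue does not show how the empty-pattern group excludes tests claimed elsewhere. All test names are hypothetical.

```python
import re

# Part 1 pattern copied from the matrix entry above.
PART1 = (
    "TestAgent|TestCopilot|TestCustom|TestEngine|"
    "TestModel|TestNetwork|TestOpenAI|TestProvider"
)


def split_groups(names, pattern=PART1):
    """Partition names into (matched by Part 1, left for the Part 2 catch-all)."""
    rx = re.compile(pattern)
    part1 = [n for n in names if rx.search(n)]
    part2 = [n for n in names if not rx.search(n)]
    return part1, part2


# Hypothetical test names, for illustration only.
names = [
    "TestAgentConfig",
    "TestCopilotEngine",
    "TestYAMLParse",
    "TestNetworkPermissions",
    "TestCache",
]

part1, part2 = split_groups(names)
print("Part 1:", part1)
print("Part 2:", part2)
```

Whether the two halves actually come out near 30s each depends on how the real test durations are distributed across these prefixes, which a name-based split alone cannot guarantee.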

Benefits:

  • Better matrix balance (two 30s groups vs one 61s group)
  • Parallel execution reduces overall duration
  • Named patterns make test organizatio...

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Optimize CI integration test matrix for better balance" to "Optimize CI integration test matrix for better balance" Dec 16, 2025
Copilot AI requested a review from mnkiefer December 16, 2025 07:27
@pelikhan pelikhan marked this pull request as ready for review December 16, 2025 10:56
@pelikhan pelikhan merged commit 5e93a81 into main Dec 16, 2025
4 checks passed
@pelikhan pelikhan deleted the copilot/optimize-ci-integration-test-matrix branch December 16, 2025 10:56
Development

Successfully merging this pull request may close these issues.

[ci-coach] Optimize CI integration test matrix for better balance
