[ci] Cap parallel test concurrency and fix fragile test timeouts by petebacondarwin · Pull Request #13604 · cloudflare/workers-sdk

petebacondarwin · 2026-04-20T07:50:39Z

Follow-up fix for #13596 and #13601, which are causing CI failures.

Problem

Removing --concurrency=1 entirely (unlimited parallelism) caused CPU starvation on GitHub Actions runners (4 vCPUs, 16GB RAM):

Fixtures: Turbo launched 23+ fixture tests simultaneously, each spawning workerd processes. start-worker-node (which normally takes 1.5s) timed out at 15s on all 3 OS platforms.
Packages: wrangler (forks pool, 233 files), miniflare (workerd), vitest-pool-workers (Verdaccio + workerd), and workers-shared (workerd pool) all ran simultaneously. The asset-worker "20,000 entry manifest" test timed out at 5s, and the vite-plugin HMR events test timed out at 5s.

Every CI failure was a test timeout caused by CPU starvation — no correctness issues, no shared state conflicts.

Root Causes

1. Too many concurrent tasks. Unlimited concurrency on a 4-vCPU runner is too aggressive.

2. Several test configs have fragile default timeouts. Six package vitest configs don't extend vitest.shared.ts (which provides 50s timeout), so they silently use Vitest's default 5000ms. One fixture uses a hardcoded 15000ms. These were pre-existing bugs that just didn't manifest with serial execution.

Changes

Concurrency limits

Fixtures: --concurrency=4 (was unlimited → 4 concurrent turbo tasks)
Packages: --concurrency=3 (was unlimited → 3 concurrent turbo tasks)

Timeout fixes (6 vitest configs + 1 fixture)

Added testTimeout: 50_000 to match the repo standard from vitest.shared.ts:

packages/workers-shared/asset-worker/vitest.config.mts
packages/workers-shared/router-worker/vitest.config.mts
packages/vite-plugin-cloudflare/vitest.config.ts
packages/edge-preview-authenticated-proxy/vitest.config.mts
packages/kv-asset-handler/vitest.config.mts
packages/pages-shared/vitest.config.mts
fixtures/start-worker-node-test/package.json (15s → 50s)

Validation

Ran both test suites locally 3 times each with the new concurrency limits:

Suite	Run 1	Run 2	Run 3
Fixtures (--concurrency=4)	76/76, 1m50s	76/76, 1m55s	76/76, 1m42s
Packages (--concurrency=3)	34/34, 2m08s	34/34, 3m29s	34/34, 2m06s

Expected CI Times

Job	Before (serial)	After (capped parallel)
Linux fixtures	~10 min	~3-4 min
Linux packages	~9.5 min	~5-6 min
Windows packages	~20 min	~10-12 min

Tests
- Tests included/updated
- Automated tests not possible - manual testing has been completed as follows:
- Additional testing not necessary because:
Public documentation
- Cloudflare docs PR(s):
- Documentation not necessary because: CI/test config changes, no user-facing impact

Follow-up to #13596 and #13601 which removed --concurrency=1 entirely. Unlimited concurrency caused CPU starvation on 4-vCPU CI runners when 20+ workerd-spawning fixtures or 3+ heavyweight package suites ran simultaneously, leading to test timeouts. Changes: - Cap fixture concurrency at 4 (was unlimited, previously 1) - Cap package concurrency at 3 (was unlimited, previously 1) - Add testTimeout: 50_000 to 6 vitest configs that were using Vitest's default 5000ms instead of the repo standard 50s from vitest.shared.ts: workers-shared/asset-worker, workers-shared/router-worker, vite-plugin-cloudflare, edge-preview-authenticated-proxy, kv-asset-handler, pages-shared - Increase start-worker-node-test timeout from 15s to 50s The timeout fixes address pre-existing fragility - these configs never extended vitest.shared.ts and relied on Vitest's 5s default, which is insufficient under any CPU load.

changeset-bot · 2026-04-20T07:50:45Z

🦋 Changeset detected

Latest commit: 1b58095

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

workers-devprod · 2026-04-20T07:51:04Z

Codeowners approval required for this PR:

@cloudflare/deploy-config
✅ @cloudflare/wrangler

Show detailed file reviewers

packages/workers-shared/asset-worker/vitest.config.mts: [@cloudflare/deploy-config]
packages/workers-shared/router-worker/vitest.config.mts: [@cloudflare/deploy-config]

ask-bonk · 2026-04-20T07:51:09Z

UnknownError: ProviderInitError

github run

ask-bonk · 2026-04-20T07:51:11Z

@petebacondarwin Bonk workflow failed. Check the logs for details.

View workflow run · To retry, trigger Bonk again.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

pkg-pr-new · 2026-04-20T07:55:57Z

create-cloudflare

npm i https://pkg.pr.new/create-cloudflare@13604

@cloudflare/kv-asset-handler

npm i https://pkg.pr.new/@cloudflare/kv-asset-handler@13604

miniflare

npm i https://pkg.pr.new/miniflare@13604

@cloudflare/pages-shared

npm i https://pkg.pr.new/@cloudflare/pages-shared@13604

@cloudflare/unenv-preset

npm i https://pkg.pr.new/@cloudflare/unenv-preset@13604

@cloudflare/vite-plugin

npm i https://pkg.pr.new/@cloudflare/vite-plugin@13604

@cloudflare/vitest-pool-workers

npm i https://pkg.pr.new/@cloudflare/vitest-pool-workers@13604

@cloudflare/workers-editor-shared

npm i https://pkg.pr.new/@cloudflare/workers-editor-shared@13604

wrangler

npm i https://pkg.pr.new/wrangler@13604

commit: 1b58095

Windows CI runners are significantly slower under parallel load. With --concurrency=4, start-worker-node timed out at 50s on Windows despite normally taking ~1.5s. Lowering to 2 avoids CPU starvation on the slowest platform while still providing ~2x speedup over serial.

The start-worker-node fixture's worker.dispose() cleanup hangs under any parallel load on Windows, causing the file-level node:test timeout to fire (50s) even though all 5 individual tests pass in <1s. This is a Windows-specific issue with workerd process cleanup under CPU pressure. Linux and macOS provide sufficient coverage for the startWorker API.

…13603) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Wrangler automated PR updater <wrangler@cloudflare.com>

dario-piotrowicz

Codeowners Bypass

workers-devprod

Codeowners reviews satisfied

Revert the Windows-only skip — the test also fails on Linux under parallel load. The issue is that worker.dispose() cleanup hangs when any other fixture runs concurrently, causing node:test's file-level timeout to fire even though all 5 individual tests complete in <1s. Run it as a separate step after the parallel fixtures complete, so it gets the full CPU without contention.

… fixed The previous PR #13604 worked around an intermittent flake in the `start-worker-node-test` fixture by: 1. Running the fixture as its own CI step, serialised from all other fixtures (because parallel load made the cleanup hang more likely). 2. Bumping the `node --test` file-level timeout from 15s to 50s so that node:test would sometimes let the subprocess finish before cancelling it. Neither addressed the underlying cause. The preceding commit fixes the actual esbuild resource leak in `unstable_startWorker` teardown, so the workaround can go: - Re-include `./fixtures/start-worker-node-test` in the main fixtures test run (restores parallelism). - Delete the separate `Run tests (start-worker-node)` step. - Drop the node:test timeout back to 15s.

petebacondarwin added the ci:no-changeset-required Skip pull request checks for a valid changeset label Apr 20, 2026

petebacondarwin requested a review from workers-devprod as a code owner April 20, 2026 07:50

workers-devprod added this to workers-sdk Apr 20, 2026

github-project-automation Bot moved this to Untriaged in workers-sdk Apr 20, 2026

workers-devprod requested review from a team and dario-piotrowicz and removed request for a team April 20, 2026 07:51

devin-ai-integration Bot reviewed Apr 20, 2026

View reviewed changes

dario-piotrowicz approved these changes Apr 20, 2026

View reviewed changes

petebacondarwin and others added 2 commits April 20, 2026 09:33

dario-piotrowicz approved these changes Apr 20, 2026

View reviewed changes

workers-devprod approved these changes Apr 20, 2026

View reviewed changes

github-project-automation Bot moved this from Untriaged to Approved in workers-sdk Apr 20, 2026

petebacondarwin merged commit d8314c6 into main Apr 20, 2026
59 of 61 checks passed

petebacondarwin deleted the pbacondarwin/fix-parallel-ci-timeouts branch April 20, 2026 09:38

github-project-automation Bot moved this from Approved to Done in workers-sdk Apr 20, 2026

workers-devprod mentioned this pull request Apr 20, 2026

Version Packages #13511

Merged

petebacondarwin mentioned this pull request Apr 24, 2026

[wrangler] Fix start-worker-node (Windows fixtures) and tail (Linux packages) CI flakes #13662

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ci] Cap parallel test concurrency and fix fragile test timeouts#13604

[ci] Cap parallel test concurrency and fix fragile test timeouts#13604
petebacondarwin merged 5 commits into
mainfrom
pbacondarwin/fix-parallel-ci-timeouts

petebacondarwin commented Apr 20, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

changeset-bot Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

workers-devprod commented Apr 20, 2026 •

edited

Loading

Uh oh!

ask-bonk Bot commented Apr 20, 2026

Uh oh!

ask-bonk Bot commented Apr 20, 2026

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

pkg-pr-new Bot commented Apr 20, 2026 •

edited

Loading

Uh oh!

dario-piotrowicz left a comment

Uh oh!

workers-devprod left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

petebacondarwin commented Apr 20, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Root Causes

Changes

Concurrency limits

Timeout fixes (6 vitest configs + 1 fixture)

Validation

Expected CI Times

Uh oh!

changeset-bot Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

workers-devprod commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ask-bonk Bot commented Apr 20, 2026

Uh oh!

ask-bonk Bot commented Apr 20, 2026

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

pkg-pr-new Bot commented Apr 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dario-piotrowicz left a comment

Choose a reason for hiding this comment

Uh oh!

workers-devprod left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

petebacondarwin commented Apr 20, 2026 •

edited by devin-ai-integration Bot

Loading

changeset-bot Bot commented Apr 20, 2026 •

edited

Loading

workers-devprod commented Apr 20, 2026 •

edited

Loading

pkg-pr-new Bot commented Apr 20, 2026 •

edited

Loading