ci(e2e): switch to persistent stop/start model #487

Merged
DorianZheng merged 1 commit into main from ci/persistent-runner on May 5, 2026

Conversation

@DorianZheng

Member

Summary

Replace ephemeral create/terminate with persistent stop/start for cost efficiency.

Changes

  • Instance created once, never terminated — build caches persist between runs
  • Workflow starts stopped instance → runs tests → stops it
  • Single global concurrency group (e2e-runner) prevents two jobs using the instance simultaneously
  • Removed user-data provisioning (instance pre-configured with runner as systemd service)
  • Dropped PR label trigger (run on push-to-main + manual only)
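
The start → test → stop flow listed above could look roughly like the workflow sketch below. This is not the merged workflow file: the job names `start-runner` and `e2e`, the `make test-e2e` command, and reading the instance ID through `vars.EC2_E2E_INSTANCE_ID` as a repository variable are assumptions made for illustration, and AWS credential setup is left out.

```yaml
# Sketch only: job names, the test command, and credential handling are assumptions.
name: e2e

on:
  push:
    branches: [main]      # push-to-main trigger
  workflow_dispatch:      # manual trigger

jobs:
  start-runner:
    runs-on: ubuntu-latest
    steps:
      # AWS credentials and region are assumed to be configured before this step.
      - name: Start the stopped instance
        run: |
          aws ec2 start-instances --instance-ids "${{ vars.EC2_E2E_INSTANCE_ID }}"
          aws ec2 wait instance-running --instance-ids "${{ vars.EC2_E2E_INSTANCE_ID }}"

  e2e:
    needs: start-runner
    runs-on: self-hosted   # the runner service on the persistent instance
    steps:
      - uses: actions/checkout@v4
      - name: Run E2E tests
        run: make test-e2e   # hypothetical test command

  stop-runner:
    needs: [start-runner, e2e]
    if: always()           # stop the instance even if tests fail or the run is cancelled
    runs-on: ubuntu-latest
    steps:
      - name: Stop the instance
        run: aws ec2 stop-instances --instance-ids "${{ vars.EC2_E2E_INSTANCE_ID }}"
```

One thing to watch with a persistent runner: `actions/checkout` cleans untracked files by default (`clean: true`), so build output kept inside the workspace would need `clean: false` or a cache directory outside the checkout to survive between runs.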

Cost improvement

| | Before (ephemeral) | After (persistent) |
|---|---|---|
| Per-run compute | ~$0.09 (cold build every time) | ~$0.03 (cached, incremental) |
| Monthly storage | $0 | ~$4 (50 GB EBS) |
| Compile time | ~15 min (from scratch) | ~2 min (incremental) |
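
As a rough check on when the persistent model pays for itself in dollars, the per-run saving can be set against the fixed storage cost. This takes the approximate figures above at face value and assumes a stopped instance incurs no charge other than the EBS volume; $n$ (runs per month) is a symbol introduced here.

$$
n_{\text{break-even}} = \frac{\$4/\text{month}}{\$0.09 - \$0.03} = \frac{4}{0.06} \approx 67 \text{ runs per month}
$$

Below roughly 67 runs a month the persistent instance costs slightly more in total, though each run still saves about 13 minutes of compile time (15 min down to 2 min).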

Concurrency safety

  • concurrency.group: e2e-runner — single global lock
  • cancel-in-progress: true — newer push cancels older run
  • Only one job can use the instance at any time
  • Stop-runner runs with if: always() even on cancellation
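
The lock described above corresponds to a workflow-level `concurrency` block. A minimal fragment, using the group name from this description (its placement at the top level of the workflow file is an assumption):

```yaml
# Workflow-level fragment; the group name is taken from the PR description.
concurrency:
  group: e2e-runner          # one fixed name shared by every run, so it acts as a global lock
  cancel-in-progress: true   # a newly queued run cancels the run currently holding the group
```

Because the group is a fixed string rather than one that includes something like `${{ github.ref }}`, runs from every branch and trigger share the same lock. The `if: always()` condition evaluates to true even when a run is cancelled, which is what lets the stop job still run after a newer push cancels an older one.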

Setup required

A persistent EC2 instance with the runner pre-installed is required (setup script TBD).
The EC2_E2E_INSTANCE_ID variable must also be added.

Test plan

  • Provision persistent instance with runner
  • Set EC2_E2E_INSTANCE_ID variable
  • Trigger workflow manually
  • Verify instance starts, tests run, instance stops
  • Trigger again — verify cached build is faster

@DorianZheng force-pushed the ci/persistent-runner branch 12 times, most recently from bcfafa6 to bba7597, on May 5, 2026 14:52
Replace ephemeral create/terminate with persistent stop/start:
- Instance created once, never terminated (build cache persists)
- Workflow starts stopped instance, runs tests, stops it after
- Single global concurrency group prevents conflicts
- ~$0.03/run (cached) vs ~$0.09/run (cold build every time)
- EBS cost: ~$4/mo for 50GB volume

Concurrency safety: `group: e2e-runner` ensures only one job runs
at a time. Newer pushes cancel in-progress runs.

Requires: EC2_E2E_INSTANCE_ID variable (set by setup script)
@DorianZheng force-pushed the ci/persistent-runner branch from bba7597 to 899b4ac on May 5, 2026 14:52
@DorianZheng added the e2e-test label (Triggers E2E integration tests on self-hosted runner) on May 5, 2026
@DorianZheng merged commit af0fd27 into main on May 5, 2026
10 checks passed
@DorianZheng deleted the ci/persistent-runner branch on May 5, 2026 14:56
@DorianZheng restored the ci/persistent-runner branch on May 5, 2026 14:58