ci: fix TransientTransactionError flakes #16818
Merged
The CI MongoDB replica set runs with the default maxTransactionLockRequestTimeoutMillis of 5ms. Under CPU contention a transient lock hold during seed/onInit (e.g. concurrent version writes) exceeds 5ms, so the transaction fails with a TransientTransactionError (LockTimeout, code 24) instead of waiting. That surfaces as payloadInitError and crashes onInit, which fails the entire suite (every test and retry times out against a dead server). Raise the timeout to 100ms on both long-running mongod starts to absorb transient contention while still failing fast on genuine deadlocks.
Overview
The CI MongoDB replica set runs with the default `maxTransactionLockRequestTimeoutMillis` of 5ms. Under CPU contention, a transient lock hold during seed/`onInit` (for example, concurrent version writes) pushes a transaction past 5ms, so it fails with a `TransientTransactionError` (`LockTimeout`, code 24) instead of waiting. That crashes `onInit` and takes down the whole E2E suite: every test and retry then times out against a dead server.

This raises the lock timeout so transient contention is absorbed.
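As a rough illustration of the change, the sketch below assembles a `mongod` command line carrying the raised timeout. Only the `--setParameter maxTransactionLockRequestTimeoutMillis=100` flag comes from this PR; the `buildMongodArgs` helper, the `rs0` replica set name, and the `--bind_ip_all` flag are assumptions made for the example, and the real entrypoint is a docker-compose script rather than TypeScript.

```typescript
// Value chosen in this PR; the MongoDB default is 5ms.
const CI_LOCK_TIMEOUT_MS = 100

interface MongodOptions {
  replSet: string // hypothetical replica set name
  lockTimeoutMs?: number
}

// Hypothetical helper: builds the argument list for a long-running mongod start.
function buildMongodArgs({
  replSet,
  lockTimeoutMs = CI_LOCK_TIMEOUT_MS,
}: MongodOptions): string[] {
  if (!Number.isInteger(lockTimeoutMs)) {
    throw new RangeError(`lockTimeoutMs must be an integer, got ${lockTimeoutMs}`)
  }
  return [
    'mongod',
    '--replSet',
    replSet,
    '--bind_ip_all',
    // The only part taken from the PR: raise the transaction lock timeout.
    '--setParameter',
    `maxTransactionLockRequestTimeoutMillis=${lockTimeoutMs}`,
  ]
}

// Print the resulting command line for inspection.
console.log(buildMongodArgs({ replSet: 'rs0' }).join(' '))
```

Keeping the timeout as a single named constant makes it easy to apply the same value to every long-running `mongod` start.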
Key Changes
- Add `--setParameter maxTransactionLockRequestTimeoutMillis=100` to both long-running `mongod` starts in the MongoDB docker-compose entrypoint (the already-initialized path and the first-run start-with-auth path).
- The `mongod` used only to initialize the replica set and users is left unchanged, since it is killed before the application connects and runs no application transactions.

Design Decisions
- Retrying `TransientTransactionError` in the seed/adapter layer is the alternative, but it carries more risk and changes runtime behavior for all consumers.
- [...] the `LockTimeout`-at-`onInit` signature over future runs, so confidence accrues over time rather than from a single red-to-green flip.

Overall Flow
sequenceDiagram
    participant Init as Payload onInit (seed)
    participant Mongo as MongoDB (CI replica set)
    Note over Init,Mongo: Before
    Init->>Mongo: insertOne into _versions (in transaction)
    Mongo-->>Init: IX lock busy, give up after 5ms (TransientTransactionError)
    Init--xInit: onInit crashes, entire suite fails
    Note over Init,Mongo: After
    Init->>Mongo: insertOne into _versions (in transaction)
    Mongo-->>Init: lock acquired within 100ms window
    Init->>Init: seed completes, suite runs

References / Links
maxTransactionLockRequestTimeoutMillis
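
For contrast, the retry alternative mentioned under Design Decisions, which this PR does not adopt, could look roughly like the sketch below. The `retryTransient` helper, its attempt count, and its delay are hypothetical. The `hasErrorLabel` check mirrors the method exposed by the MongoDB Node.js driver's error class, modeled here as a plain interface so the example runs without a database.

```typescript
// Minimal shape of the errors of interest; modeled as an interface so the
// sketch does not depend on the mongodb package.
interface LabeledError {
  hasErrorLabel?: (label: string) => boolean
}

// True when the error carries the TransientTransactionError label.
function isTransientTransactionError(err: unknown): boolean {
  const candidate = err as LabeledError | null
  return (
    typeof candidate?.hasErrorLabel === 'function' &&
    candidate.hasErrorLabel('TransientTransactionError')
  )
}

// Hypothetical helper: re-runs `operation` while it fails with a transient
// transaction error, up to `maxAttempts` total attempts.
async function retryTransient<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 50,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation()
    } catch (err) {
      // Give up on non-transient errors or once the attempt budget is spent.
      if (attempt >= maxAttempts || !isTransientTransactionError(err)) throw err
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs))
    }
  }
}
```

A wrapper like this would have to sit around every transactional call path, which is one way to read the concern that retrying changes runtime behavior for all consumers, whereas the server parameter is confined to the CI database.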