sql: DDL has become excessively sensitive to txn retries #35549
Description
For context, I took a 2-3 week hiatus from coding, which makes me more sensitive to "larger scale" differences in the health of our test suite between, say, a month ago and today.
The truth of the matter is I am seeing a bunch of new test flakes that all point in the direction of SQL DDL suffering from transaction retries in ways that did not exist previously.
The reader of this issue description should understand that we have many, many SQL tests that assume that the allocation of object IDs (tables, sequences, views, databases) is deterministic and not subject to retries when a single client is connected to the entire (multi-node) cluster.
This assumption is currently massively violated.
Some symptoms, which many of you may recognize:
- the allocated IDs are occasionally larger than expected, which indicates the DDL was unexpectedly retried
- the test fails outright with a txn retry error
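To illustrate the first symptom, here is a minimal sketch of how a retried DDL can consume an extra ID. It assumes that ID allocation is a counter increment that is not rolled back when the enclosing transaction restarts; that is an assumption made for illustration, not something this issue confirms. All names (`idAllocator`, `runTxn`, `createTable`, `errRetry`) and the starting ID are hypothetical and are not CockroachDB APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// errRetry is a hypothetical stand-in for a transaction retry error.
var errRetry = errors.New("restart transaction")

// idAllocator is a toy counter. It hands out IDs outside of any
// transaction, so an ID consumed by a failed attempt is never given back
// (assumption for this sketch).
type idAllocator struct{ next int }

func (a *idAllocator) alloc() int {
	id := a.next
	a.next++
	return id
}

// runTxn re-runs fn until it stops returning errRetry.
func runTxn(fn func(attempt int) error) error {
	for attempt := 0; ; attempt++ {
		err := fn(attempt)
		if errors.Is(err, errRetry) {
			continue
		}
		return err
	}
}

// createTable simulates a DDL statement that allocates an ID on every
// attempt and is forced to retry forcedRetries times before committing.
func createTable(a *idAllocator, forcedRetries int) (int, error) {
	var id int
	err := runTxn(func(attempt int) error {
		id = a.alloc()
		if attempt < forcedRetries {
			return errRetry
		}
		return nil
	})
	return id, err
}

func main() {
	a := &idAllocator{next: 53} // arbitrary starting ID
	first, _ := createTable(a, 0)
	second, _ := createTable(a, 1) // one unexpected retry
	fmt.Println(first, second)     // a test expecting "53 54" sees "53 55"
}
```

Under these assumptions, a test that hard-codes the second ID fails whenever a retry occurs, even though the DDL itself succeeded.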
I will refrain from offering an opinion on whether the retries are desirable/acceptable. However, I'd like to point out that if we keep the current behavior, we need to audit and rewrite a very large number of tests throughout SQL, and this task is thoroughly unwelcome so late in the release cycle.
@bdarnell @petermattis please advise.