sql: fix sql compaction job full scan#108947
sql: fix sql compaction job full scan#108947craig[bot] merged 1 commit intocockroachdb:masterfrom j82w:108936
Conversation
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
The sql-stats-compaction is failing with TransactionRetryError. This is caused by the internal executor uses the zero-values for the settings, rather than the cluster defaults. This causes `SET reorder_joins_limit = 0;` which then causes the `sql-stats-compaction` delete statement to do a full scan. The full scan is causing the query to take a long time causing other queries to conflict with it. Error: `TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: committed value on key` The fix to use the correct default value instead of 0 is made in #101486. Solution is to change the query to avoid the join and thus the full scan. Fixes: #108936 Release note (sql change): Optimized the sql-stats-compaction job delete query to avoid full scan. This helps avoid the TransactionRetryError which can cause the job to fail.
maryliag
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @yuzefovich)
|
Hi @j82w and @DrewKimball , out of curiosity, is this what's happening?
If so, do we know who is write A? |
Swap 3 and 4, and yes. The write should be new statement statistics which are avoided by the constrained scan, but not by the full scan. |
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed all commit messages.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @j82w)
pkg/sql/sqlstats/persistedsqlstats/compaction_exec.go line 287 at r2 (raw file):
unconstrainedDeleteStmt: ` DELETE FROM system.statement_statistics WHERE (aggregated_ts, fingerprint_id, transaction_fingerprint_id, plan_hash, app_name, node_id) IN (
Any idea why we had this self-join in the first place? On a quick glance the queries are equivalent.
|
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Not sure. It was added in the following PR, but I don't see a reason for it. |
|
@DrewKimball can a SELECT that reads a row while the DELETE transaction is in progress also cause the DELETE to fail with 40001 or is it only if the row is modified? |
|
Previously, j82w (Jake) wrote…
The queries should be identical. We are just avoiding the overhead of the join that is causing a scan. |
|
bors r+ |
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 91f452f to blathers/backport-release-23.1-108947: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from 91f452f to blathers/backport-release-23.1.9-rc-108947: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.9-rc failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
There has to be a concurrent update. |
|
blathers backport 22.2 |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 91f452f to blathers/backport-release-22.2-108947: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2 failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
The sql-stats-compaction is failing with TransactionRetryError. This is caused by the internal executor uses the zero-values for the settings, rather than the cluster defaults. This causes
SET reorder_joins_limit = 0;which then causes thesql-stats-compactiondelete statement to do a full scan. The full scan is causing the query to take a long time causing other queries to conflict with it.Error:
TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: committed value on keyThe fix to use the correct default value instead of 0 is made in #101486.
Solution is to change the query to avoid the join and thus the full scan.
Fixes: #108936
Release note (sql change): Optimized the sql-stats-compaction job delete query to avoid full scan. This helps avoid the TransactionRetryError which can cause the job to fail.