-
Notifications
You must be signed in to change notification settings - Fork 4.1k
Treating statements with differently ordered columns as different fingerprints may cause OOMs and other issues with persisted stats #77963
Description
This was raised by this support ticket. Turning off the sql.metrics.statement_details.plan_collection.enabled cluster settings should fix the customer's immediate issue, but this issue describes the underlying issue with our code. Further details including those private to the customer can be found in that support issue, but the gist of the underlying problem is as follows.
We suspect that the OOM was caused by different statement fingerprints being generated for statements with different column orderings. We generate an explain plan for every statement fingerprint, causing explain plan generation to be run for almost every statement. In the customer's database one node had ~17k statement fingerprints eacb with an execution count of 1, that look mostly the same but had different column name orderings in their various clauses of the giant statement.
The customer is using a postgres driver with no ORM, and I suspect that they are perhaps string interpolating an unordered set (as opposed to a deterministically-ordered list) of column names into their statement. Users who write their statements in a similar way would incur a performance cost and degraded persisted stats experience, though after 21.2 they should be less likely to OOM.
One possible way to solve this might be to treat differently ordered columns as the same fingerprint.
Another possible contributor to the OOM might be #77944, though I think that cause is less likely. Some discussion for this issue might be in that issue as well.
Jira issue: CRDB-13875
Epic CRDB-20388