-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: hazardous use of descriptors in BIND outside of explicit transaction #64140
Description
Describe the problem
In SQL you can bind arguments to a prepared statement outside of a transaction. This generally happens in the binary wire protocol if I understand correctly. The client will prepare a statement and then it will bind the arguments to an unnamed portal and then it will execute the portal. The prepare is safe and fine because we'll re-acquire leases inside the implicit transaction when checking if the plan is stale. The bind, however, has a hazardous and unsafe properties.
When the type of argument to a bind is either a user-defined type or refers to an OID type by name, we resolve the type and value during while doing the BIND. This is problematic because it will use the transaction currently sitting on the connExecutor. This is generally the previously committed transaction (I suspect there may be cases where this is nil but I don't know). This happens here (with the regclass/regtype resolution happening underneath tree.ParseDOid):
cockroach/pkg/sql/conn_executor_prepare.go
Lines 368 to 381 in d91205f
| typ, ok := types.OidToType[t] | |
| if !ok { | |
| var err error | |
| typ, err = ex.planner.ResolveTypeByOID(ctx, t) | |
| if err != nil { | |
| return nil, err | |
| } | |
| } | |
| d, err := pgwirebase.DecodeDatum( | |
| ex.planner.EvalContext(), | |
| typ, | |
| qArgFormatCodes[i], | |
| arg, | |
| ) |
At that point we'll attach a deadline due to leasing to an already committed transaction. If we attach the following sane invariant to the deadline code we get failures right and left:
--- a/pkg/kv/txn.go
+++ b/pkg/kv/txn.go
@@ -695,6 +695,9 @@ func (txn *Txn) UpdateDeadlineMaybe(ctx context.Context, deadline hlc.Timestamp)
txn.mu.Lock()
defer txn.mu.Unlock()
+ if txn.status().IsFinalized() {
+ panic(errors.WithContextTags(errors.AssertionFailedf("UpdateDeadlineMaybe() called on finalized txn"), ctx))
+ }Worse yet are cases where we cannot use a cached, leased descriptor, like a number of system tables. For example, the following test fails rather surprisingly with this error:
func TestRegclassBind(t *testing.T) {
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
ctx := context.Background()
tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{})
defer tc.Stopper().Stop(ctx)
_, err := tc.ServerConn(0).Exec("SELECT $1::REGCLASS::INT", "system.descriptor")
require.NoError(t, err)
} system_table_test.go:48:
Error Trace: system_table_test.go:48
Error: Received unexpected error:
pq: error in argument for $1: TransactionStatusError: client already committed or rolled back the transaction. Trying to execute: 1 Get (REASON_UNKNOWN)
Test: TestRegclassBind
Possible Solution
It seems to me like we should move the conn executor into an open state when binding (or when creating a portal?). It feels weird to me to have a bound portal outside of a transaction. I may not know enough to know though. We do this already for example when processing an ExecCmd.
Additional context
There are some edge cases where I think we may miss renames or interpret arguments as the wrong type. It's more just a general soundness problem.
Jira issue: CRDB-6916