Skip to content

sql: hazardous use of descriptors in BIND outside of explicit transaction #64140

@ajwerner

Description

@ajwerner

Describe the problem

In SQL you can bind arguments to a prepared statement outside of a transaction. This generally happens in the binary wire protocol if I understand correctly. The client will prepare a statement and then it will bind the arguments to an unnamed portal and then it will execute the portal. The prepare is safe and fine because we'll re-acquire leases inside the implicit transaction when checking if the plan is stale. The bind, however, has a hazardous and unsafe properties.

When the type of argument to a bind is either a user-defined type or refers to an OID type by name, we resolve the type and value during while doing the BIND. This is problematic because it will use the transaction currently sitting on the connExecutor. This is generally the previously committed transaction (I suspect there may be cases where this is nil but I don't know). This happens here (with the regclass/regtype resolution happening underneath tree.ParseDOid):

typ, ok := types.OidToType[t]
if !ok {
var err error
typ, err = ex.planner.ResolveTypeByOID(ctx, t)
if err != nil {
return nil, err
}
}
d, err := pgwirebase.DecodeDatum(
ex.planner.EvalContext(),
typ,
qArgFormatCodes[i],
arg,
)

At that point we'll attach a deadline due to leasing to an already committed transaction. If we attach the following sane invariant to the deadline code we get failures right and left:

--- a/pkg/kv/txn.go
+++ b/pkg/kv/txn.go
@@ -695,6 +695,9 @@ func (txn *Txn) UpdateDeadlineMaybe(ctx context.Context, deadline hlc.Timestamp)
 
        txn.mu.Lock()
        defer txn.mu.Unlock()
+       if txn.status().IsFinalized() {
+               panic(errors.WithContextTags(errors.AssertionFailedf("UpdateDeadlineMaybe() called on finalized txn"), ctx))
+       }

Worse yet are cases where we cannot use a cached, leased descriptor, like a number of system tables. For example, the following test fails rather surprisingly with this error:

func TestRegclassBind(t *testing.T) {
	defer leaktest.AfterTest(t)()
	defer log.Scope(t).Close(t)

	ctx := context.Background()
	tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{})
	defer tc.Stopper().Stop(ctx)
	_, err := tc.ServerConn(0).Exec("SELECT $1::REGCLASS::INT", "system.descriptor")
	require.NoError(t, err)
}
    system_table_test.go:48: 
        	Error Trace:	system_table_test.go:48
        	Error:      	Received unexpected error:
        	            	pq: error in argument for $1: TransactionStatusError: client already committed or rolled back the transaction. Trying to execute: 1 Get (REASON_UNKNOWN)
        	Test:       	TestRegclassBind

Possible Solution

It seems to me like we should move the conn executor into an open state when binding (or when creating a portal?). It feels weird to me to have a bound portal outside of a transaction. I may not know enough to know though. We do this already for example when processing an ExecCmd.

Additional context

There are some edge cases where I think we may miss renames or interpret arguments as the wrong type. It's more just a general soundness problem.

Jira issue: CRDB-6916

Metadata

Metadata

Assignees

Labels

A-sql-executorSQL txn logicC-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.T-sql-foundationsSQL Foundations Team (formerly SQL Schema + SQL Sessions)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions