Skip to content

sql: deadlock due to planning query for SHOW TABLE #46447

@ajwerner

Description

@ajwerner

When planning a query we use a different transaction. We do this, at least in part, to ensure that reads of the schema performed during planning are cacheable. See here:

// Preparing needs a transaction because it needs to retrieve db/table
// descriptors for type checking.
// TODO(andrei): Needing a transaction for preparing seems non-sensical, as
// the prepared statement outlives the txn. I hope that it's not used for
// anything other than getting a timestamp.
txn := kv.NewTxn(ctx, ex.server.cfg.DB, ex.server.cfg.NodeID.Get())
ex.statsCollector.reset(&ex.server.sqlStats, ex.appStats, &ex.phaseTimes)
p := &ex.planner
ex.resetPlanner(ctx, p, txn, ex.server.cfg.Clock.PhysicalTime() /* stmtTS */)

Unfortunately, we do use this transaction for more than just a timestamp. This transaction makes its way to name resolution. An example of a problematic case is here:

dataSource, resName, err := d.catalog.ResolveDataSource(d.ctx, flags, &tn)

Imagine that we have Txn1 and Txn2:

Txn1: BEGIN;
Txn1: ALTER TABLE foo ...
Txn2: BEGIN;
Txn2: ALTER TABLE bar ...
Txn2: ALTER TABLE foo ... // blocks on Txn1
Txn1: SHOW COLUMNS FROM bar // blocks on Txn2 in planner txn

Had the planner been using the user's transaction then deadlock detection would have kicked in and we would have been fine. The problem here is that deadlock detection has no way to associate the planner txn with Txn1.

The fix for the short term is likely to be to follow the pattern of #46384 and #46170 and make the transaction used here high priority. Unlike in those cases however, the result of this query is not necesarily going to be cached as a shared resource. In fact, if we do this, then there's a chance that we can introduce a live-lock scenario where concurrent SHOW COLUMN ... can hold off a schema change. I'm not certain that that's the worst thing. I think for now it's the right thing to do.

A better solution might be to explicitly relate the planner txn to Tx1 for deadlock detection. Perhaps something like a read-only child transaction which carries its parent in push requests. @nvanbenschoten I think you've been having thoughts in this area.


This is fallout from #46402.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions