Skip to content

sql, kv: support non-default key locking with leaf txns #94290

@Nican

Description

@Nican

With #94399 we no longer choose to use the leaf txn (which would allow us to use the streamer API and parallelize local scans in some cases) if we find non-default key locking strength. This stems from the fact that we don't propagate lock spans for leaf txns, and even if we did, we'd run into another complication (quoting Nathan):

The other complication that we would have run into with this approach is that only the root txn coordinator runs a heartbeat loop to keep its locks alive. If we only acquired locks in leaf txns, we would never have started the txn heartbeat and the txn could have been aborted after 5s. We could have launched the heartbeat when merging the LeafTxnFinalState, but that still only works if we call UpdateRootWithLeafFinalState within 5s of the initial lock being acquired.

I think this again hints at our implementation of explicit row-level locking and pushing that locking directly into scans (in all cases) being a little funky (see #75457 and #57031). If we had a separate RowLock execution node that lived above a scan/filter/limit/etc. then we could run that using the RootTxn and distribute the rest of the query as much as we'd like. We're going to have to do part of that work for read-committed.

We should figure out how to lift this restriction.

Original issue description

Edit from @jordanlewis: See #94290 (comment) for a trivial repro

To work around this problem in 22.2.0 and 22.2.1, run the following

SET CLUSTER SETTING sql.distsql.use_streamer.enabled = false;

Describe the problem

It looks like v22.2.1 has some regression with FOR UPDATE. It looks like that a lock is being held on the row for 5 seconds after the transaction finishes. Several of my e2e tests are now failing due to timeout. I drilled into one the tests.

Doing explain analyze on the select query after the transaction finishes shows cumulative time spent in KV: 4.9s and cumulative time spent due to contention: 4.9s. Find attached also a debug zip from explain ANALYZE(debug) from the query.

Removing for update fixes the issue, OR reverting to v22.1.8 fixes the issue.

To Reproduce

The test on my codebase fails pretty consistently. From what I can tell, the sequence of events:

  1. Transaction starts.
  2. Transaction runs select * from tbl where id = 1 for update
  3. The row in tbl is NOT updated inside of the transaction, but other data in unrelated tables is updated/inserted.
  4. Transaction completes successfully.
  5. (only run after transaction is complete) select * from tbl now takes 5 seconds to run.

Expected behavior
Running SELECT * FROM tbl; when there is no other workload on the database to take a few milliseconds to run.

Additional data / screenshots
stmt-bundle-825387661464305665.zip

Environment:

  • CockroachDB version v22.2.1
  • Server OS: Linux on docker
  • Client app: Sequelize

Additional context
E2E (including clicking on buttons) Tests are failing due to timeout.

Jira issue: CRDB-22800

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.O-communityOriginated from the communityT-kvKV TeamT-sql-queriesSQL Queries TeamX-blathers-triagedblathers was able to find an ownerbranch-release-22.2Used to mark GA and release blockers, technical advisories, and bugs for 22.2

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions