
workload/tpcc: support Read Committed isolation #113834

Merged
craig[bot] merged 4 commits into cockroachdb:master from nvb:nvanbenschoten/rcTPCC on Nov 15, 2023

Conversation


@nvb nvb commented Nov 4, 2023

Closes #100176.

This PR consists of a series of commits which together add support for Read Committed isolation to the TPC-C workload and then use it to add new roachtest variants.

See individual commits, including an interesting change to explicit row-level locking in TPC-C transactions to avoid concurrency anomalies.

Release note: None

@nvb nvb requested a review from michae2 November 4, 2023 20:58
@nvb nvb requested a review from a team as a code owner November 4, 2023 20:58
@nvb nvb requested review from herkolategan and srosenberg and removed request for a team November 4, 2023 20:58


nvb commented Nov 5, 2023

I tested out both of the new roachtest variants and both passed, running without error and passing the post-run consistency checks. This was not the case before the commit here that added SELECT FOR UPDATE in two places to avoid data anomalies.

Here are some basic performance results, comparing Serializable and Read Committed on TPC-C:

tpcc-nowait/nodes=3/w=1

(Single warehouse, highly contended, lightly loaded)

Serializable

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           2571            4.3    141.5    125.8    243.3    302.0    604.0  delivery

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0          25742           42.9    103.4     88.1    209.7    318.8    805.3  newOrder

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           2574            4.3     66.9     56.6    167.8    226.5    453.0  orderStatus

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0          25738           42.9    101.3     83.9    218.1    318.8    738.2  payment

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           2576            4.3     76.7     67.1    176.2    268.4    520.1  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
  600.0s        0          59201           98.7    101.4     83.9    218.1    318.8    805.3

Read Committed

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           3049            5.1    163.5    151.0    285.2    419.4    704.6  delivery

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0          30490           50.8     81.5     79.7    113.2    134.2    260.0  newOrder

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           3049            5.1     56.2     54.5     83.9     96.5    251.7  orderStatus

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0          30502           50.8     86.9     83.9    130.0    167.8    285.2  payment

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
  600.0s        0           3048            5.1     64.5     62.9     96.5    117.4    184.5  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
  600.0s        0          70138          116.9     85.6     83.9    134.2    192.9    704.6

In this contended workload, we see an improvement in throughput across the board with Read Committed. We also see an improvement in p50 and p99 latency across the board, with the exception of the Delivery transaction. This is likely because the Delivery transaction is now running a SELECT FOR UPDATE on each of the 10 districts in a warehouse. For Read Committed, these SELECT FOR UPDATE queries synchronously acquire a replicated exclusive lock, while Serializable acquires an unreplicated exclusive lock.

tpcc/headroom/n4cpu16

(875 warehouses, little contention, ~70% loaded)

Serializable

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         132065           18.3     77.0     52.4     75.5    142.6  21474.8  delivery

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0        1321721          183.6     52.5     27.3     41.9     92.3  24696.1  newOrder

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         132225           18.4     10.6      6.6     10.5     17.8  16643.0  orderStatus

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0        1320599          183.4     19.4     15.7     25.2     50.3  19327.4  payment

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         132027           18.3     23.1     14.7     25.2     35.7  18253.6  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 7200.0s        0        3038637          422.0     36.1     22.0     48.2     75.5  24696.1

Read Committed

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         131808           18.3    122.5     75.5    121.6    302.0  23622.3  delivery

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0        1319081          183.2     79.2     31.5     56.6    142.6  22548.6  newOrder

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         131871           18.3     14.1      6.8     12.1     22.0  13421.8  orderStatus

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0        1318410          183.1     32.5     14.7     28.3     60.8  24696.1  payment

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__total
 7200.0s        0         131731           18.3     26.1     17.8     29.4     41.9  11274.3  stockLevel

_elapsed___errors_____ops(total)___ops/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)__result
 7200.0s        0        3032901          421.2     55.7     25.2     67.1    117.4  24696.1

In this uncontended workload, we see about the same throughput, but higher p50 and p99 latency. For the same reason as above (10 synchronous, replicated lock acquisitions), Delivery is about 50% slower in the p50. We could optimize this by combining the 10 SELECT FOR UPDATE statements into a single statement (we tested this 3 years ago in bccb7e9). NewOrder is also about 4ms slower in the p50 (27.3 -> 31.5), likely due to its two uses of SELECT FOR UPDATE and due to the foreign key checks that it performs during INSERT.


These results broadly match expectations. Read Committed provides more predictable performance under transaction contention. However, in an entirely uncontended workload, its use of replicated row-level locking for foreign key checks and SELECT FOR UPDATE statements comes with a latency cost compared to Serializable's optimism. The Read Committed RFC went into some detail about how we could one day pipeline these lock acquisitions (like we do for writes) to hide the latency penalty, but we're not there yet. In the meantime, workloads should avoid sequential SELECT FOR UPDATE statements, opting instead for a single, multi-row SELECT FOR UPDATE statement.

We should dig into all of this and more. For example, we can use the tpcc-nowait/nodes=3/w=1 test for an analysis of transaction retries.

@nvb nvb force-pushed the nvanbenschoten/rcTPCC branch from cdbce1a to 43eaef5 on November 5, 2023 00:34
@nvb nvb requested a review from a team November 5, 2023 00:34
@nvb nvb requested a review from a team as a code owner November 5, 2023 00:34

nvb commented Nov 5, 2023

Another reason why SELECT FOR UPDATE seems to be slower with Read Committed is that it performs unnecessary lookup joins in certain cases. For example, this is what the SELECT FOR UPDATE that the Delivery transaction runs 10 times looks like:

root@localhost:26257/system/tpcc> EXPLAIN SELECT no_o_id
                               -> FROM new_order
                               -> WHERE no_w_id = 2 AND no_d_id = 3
                               -> ORDER BY no_o_id ASC
                               -> LIMIT 1
                               -> FOR UPDATE;
                                        info
-------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • lookup join (semi)
  │ estimated row count: 1
  │ table: new_order@new_order_pkey
  │ equality: (no_w_id, no_d_id, no_o_id) = (no_w_id,no_d_id,no_o_id)
  │ equality cols are key
  │ locking strength: for update
  │ locking durability: guaranteed
  │
  └── • scan
        estimated row count: 1 (<0.01% of the table; stats collected 7 minutes ago)
        table: new_order@new_order_pkey
        spans: [/2/3 - /2/3]
        limit: 1

@michae2 is working on eliminating this redundant lookup join. (do we have an issue for this?)

@michae2 michae2 left a comment


:lgtm: Nice work on this! 🚀

It would be interesting to have a variant of pkg/sql/opt/xform/testdata/external/tpcc which shows plans under read committed isolation (they would be different for these two SFU statements).

@michae2 is working on eliminating this redundant lookup join. (do we have an issue for this?)

Issue coming, here is the draft PR: #114401

Reviewed 1 of 1 files at r1, 6 of 6 files at r2, 8 of 8 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @srosenberg)

nvb added 4 commits November 14, 2023 22:42
Addresses a TODO.

This query has been supported by the optimizer since at least 526b4e5,
which is when it was rewritten in the optimizer's test suite.

Epic: None
Release note: None
Informs cockroachdb#100176.

This commit adds an `--isolation-level` flag to tpcc, which controls the
isolation level to run the workload transactions under. If unset, the
workload will run with the default isolation level of the database.

Release note: None
Informs cockroachdb#100176.
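As a hedged sketch of how such a flag might map onto a session setting: `default_transaction_isolation` is a real CockroachDB/PostgreSQL session variable, but this helper and its accepted spellings are assumptions for illustration, not the workload's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// isolationSetStmt maps an --isolation-level flag value to a SET statement
// for the session. An empty value means "leave the database default alone",
// matching the flag's documented behavior. Illustrative only.
func isolationSetStmt(level string) (string, error) {
	if level == "" {
		return "", nil // unset: run under the database's default isolation
	}
	switch strings.ToLower(level) {
	case "serializable", "read committed", "read_committed":
		return fmt.Sprintf("SET default_transaction_isolation = '%s'",
			strings.ReplaceAll(strings.ToLower(level), "_", " ")), nil
	default:
		return "", fmt.Errorf("unknown isolation level: %q", level)
	}
}

func main() {
	stmt, _ := isolationSetStmt("read_committed")
	fmt.Println(stmt) // prints: SET default_transaction_isolation = 'read committed'
}
```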

This commit adds SELECT FOR UPDATE locking in two places to ensure that
the workload avoids anomalies when run under Read Committed isolation.

The first of these is in the NewOrder transaction, when querying the
"stock" table in preparation for updating quantities and order counts
for the items in an order. There are no consistency checks which fail
without this, but the locking is present in benchbase (https://github.com/cmu-db/benchbase/blob/546afa60dae4f8a6b00b84b77c77ff7684e494ad/src/main/java/com/oltpbenchmark/benchmarks/tpcc/procedures/NewOrder.java#L88)
and makes sense to do.

The second of these is in the Delivery transaction, when querying the
"new_order" table to select an order to deliver. The order selected is
processed by the transaction, including updating counters in the
corresponding "customer" row, so it's important to have full isolation.
Without this, consistency checks `3.3.2.10` and `3.3.2.12` (`workload
check tpcc --expensive-checks`) do fail, presumably because a customer's
row is updated twice for a single order.

This use of SELECT FOR UPDATE in the Delivery transaction is an
alternative to a patch like 36709df, which would probably be more
efficient than the approach we have here, but would not exercise the
database in an interesting way. We opt to use SELECT FOR UPDATE.

Release note: None
Closes cockroachdb#100176.

This commit adds the following two roachtest variants:
```
tpcc-nowait/isolation-level=read-committed/nodes=3/w=1
tpcc/headroom/isolation-level=read-committed/n4cpu16
```

It also ensures that the `tpcc-nowait` test runs the full set of expensive
consistency checks at the end. The "nowait" variant runs a more heavily
contended version of tpcc, but with few warehouses, so the checks should
still be fast.

Release note: None
@nvb nvb force-pushed the nvanbenschoten/rcTPCC branch from 43eaef5 to b9bdc43 on November 15, 2023 03:47

nvb commented Nov 15, 2023

TFTR!

bors r+

It would be interesting to have a variant of pkg/sql/opt/xform/testdata/external/tpcc which shows plans under read committed isolation (they would be different for these two SFU statements).

Agreed. Wherever there are differences in how components handle isolation levels, it's interesting to split them out in tests.

Issue coming, here is the draft PR: #114401

Thanks! Does the query in #113834 (comment) qualify for the rules added in that PR?


craig bot commented Nov 15, 2023

Build succeeded:

@craig craig bot merged commit e724061 into cockroachdb:master Nov 15, 2023

michae2 commented Nov 16, 2023

(do we have an issue for this?)

#114566

Agreed. Wherever there are differences in how components handle isolation levels, it's interesting to split them out in tests.

Added this in another commit to the PR. It doesn't show the optimization (which occurs in execbuilder) but does show differences from Serializable.

Does the query in #113834 (comment) qualify for the rules added in that PR?

Yes, both SFU queries qualify:

demo@127.0.0.1:26257/demoapp/tpcc> EXPLAIN SELECT s_quantity, s_ytd, s_order_cnt, s_remote_cnt, s_data, s_dist_05
                                -> FROM stock
                                -> WHERE (s_i_id, s_w_id) IN ((1000, 4), (900, 4), (1100, 4), (1500, 4), (1400, 4))
                                -> ORDER BY s_i_id
                                -> FOR UPDATE;
                                                info
-----------------------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • scan
    estimated row count: 0
    table: stock@stock_pkey
    spans: [/4/900 - /4/900] [/4/1000 - /4/1000] [/4/1100 - /4/1100] [/4/1400 - /4/1400] … (1 more)
    locking strength: for update
    locking durability: guaranteed
(9 rows)

Time: 1ms total (execution 1ms / network 0ms)

demo@127.0.0.1:26257/demoapp/tpcc> EXPLAIN SELECT no_o_id
                                -> FROM new_order
                                -> WHERE no_w_id = 10 AND no_d_id = 100
                                -> ORDER BY no_o_id ASC
                                -> LIMIT 1
                                -> FOR UPDATE;
                info
-------------------------------------
  distribution: local
  vectorized: true

  • scan
    estimated row count: 0
    table: new_order@new_order_pkey
    spans: [/10/100 - /10/100]
    limit: 1
    locking strength: for update
    locking durability: guaranteed
(10 rows)

Time: 2ms total (execution 2ms / network 0ms)

@nvb nvb deleted the nvanbenschoten/rcTPCC branch November 26, 2023 23:42

nvb commented Jan 12, 2024

blathers backport 23.2



Development

Successfully merging this pull request may close these issues.

roachtest: support tpc-c under read committed
