Skip to content

roachtest: add read committed variants of ycsb#107517

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
nvb:nvanbenschoten/ycsbIsoLevel
Jul 26, 2023
Merged

roachtest: add read committed variants of ycsb#107517
craig[bot] merged 2 commits intocockroachdb:masterfrom
nvb:nvanbenschoten/ycsbIsoLevel

Conversation

@nvb
Copy link
Copy Markdown
Contributor

@nvb nvb commented Jul 25, 2023

Closes #107112.

This PR adds the following six roachtest variants:

ycsb/A/nodes=3/cpu=32/isolation-level=read-committed
ycsb/B/nodes=3/cpu=32/isolation-level=read-committed
ycsb/C/nodes=3/cpu=32/isolation-level=read-committed
ycsb/D/nodes=3/cpu=32/isolation-level=read-committed
ycsb/E/nodes=3/cpu=32/isolation-level=read-committed
ycsb/F/nodes=3/cpu=32/isolation-level=read-committed

It does so after adding an --isolation-level flag to ycsb, which controls the isolation level to run the workload transactions under. If unset, the workload will run with the default isolation level of the database.

Release note: None

Informs cockroachdb#107112.

This commit adds an `--isolation-level` flag to ycsb, which controls the
isolation level to run the workload transactions under. If unset, the
workload will run with the default isolation level of the database.

Release note: None
@nvb nvb requested a review from michae2 July 25, 2023 04:43
@nvb nvb requested a review from a team as a code owner July 25, 2023 04:43
@nvb nvb requested review from herkolategan and smg260 and removed request for a team July 25, 2023 04:43
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Jul 25, 2023

I'll kick off some iterations of each workload to compare vs. serializable.

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Jul 25, 2023

Here is the result of the comparison:

                      │ ycsb_ssi.txt │               ycsb_rc.txt               │
                      │    ops/s     │     ops/s       vs base                 │
ycsb/A/nodes=3/cpu=32   30.97k ± ∞ ¹    31.92k ± 6%          ~ (p=0.662 n=5+6)
ycsb/B/nodes=3/cpu=32   82.73k ± ∞ ¹    91.95k ±  ∞ ¹  +11.16% (p=0.016 n=5)
ycsb/C/nodes=3/cpu=32   99.24k ± ∞ ¹   103.40k ± 6%          ~ (p=0.931 n=5+6)
ycsb/D/nodes=3/cpu=32   85.23k ± ∞ ¹    96.24k ± 2%          ~ (p=0.126 n=5+6)
ycsb/E/nodes=3/cpu=32   6.386k ± ∞ ¹    7.080k ± 1%          ~ (p=0.126 n=5+6)
ycsb/F/nodes=3/cpu=32   17.17k ± ∞ ¹    30.24k ± 3%    +76.11% (p=0.004 n=5+6)

                      │ ycsb_ssi.txt │              ycsb_rc.txt              │
                      │   avg(ms)    │   avg(ms)     vs base                 │
ycsb/A/nodes=3/cpu=32    4.600 ± ∞ ¹   4.500 ± 4%          ~ (p=0.771 n=5+6)
ycsb/B/nodes=3/cpu=32    2.300 ± ∞ ¹   2.100 ±  ∞ ¹   -8.70% (p=0.048 n=5)
ycsb/C/nodes=3/cpu=32    1.900 ± ∞ ¹   1.900 ± 5%          ~ (p=0.732 n=5+6)
ycsb/D/nodes=3/cpu=32    1.700 ± ∞ ¹   1.500 ± 0%          ~ (p=0.065 n=5+6)
ycsb/E/nodes=3/cpu=32    22.60 ± ∞ ¹   20.35 ± 1%          ~ (p=0.113 n=5+6)
ycsb/F/nodes=3/cpu=32    8.400 ± ∞ ¹   4.800 ± 4%    -42.86% (n=5+6)

                      │ ycsb_ssi.txt │              ycsb_rc.txt               │
                      │   p99(ms)    │    p99(ms)     vs base                 │
ycsb/A/nodes=3/cpu=32    54.50 ± ∞ ¹   67.10 ±  9%    +23.12% (p=0.017 n=5+6)
ycsb/B/nodes=3/cpu=32    12.60 ± ∞ ¹   10.50 ±   ∞ ¹  -16.67% (p=0.008 n=5)
ycsb/C/nodes=3/cpu=32    8.100 ± ∞ ¹   7.900 ± 13%          ~ (p=0.905 n=5+6)
ycsb/D/nodes=3/cpu=32    6.800 ± ∞ ¹   6.150 ±  2%          ~ (p=0.082 n=5+6)
ycsb/E/nodes=3/cpu=32    67.10 ± ∞ ¹   62.90 ±  3%          ~ (p=0.182 n=5+6)
ycsb/F/nodes=3/cpu=32   130.00 ± ∞ ¹   54.50 ± 12%    -58.08% (p=0.004 n=5+6)

The largest difference is from ycsb F (50% read, 50% read-modify-write). That's likely due to a combination of two factors:

  1. this PR passes --read-modify-write-in-txn=false for read committed. I think I'm going to switch that to be the default even for serializable. We've seen external users of our YCSB benchmark get tripped up on this difference vs. the official benchmark.
  2. reads under RC don't block on exclusive locks. This eliminates blocking between SFU locks in the read-modify-write transaction and non-locking readers.

We also see a meaningful improvement in ycsb A (50% read, 50% update) and ycsb B (95% read, 5% update). I think this is also due to RC reads not blocking on exclusive locks. This eliminates blocking between the implicit SFU locks acquired by the updates and the non-locking readers.

nvb added a commit to nvb/cockroach that referenced this pull request Jul 25, 2023
The `read-modify-write-in-txn` flag was added to ycsb in cockroachdb#103117. This
commit changes the default of the flag from true to false. This makes
the default configuration of ycsb more closely mirror the official ycsb
implementation, to avoid confusion when external users run the workload
and compare.

We saw in cockroachdb#103117 and in cockroachdb#107517 that this improves throughput
substantially. We will expect to see the same in roachperf once this
change is merged. After merging the PR, I will add a roachperf annotation.

Epic: None
Release note: None
@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Jul 25, 2023

I think I'm going to switch that to be the default even for serializable. We've seen external users of our YCSB benchmark get tripped up on this difference vs. the official benchmark.

This change is in #107537.

craig bot pushed a commit that referenced this pull request Jul 26, 2023
105441: server: notify tenant servers of metadata changes r=yuzefovich,stevendanna a=knz

Informs (towards fixing) #98431.
Fixes #105721.
Epic: CRDB-26691

See the individual commits for details.


106325: sql/parser: parse CREATE PROCEDURE r=mgartner a=mgartner

#### sql/parser: parse CREATE PROCEDURE

`CREATE PROCEDURE` statements can now be parsed by the SQL parser.
Currently, executing `CREATE PROCEDURE` will produce an unimplemented
error.

Epic: CRDB-25388

Release note: None

#### sem/tree: rename function AST nodes

This commit renames UDF-related AST nodes with "function" in the name to
use the more general term "routine". The term "routine" is inclusive of
both UDFs and procedures (e.g., Postgres implements a `DROP ROUTINE`
statement which drops both UDFs and procedures), which is fitting
because we'll be using the same AST nodes for both `CREATE FUNCTION` and
`CREATE PROCEDURE` statements.

Release note: None

#### sem/tree: rename udf.go to create_routine.go

Release note: None


107470: server, ccl: adjust more tests to work with test tenant r=yuzefovich a=yuzefovich

Informs: #76378.
Epic: CRDB-18499.


107507:  kvserver: transfer lease when acquiring lease outside preferences r=erikgrinaker,andrewbaptist a=kvoli

First 2 commits are from #107505.

When a leaseholder is lost, any surviving replica may acquire the lease, even if it violates lease preferences. There are two main reasons for this: we need to elect a new Raft leader who will acquire the lease, which is agnostic to lease preferences, and there may not even be any surviving replicas that satisfy the lease preferences at all, so we don't want to keep the range unavailable while we try to figure this out (considering e.g. network timeouts can delay this for many seconds).

However, after acquiring a lease, we rely on the replicate queue to transfer the lease back to a replica that conforms with the preferences, which can take several minutes. In multi-region clusters, this can cause severe latency degradation if the lease is acquired in a remote region.

This PR will detect lease preference violations when a replica acquires a new lease, and eagerly enqueue it in the replicate queue for transfer (if possible). If the first process fails, the replica will be sent to purgatory and retried soon after.

Additionally, two roachtests are added for lease preference conformance timing. The first test, `.../partial-first-preference-down`, takes down one of the preferred locality nodes (1/2). The second test, `.../full-first-preference-down`, takes down both the preferred locality nodes (2/2).

Resolves #106100.
Epic: none

Release note (bug fix): When losing a leaseholder and using lease preferences, the lease can be acquired by any other replica (regardless of lease preferences) in order to restore availability as soon as possible. The new leaseholder will now immediately check if it violates the lease preferences, and attempt to transfer the lease to a replica that satisfies the preferences if possible.


107512: sql: fix crdb_internal.encode_key in some contexts r=yuzefovich a=yuzefovich

Previously, we forgot to set `CatalogBuiltins` for the internal planner which is used by `crdb_internal.encode_key` (which, in turn, is used to power `SHOW RANGE ... FOR ROW`), so evaluating it would lead to an error. For example, the internal planner is used in the backfill. This is now fixed.

Fixes: #106397.

Release note (bug fix): CockroachDB would previously return an error when using `SHOW RANGE ... FOR ROW ...` in `CREATE TABLE ... AS ...` construct, and this is now fixed.

107537: workload/ycsb: default --read-modify-write-in-txn to false r=erikgrinaker a=nvanbenschoten

The `read-modify-write-in-txn` flag was added to ycsb in #103117. This commit changes the default of the flag from true to false. This makes the default configuration of ycsb more closely mirror the official ycsb implementation, to avoid confusion when external users run the workload and compare.

We saw in #103117 and in #107517 that this improves throughput substantially. We will expect to see the same in roachperf once this change is merged. After merging the PR, I will add a roachperf annotation.

Epic: None
Release note: None

107549: persistedsqlstats: add AOST clause to statement statistics size check r=maryliag a=THardy98

Epic: None

This change as an AOST clause to the statement statistics size check.
This helps mitigate contention on the `system.statement_statistics`
table which has caused the sql stats compaction job to fail.

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Thomas Hardy <thardy@cockroachlabs.com>
Closes cockroachdb#107112.

This commit adds the following six roachtest variants:
```
ycsb/A/nodes=3/cpu=32/isolation-level=read-committed
ycsb/B/nodes=3/cpu=32/isolation-level=read-committed
ycsb/C/nodes=3/cpu=32/isolation-level=read-committed
ycsb/D/nodes=3/cpu=32/isolation-level=read-committed
ycsb/E/nodes=3/cpu=32/isolation-level=read-committed
ycsb/F/nodes=3/cpu=32/isolation-level=read-committed
```

Release note: None
@nvb nvb force-pushed the nvanbenschoten/ycsbIsoLevel branch from 5bea524 to 2a335a2 Compare July 26, 2023 15:00
@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Jul 26, 2023

I updated this PR to reflect the change to the default value of --read-modify-write-in-txn in #107537. @michae2 PTAL when you get a chance.

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

Reviewed 2 of 2 files at r1, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan, @nvanbenschoten, and @smg260)


pkg/workload/connection.go line 123 at r1 (raw file):

	// NOTE: validation of the isolation level value is done by the server during
	// connection establishment.
	return setUrlParam(urls, "default_transaction_isolation", isoLevel)

If isoLevel is snapshot, should this also set sql.txn.snapshot_isolation.enabled?

Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r=michae2

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan, @michae2, and @smg260)


pkg/workload/connection.go line 123 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

If isoLevel is snapshot, should this also set sql.txn.snapshot_isolation.enabled?

If there was a session variable to enable snapshot isolation, I think it would be a good idea to set it here. As is, we only have the cluster setting, and we generally don't like workloads silently manipulating cluster settings, so I'll leave for now. Please let me know if you'd like me to revisit.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jul 26, 2023

Build succeeded:

@craig craig bot merged commit 284b6c0 into cockroachdb:master Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

roachtest: support ycsb under read committed

3 participants