Skip to content

storage: Use optimized SeekGE in CheckSSTConflicts#73981

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
itsbilal:revert-prefix-iterate
Jan 12, 2022
Merged

storage: Use optimized SeekGE in CheckSSTConflicts#73981
craig[bot] merged 2 commits intocockroachdb:masterfrom
itsbilal:revert-prefix-iterate

Conversation

@itsbilal
Copy link
Copy Markdown
Contributor

@itsbilal itsbilal commented Dec 17, 2021

This nearly reverts #73514 by moving back to calling SeekGE on
the engine to skip past any empty spans on either the engine
or the SSTable. This is the more optimal approach on average,
and given optimizations in cockroachdb/pebble#1412 which this
change depends on, it also ends up performing better than
a SeekPrefixGE-driven appraoch and the pre-#73514 approach.

Improvement when running BenchmarkCheckSSTConflicts
against the pre-#73514 revision (vs. this one):

name                                                                      old time/op  new time/op  delta
CheckSSTConflicts/keys=1000/versions=8/sstKeys=1000/overlap=false-24      72.6µs ±11%  66.3µs ± 1%   -8.67%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=1000/overlap=true-24       12.2ms ± 1%   1.7ms ± 1%  -86.41%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=10000/overlap=false-24     69.8µs ± 2%  67.4µs ± 1%   -3.48%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=10000/overlap=true-24      13.3ms ± 3%   2.8ms ± 1%  -78.97%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=1000/overlap=false-24     75.8µs ± 3%  63.8µs ± 1%  -15.86%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=1000/overlap=true-24      13.0ms ± 1%   1.9ms ± 1%  -85.11%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=10000/overlap=false-24    69.8µs ±11%  64.6µs ± 1%   -7.45%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=10000/overlap=true-24     14.8ms ± 9%   3.1ms ± 2%  -79.05%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=1000/overlap=false-24     66.1µs ± 2%  63.7µs ± 1%   -3.65%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=1000/overlap=true-24      14.2ms ± 9%   1.9ms ± 1%  -86.55%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=10000/overlap=false-24    72.3µs ±10%  64.5µs ± 0%  -10.77%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=10000/overlap=true-24      122ms ± 2%    17ms ± 1%  -86.03%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=1000/overlap=false-24    69.0µs ± 9%  62.4µs ± 1%   -9.57%  (p=0.032 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=1000/overlap=true-24     14.0ms ± 1%   2.3ms ± 2%  -83.46%  (p=0.016 n=4+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=10000/overlap=false-24   69.4µs ± 9%  62.7µs ± 1%   -9.63%  (p=0.016 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=10000/overlap=true-24     140ms ± 5%    26ms ± 1%  -81.70%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=1000/overlap=false-24    69.2µs ±10%  62.5µs ± 1%   -9.66%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=1000/overlap=true-24     15.3ms ±11%   2.3ms ± 1%  -85.21%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=10000/overlap=false-24   69.7µs ±12%  63.6µs ± 1%     ~     (p=0.095 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=10000/overlap=true-24     148ms ± 6%    28ms ± 2%  -80.90%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=1000/overlap=false-24   67.1µs ±10%  61.1µs ± 2%   -8.93%  (p=0.016 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=1000/overlap=true-24    14.4ms ± 2%   2.5ms ± 5%  -82.45%  (p=0.016 n=4+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=10000/overlap=false-24  68.9µs ±21%  62.2µs ± 1%   -9.76%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=10000/overlap=true-24    204ms ±14%    42ms ± 5%  -79.44%  (p=0.008 n=5+5)

Fixes #66410.

Release note: None.

@itsbilal itsbilal requested review from dt and sumeerbhola December 17, 2021 17:27
@itsbilal itsbilal requested a review from a team as a code owner December 17, 2021 17:27
@itsbilal itsbilal self-assigned this Dec 17, 2021
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@itsbilal itsbilal added the do-not-merge bors won't merge a PR with this label. label Dec 17, 2021
@itsbilal
Copy link
Copy Markdown
Contributor Author

This can't be merged until cockroachdb/pebble#1412 lands, and the vendor ref is bumped in Cockroach.

@erikgrinaker
Copy link
Copy Markdown
Contributor

I ran a single TPCE import benchmark with and without this patch. For posterity, the import workload is:

CREATE TABLE trade (
    t_id          INT8 NOT NULL PRIMARY KEY,
    t_dts         TIMESTAMP NOT NULL,
    t_st_id       VARCHAR(4) NOT NULL,
    t_tt_id       VARCHAR(3) NOT NULL,
    t_is_cash     BOOL NOT NULL,
    t_s_symb      VARCHAR(15) NOT NULL,
    t_qty         INT4 NOT NULL CHECK (t_qty > 0),
    t_bid_price   DECIMAL(8,2) NOT NULL CHECK (t_bid_price > 0),
    t_ca_id       INT8 NOT NULL,
    t_exec_name   VARCHAR(49) NOT NULL,
    t_trade_price DECIMAL(8,2),
    t_chrg        DECIMAL(10,2) NOT NULL CHECK (t_chrg >= 0),
    t_comm        DECIMAL(10,2) NOT NULL CHECK (t_comm >= 0),
    t_tax         DECIMAL(10,2) NOT NULL CHECK (t_tax  >= 0),
    t_lifo        BOOL NOT NULL,
    FAMILY (t_id),
    FAMILY (t_comm, t_dts, t_st_id, t_trade_price, t_exec_name, t_tt_id, t_is_cash, t_s_symb, t_qty, t_bid_price, t_ca_id, t_chrg, t_lifo),
    FAMILY (t_tax)
);
CREATE INDEX ON trade (t_ca_id, t_dts DESC) STORING (t_st_id, t_tt_id, t_is_cash, t_s_symb, t_qty, t_bid_price, t_exec_name, t_trade_price, t_chrg);
CREATE INDEX ON trade (t_s_symb, t_dts ASC) STORING (t_ca_id, t_exec_name, t_is_cash, t_trade_price, t_qty, t_tt_id);
IMPORT INTO trade CSV DATA ('gs://cockroach-fixtures/tpce-csv/customers=100000/*/Trade.txt?AUTH=implicit') WITH delimiter = '|', DETACHED;

This took 4h43m46s without this patch, and 3h54m37s with it -- a pretty significant reduction of 17.3%. A few relevant graphs, where the first import started at 21:40 and the second one at 08:55, show that we're getting better resource utilization both in terms of CPU and IO:

Screenshot 2021-12-20 at 13 55 15

Screenshot 2021-12-20 at 13 55 25

Screenshot 2021-12-20 at 13 55 33

Screenshot 2021-12-20 at 13 56 14

I'll do another run and see if the results are consistent.

@erikgrinaker
Copy link
Copy Markdown
Contributor

Second run was consistent: 4h55m23s vs 4h11m23s, so a 14.9% reduction. Nice work!

@itsbilal itsbilal force-pushed the revert-prefix-iterate branch from d6e9e0d to 4c6df3e Compare January 12, 2022 17:33
@itsbilal
Copy link
Copy Markdown
Contributor Author

First commit is #74745 , which this change depends on, but otherwise this PR is ready for review.

@itsbilal itsbilal force-pushed the revert-prefix-iterate branch from 4c6df3e to 095addc Compare January 12, 2022 17:46
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @itsbilal)


pkg/storage/sst_iterator.go, line 95 at r2 (raw file):

	trySeekUsingNext := false
	if r.seekGELastOp {
		// trySeekUsingNext = r.prevSeekKey <= key

leftover code?


pkg/storage/sst_iterator.go, line 97 at r2 (raw file):

		// trySeekUsingNext = r.prevSeekKey <= key
		trySeekUsingNext = !key.Less(r.prevSeekKey)
	}

maybe add a comment like
// NB: seekGELastOp may still be true, and we haven't updated prevSeekKey. So be careful not to return before the end of the function that sets these fields up for the next SeekGE.

Changes pulled in:

```
3d0ff924d13a3d5fdf6e56a391c5c178c18ff196 *: Add trySeekUsingNext optimization to SeekGE
0c503048eb0365981929177c30178add8a56ae3e sstable: add (*sstable.Writer).RangeKey{Set,Unset,Delete} methods
fe52b49cc28df62dce9b00c382a5ce217936be56 tool/logs: aggregate compaction logs by node and store ID
8ab4358bc59dfa62e5e34e4b0e5ce81a68f5fe91 sstable: return err from `(BlockPropertyCollector).FinishTable`
91c18ef0ee999980c2869d11e5ce468410acbe8d internal/keyspan: add FragmentIterator interface
953fdb078ff0585489206ae96e1d80ca9f6f90c7 internal/keyspan: implement SetBounds on Iter
aa376a819bf67cd6766ee827feed4bf0bd508f1f tool: add compaction log event aggregation tool
```

Release note: None.
This nearly reverts cockroachdb#73514 by moving back to calling SeekGE on
the engine to skip past any empty spans on either the engine
or the SSTable. This is the more optimal approach on average,
and given optimizations in cockroachdb/pebble#1412 which this
change depends on, it also ends up performing better than
a SeekPrefixGE-driven appraoch and the pre-cockroachdb#73514 approach.

Improvement when running BenchmarkCheckSSTConflicts
against the pre-cockroachdb#73514 revision (vs. this one):

```
name                                                                      old time/op  new time/op  delta
CheckSSTConflicts/keys=1000/versions=8/sstKeys=1000/overlap=false-24      72.6µs ±11%  66.3µs ± 1%   -8.67%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=1000/overlap=true-24       12.2ms ± 1%   1.7ms ± 1%  -86.41%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=10000/overlap=false-24     69.8µs ± 2%  67.4µs ± 1%   -3.48%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=8/sstKeys=10000/overlap=true-24      13.3ms ± 3%   2.8ms ± 1%  -78.97%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=1000/overlap=false-24     75.8µs ± 3%  63.8µs ± 1%  -15.86%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=1000/overlap=true-24      13.0ms ± 1%   1.9ms ± 1%  -85.11%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=10000/overlap=false-24    69.8µs ±11%  64.6µs ± 1%   -7.45%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=1000/versions=64/sstKeys=10000/overlap=true-24     14.8ms ± 9%   3.1ms ± 2%  -79.05%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=1000/overlap=false-24     66.1µs ± 2%  63.7µs ± 1%   -3.65%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=1000/overlap=true-24      14.2ms ± 9%   1.9ms ± 1%  -86.55%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=10000/overlap=false-24    72.3µs ±10%  64.5µs ± 0%  -10.77%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=8/sstKeys=10000/overlap=true-24      122ms ± 2%    17ms ± 1%  -86.03%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=1000/overlap=false-24    69.0µs ± 9%  62.4µs ± 1%   -9.57%  (p=0.032 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=1000/overlap=true-24     14.0ms ± 1%   2.3ms ± 2%  -83.46%  (p=0.016 n=4+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=10000/overlap=false-24   69.4µs ± 9%  62.7µs ± 1%   -9.63%  (p=0.016 n=5+5)
CheckSSTConflicts/keys=10000/versions=64/sstKeys=10000/overlap=true-24     140ms ± 5%    26ms ± 1%  -81.70%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=1000/overlap=false-24    69.2µs ±10%  62.5µs ± 1%   -9.66%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=1000/overlap=true-24     15.3ms ±11%   2.3ms ± 1%  -85.21%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=10000/overlap=false-24   69.7µs ±12%  63.6µs ± 1%     ~     (p=0.095 n=5+5)
CheckSSTConflicts/keys=100000/versions=8/sstKeys=10000/overlap=true-24     148ms ± 6%    28ms ± 2%  -80.90%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=1000/overlap=false-24   67.1µs ±10%  61.1µs ± 2%   -8.93%  (p=0.016 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=1000/overlap=true-24    14.4ms ± 2%   2.5ms ± 5%  -82.45%  (p=0.016 n=4+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=10000/overlap=false-24  68.9µs ±21%  62.2µs ± 1%   -9.76%  (p=0.008 n=5+5)
CheckSSTConflicts/keys=100000/versions=64/sstKeys=10000/overlap=true-24    204ms ±14%    42ms ± 5%  -79.44%  (p=0.008 n=5+5)
```

Fixes cockroachdb#66410.

Release note: None.
@itsbilal itsbilal force-pushed the revert-prefix-iterate branch from 095addc to 65cc09e Compare January 12, 2022 19:28
Copy link
Copy Markdown
Contributor Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @sumeerbhola)


pkg/storage/sst_iterator.go, line 95 at r2 (raw file):

Previously, sumeerbhola wrote…

leftover code?

Nope, it's just writing the below line in mathematical form, as the code form is a little weird with the !Less.


pkg/storage/sst_iterator.go, line 97 at r2 (raw file):

Previously, sumeerbhola wrote…

maybe add a comment like
// NB: seekGELastOp may still be true, and we haven't updated prevSeekKey. So be careful not to return before the end of the function that sets these fields up for the next SeekGE.

Done.

@itsbilal itsbilal removed the do-not-merge bors won't merge a PR with this label. label Jan 12, 2022
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @dt, @itsbilal, and @sumeerbhola)


pkg/storage/sst_iterator.go, line 95 at r2 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Nope, it's just writing the below line in mathematical form, as the code form is a little weird with the !Less.

ah yes, I was too hasty in my reading.

@itsbilal
Copy link
Copy Markdown
Contributor Author

Thanks! Seeing as both this and #74745 are green and approved, I'll merge just this PR to save time.

bors r=sumeerbhola

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 12, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 12, 2022

Build succeeded:

@craig craig bot merged commit f13c81b into cockroachdb:master Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bulk: adaptively probe into existing data in AddSSTableRequests.checkForKeyCollisions

4 participants