Skip to content

sql/catalog/descs: fix perf regression #72740

Merged
craig[bot] merged 3 commits intocockroachdb:masterfrom
ajwerner:ajwerner/fix-perf-regression
Nov 15, 2021
Merged

sql/catalog/descs: fix perf regression #72740
craig[bot] merged 3 commits intocockroachdb:masterfrom
ajwerner:ajwerner/fix-perf-regression

Conversation

@ajwerner
Copy link
Copy Markdown
Contributor

This commit in #71936 had the unfortunate side-effect of allocating and forcing reads on the uncommittedDescriptors set even when we aren't looking for the system database. This has an outsized impact on the performance of the single-node, high-core-count KV runs. Instead of always initializing the system database, just do it when we access it.

name             old ops/s   new ops/s   delta
KV95-throughput  88.6k ± 0%  94.8k ± 1%   +7.00%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)

The second commit is more speculative and came from looking at a profile where 1.6% of the allocated garbage was due to that NameInfo even though we'll never, ever hit it.

Screen Shot 2021-11-15 at 12 57 31 AM

Fixes #72499

@ajwerner ajwerner requested review from a team and postamar November 15, 2021 05:58
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

```
name             old ops/s   new ops/s   delta
KV95-throughput  88.6k ± 0%  94.8k ± 1%   +7.00%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)
```

Fixes cockroachdb#72499.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/fix-perf-regression branch from f9eb756 to fdc1c88 Compare November 15, 2021 13:08
@ajwerner
Copy link
Copy Markdown
Contributor Author

With the second commit:

name             old ops/s   new ops/s   delta
KV95-throughput  88.7k ± 0%  95.7k ± 0%   +7.82%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.60 ± 0%   1.40 ± 0%  -12.50%  (p=0.008 n=5+5)
KV95-Avg          0.60 ± 0%   0.50 ± 0%  -16.67%  (p=0.008 n=5+5)

This committ takes two steps to decrease lock contention. Firstly, it makes
the name cache data structure concurrent for readers with a RWLock. This
goes a pretty long way to diminish contention on the acquisition path. The
second change is to reduce the contention footprint when mucking with the
refcounts by doing less work in the common case.

Benchmark `kv95/nodes=1/cpu=32` delta over previous commit:
```
name             old ops/s   new ops/s   delta
KV95-throughput  96.4k ± 0%  99.6k ± 1%  +3.36%  (p=0.008 n=5+5)

name             old ms/s    new ms/s    delta
KV95-P50          1.40 ± 0%   1.40 ± 0%    ~     (all equal)
KV95-Avg          0.50 ± 0%   0.50 ± 0%    ~     (all equal)
```

Release note: None
Copy link
Copy Markdown
Collaborator

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 4 of 4 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar)

Copy link
Copy Markdown

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all seems uncontroversially good to me.

A thought popped to mind after seeing your handling of the system db: this is undoubtedly worth it because the system db descriptor shows up in the uncommitted descriptor set in a great many transactions, but how about extending this treatment to all descriptors which are accessed only for reads? Such as things are, we always stuff them in the uncommitted descriptor set regardless if we want them mutable or not. It seems to me that we could break down the uncommitted layer in the collection into two sub-layers: a "cached immutable" layer and a real "uncommitted" layer. Reading a descriptor as mutable would evict it from the "cached immutable" layer. The latter could be pre-populated with the system db descriptor. Is there anything preventing us from doing this?

@ajwerner
Copy link
Copy Markdown
Contributor Author

Such as things are, we always stuff them in the uncommitted descriptor set regardless if we want them mutable or not. It seems to me that we could break down the uncommitted layer in the collection into two sub-layers: a "cached immutable" layer and a real "uncommitted" layer. Reading a descriptor as mutable would evict it from the "cached immutable" layer. The latter could be pre-populated with the system db descriptor. Is there anything preventing us from doing this?

Nothing preventing us from doing this. Will pull it into a separate PR.

@ajwerner
Copy link
Copy Markdown
Contributor Author

Flaked on #72718.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 15, 2021

Build succeeded:

@craig craig bot merged commit 47a2e2f into cockroachdb:master Nov 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

roachperf: kv95 and kv0 regression around October 28th

4 participants