-
Notifications
You must be signed in to change notification settings - Fork 4.1k
release: 21.1 microbenchmark regression investigation and sign-off #62212
Description
This time, split out roughly by team! Of the 568 Go microbenchmarks in the cockroachdb/cockroach repository, a large handful have statistically significant regressions between the release-20.2 branch and the master branch.
We should determine the cause of each of these regressions and decide on whether they are deliberate or accidental and warrant further investigation.
Most of this process was automated with benchdiff. This tool can also be helpful to explore specific regressions here using a command like:
go get github.com/nvanbenschoten/benchdiff
benchdiff --new=master --old=release-20.2 --count=10 --run=<your_benchmark> <your_package>Or, to automate the process of bisecting a regression, the tool can be used with a command like:
export test_name='BenchmarkEntryCacheClearTo'
export pkg_name='./pkg/kv/kvserver/raftentry'
export threshold=0.1
export base_sha=18b6937745c339264314413de61500e58253d9ee # near release-20.2, but easier to compile
# careful with this! It's an aggressive clean.
echo 'git clean -e benchdiff -e post-checkout.sh -dxff' >> post-checkout.sh
echo 'make buildshort' >> post-checkout.sh
git bisect start master ${base_sha}
git bisect run benchdiff --old=${base_sha} --post-checkout='bash post-checkout.sh' --run=${test_name} --count=20 --threshold=${threshold} ${pkg_name}SQL
Packages:
./pkg/bench/...
./pkg/ccl/logictestccl
./pkg/ccl/oidcccl
./pkg/ccl/partitionccl
./pkg/ccl/sqlproxyccl
./pkg/col/...
./pkg/config/...
./pkg/sql/...
./pkg/sqlmigrations/...
Regressions (check to sign off):
-
KV/Update/SQL, @nvanbenschoten, bisected to 5910ab4, test changed, unclear why allocs/op changed, but 👍 -
Distinct/Ordered/..., @yuzefovich, bisected to 76968ff, the benchmark changed slightly (the source incurs more allocations now), yet the throughput has actually improved -
Distinct/Unordered/..., @yuzefovich signed off because the regression is expected and is only on a small subset of cases; bisected on the first try to 76968ff, on the second try to d9f7bcb. The true cause is actually 11e6312. -
TableReader/rows=16, @postamar + @ajwerner, bisected to 15ee387, this one looks real and serious, over a 25% reduction in throughput. After 15ee387 is addressed, there still remains 10% hit or so because of cee3bcf which is discussed elsewhere. @yuzefovich signed off -
Router/MIRROR, not an actual regression, the difference is explained byTAGS=crdb_testbeing added if the benchmark binary built viamake test -
Append/bytes/AppendWithSel/NullProbability=0.2, can't reproduce -
MergeJoiner, not reproducible -
JoinReader/reqOrdering=false@yuzefovich signed off, same asTableReader/rows=16above. -
LogicalProjOp/OR,useSel=true,hasNulls=true, @yuzefovich signed off since this benchmark is known to be a bit flaky. The absolute speed is still very high, not worth digging in. -
RowChannelPipeline/length=4, no regression, same asRouter/MIRRORabove -
SortAll/rows=16, can't reproduce -
AllSpooler, can't reproduce -
SQL/MultinodeCockroach/InterleavedSelect/count=10,@yuzefovich saw 20% hit, out of which 6% or so can be attributed to using the vectorized engine, and the remaining difference likely has the same root cause asthe significant increase in allocs is due to the vectorized engine and is expected, but the usage of vec engine doesn't actually reduce the throughput. Still there is about 6-8% hit when comparing against 20.2. @yuzefovich signed off since it doesn't seem worth looking into.TableReader/rows=16 -
Select2/Cockroach@yuzefovich is investigating. The vectorized engine is responsible for 29% hit in throughput.The perf hit is reduced to 13% by colexec: optimize CASE, AND, and OR projection operators #62925, the remaining difference is caused by the fact the CASE, AND/OR logical operations have higher overhead in the vectorized engine, and this benchmark is basically the worst case for it. @yuzefovich signed off. -
VirtualTableQueries/select_crdb_internal.tables_with_1_fk@rafiss investigated and found no culprit. since this test is new in 21.1 (but got backported to 20.2) it's not that easy to bisect. also, theroundtripsstat did decrease, which is more important since this is I/O bound not CPU/mem bound. -
EndToEnd/tpcc-new-order/EndToEnd@rytaft bisected to sql: decrease vectorize_row_count_threshold to 0 #55713, "decrease vectorize_row_count_threshold to 0". -
IsDatumEqual@rytaft investigated, and this is not a regression. If anything, looks like a small improvement. -
WideTableIgnoreColumns/Cockroach, actually an improvement, likely same asRouter/MIRRORabove -
Scan/Cockroach/count=1000/limit=10@yuzefovich signed off. The 2x or so hit is explained by the switch to the vectorized engine by default for everything, and it is mitigated substantially by sql: hint scan batch size by expected row count #62282. The remaining difference is likely explained bymitigated by colfetcher: use the limit hint to size the batch #63020TableReader/rows=16above. -
Planning@rytaft investigated, at least half of the regression is due to using the vectorized engine. Couldn't pin the remainder of the regression to a single commit. -
VecSkipScan/Bench_scan_with_skip, actually an improvement, likely same asRouter/MIRRORabove -
AlterTableAddCheckConstraint/alter_table_add_3_check_constraints -
NameResolution/MultinodeCockroach@rafiss investigated; bisected to sql: decrease vectorize_row_count_threshold to 0 #55713 (vectorize_row_count_threshold=0). seems expected, and also doesn't matter since the test is supposed to be testing name resolution. -
SQL/Cockroach/InsertFK/count=1000/nFks=1@RaduBerinde bisected to *: configure the use of separated intents in a cluster #59829, more info in there. -
Phases/kv-read-const@mgartner bisected to opt: use safer struct initialization pattern #58386, mitigated partially by opt: speed up new init pattern #62767 -
IndexConstraints@mgartner bisected to opt/idxconstraint: recognize stored column expressions #55867 and opt: use safer struct initialization pattern #58386, regression is minimal and unavoidable - many more across
time/op,alloc/op,allocs/op, andspeed.roundtripslooks good though!
Bulk IO
Packages:
./pkg/blobs
./pkg/ccl/backupccl # not run, https://github.com/cockroachdb/cockroach/issues/62083
./pkg/ccl/baseccl
./pkg/ccl/cliccl
./pkg/ccl/importccl # not run, https://github.com/cockroachdb/cockroach/issues/62085
./pkg/jobs/...
Regressions (check to sign off):
None, though the two most important packages here are skipped.
KV
Packages:
./pkg/ccl/kvccl/...
./pkg/ccl/serverccl
./pkg/gossip/...
./pkg/kv/...
./pkg/kv/kvserver/gc # not run, too slow
./pkg/roachpb
./pkg/rpc/...
./pkg/server/...
Regressions (check to sign off):
-
BenchmarkEntryCache: @nvanbenschoten, bisected to 6e4c3ae, fix in raft: disable XXX_NoUnkeyedLiteral, XXX_unrecognized, and XXX_sizecache fields in protos etcd-io/etcd#12790 -
BenchmarkEntryCacheClearTo: @nvanbenschoten, bisected to 6e4c3ae, fix in raft: disable XXX_NoUnkeyedLiteral, XXX_unrecognized, and XXX_sizecache fields in protos etcd-io/etcd#12790 -
SystemConfigDeltaFilter: @nvanbenschoten, bisected to 836f39c,ZoneConfigmemory size increased, 👍
Storage
Packages:
./pkg/ccl/storageccl/...
./pkg/storage/...
Regressions (check to sign off):
-
PebbleMapIteration -
ScanDecodeKeyValue/getTs=true: @nvanbenschoten, bisected to 11abdda, 1ns difference due to new switch case inDecodeKey, 👍
Misc
Packages:
./pkg/base
./pkg/ccl/utilccl/...
./pkg/ccl/workloadccl/...
./pkg/cli/...
./pkg/clusterversion
./pkg/cmd/...
./pkg/internal/...
./pkg/security
./pkg/settings
./pkg/testutils/...
./pkg/ts/...
./pkg/util/...
./pkg/workload/...
Regressions (check to sign off):
-
BenchmarkHeader: @knz, bisected to a74b36a -
BTreeDeleteInsertCloneEachTime/count=65536/reset=false: @nvanbenschoten, broken benchmark, fixed in util/interval: fix BenchmarkBTreeDeleteInsertCloneEachTime #62294 -
ParseTimestampComparison/2003-06-12_04:05:06.789_America/New_York/ParseInLocation: @otan, bisected to 90efaaf