storage: some benchmarks fail in CI with crash in newPebbleIterator #97061

@RaduBerinde

Some benchmarks in the CRDB repository generate fixtures in the package directory and reuse them if they already exist; these fixtures are .gitignored so they survive branch checkouts.

The reason for this is twofold: re-running the benchmarks is much faster and, more importantly, the results are more consistent because every run uses exactly the same data.

There are two big problems:

  • this technique won't work with Bazel, where the test runs inside a sandbox; currently, engineers who rely on these fixtures are using the deprecated Makefile
  • old fixtures remain on CI (I'm assuming from the non-Bazel-based CI), where they can cause failures (even when tests run under Bazel; see example below). These failures became more frequent recently, after we bumped up the minimum supported version on master and started cleaning out deprecated compatibility code.

Examples: engineccl.BenchmarkTimeBoundIterate (currently disabled), batcheval.BenchmarkRefreshRange, many benchmarks in storage.

We need to:

  1. find a short-term fix to prevent these CI failures (could be ignoring these benchmarks for now)
  2. find a way to allow reuse of fixtures locally, even under bazel; this is already tracked under storage: MVCC benchmarks unreliable under Bazel #83599

An example of CI failure is here: https://teamcity.cockroachdb.com/viewLog.html?buildId=8689881&tab=buildResultsDiv&buildTypeId=Cockroach_Ci_Tests_Bench

The panic is "pebble: range keys require at least format major version .." which means that the store format is very old.

F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	GOROOT/src/runtime/panic.go:884 +0x212
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/pebble.(*DB).newIterInternal(0xc000908f00, 0x0, 0x0, 0xc00147edc8)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go:943 +0x4d9
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/pebble.(*DB).NewIter(0x0?, 0xc000ab9980?)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/pebble/external/com_github_cockroachdb_pebble/db.go:1251 +0x25
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/storage.newPebbleIterator({0x5b6ff50, 0xc000908f00}, {0x0, {0xc000ab9980, 0x5, 0x8}, {0xc000ab99a0, 0x8, 0x8}, 0x0, ...}, ...)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/storage/pebble_iterator.go:90 +0xca
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/storage.(*Pebble).NewMVCCIterator(0xc000478480?, 0x0?, {0x0, {0xc000ab9980, 0x5, 0x8}, {0xc000ab99a0, 0x8, 0x8}, 0x0, ...})
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/storage/pebble.go:1273 +0x168
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/storage.NewMVCCIncrementalIterator({0x5bb0b20, 0xc000478480}, {0x2, {0xc000ab9980, 0x5, 0x8}, {0xc000ab99a0, 0x8, 0x8}, {0x5, ...}, ...})
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/storage/mvcc_incremental_iterator.go:212 +0x3a6
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval.refreshRange({0x5bb0b20?, 0xc000478480?}, {{0xc000ab9980, 0x5, 0x8}, {0xc000ab99a0, 0x8, 0x8}}, {0x5, 0x0, ...}, ...)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/cmd_refresh_range.go:75 +0x198
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval.RefreshRange({_, _}, {_, _}, {{0x5c01e30, 0xc0001149f0}, {{0x5, 0x1, 0x0}, 0x0, ...}, ...}, ...)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/cmd_refresh_range.go:58 +0x62f
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test.runRefreshRangeBenchmark.func1({0x5b775d0, 0xc000082068}, {0x5c0e510?, 0xc000478480}, {0x5c01e30, 0xc0001149f0}, {0xc000ab9980, 0x5, 0x8}, {0xc000ab99a0, ...}, ...)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test/pkg/kv/kvserver/batcheval/cmd_refresh_range_bench_test.go:131 +0x285
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test.runRefreshRangeBenchmark(0xc000712240, 0xc001491700?, {{0x5, 0x0, 0x0}, {0x5, 0x1, 0x0}, {0xf4240, 0x40, ...}})
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test/pkg/kv/kvserver/batcheval/cmd_refresh_range_bench_test.go:161 +0x2b9
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test.BenchmarkRefreshRange.func1.1(0xc000712240?)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval_test/pkg/kv/kvserver/batcheval/cmd_refresh_range_bench_test.go:106 +0xef
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !testing.(*B).runN(0xc000712240, 0x1)
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	GOROOT/src/testing/benchmark.go:193 +0x102
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !testing.(*B).run1.func1()
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	GOROOT/src/testing/benchmark.go:233 +0x59
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !created by testing.(*B).run1
F230213 18:26:27.799711 494 storage/pebble.go:1223 ⋮ [T1,s?,pebble] 2 !	GOROOT/src/testing/benchmark.go:226 +0x9c

CC @jbowens

Jira issue: CRDB-24486

Labels

  • C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
  • C-test-failure: Broken test (automatically or manually discovered).