
Problems with BenchmarkLoadWAL #9101

@bboreham

I looked very closely at what BenchmarkLoadWAL was doing, and made a list of questionable behaviours.
I expect a benchmark to be somewhat similar to what the real Prometheus application will do.

I am preparing a PR to improve the benchmark, but the list was getting so long that I thought I should write it down before I forget.

1. BenchmarkLoadWAL does not simulate existing mmapped chunks.

This was noted at #8645 (comment); it means the benchmark cannot be used to compare optimisations to the code which deals with mmapped chunks.

2. BenchmarkLoadWAL creates new chunks while reading the WAL.

Real Prometheus will not do this, because those chunks would be on disk already.
It happens in the benchmark because the code that populates dummy data adds 7200 samples without creating any chunks:

samplesPerSeries: 7200,

while the WAL-reading code has 120 samples per chunk hard-coded:
const samplesPerChunk = 120

and this gets translated into a limit on the end timestamp for the chunk:

prometheus/tsdb/head.go

Lines 2489 to 2493 in a1c1313

if numSamples == samplesPerChunk/4 {
s.nextAt = computeChunkEndTime(c.minTime, c.maxTime, s.nextAt)
}
if t >= s.nextAt {
c = s.cutNewHeadChunk(t, chunkDiskMapper)

There is also a hard-coded chunk range of 1000 ms (1 second) that feeds into that calculation, but even after fixing that, the 120-sample limit still dominates:
opts.ChunkRange = 1000
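
To get a feel for the scale of point 2, here is a rough back-of-the-envelope sketch. It assumes the sample-count estimate is what decides when a chunk is cut, which is a simplification: the real code cuts on a timestamp estimate (`s.nextAt`), not on an exact count. The function name `chunksPerSeries` is made up for illustration.

```go
package main

import "fmt"

// chunksPerSeries estimates how many head chunks one series needs when each
// chunk holds about perChunk samples. It rounds up, since a partial chunk
// still has to be cut. Hypothetical helper, not part of Prometheus.
func chunksPerSeries(samples, perChunk int) int {
	return (samples + perChunk - 1) / perChunk
}

func main() {
	// 7200 samples per series, 120 samples per chunk, as quoted above.
	fmt.Println(chunksPerSeries(7200, 120)) // prints 60
}
```

So under that assumption each series cuts on the order of 60 chunks during replay, work that a real Prometheus would mostly not repeat at startup.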

3. BenchmarkLoadWAL has very uneven sharding.

This happens because series references are created as multiples of 100:

Ref: uint64(k) * 100,

Then we take this number modulo GOMAXPROCS:
n = runtime.GOMAXPROCS(0)

and assign the sample to that shard:

prometheus/tsdb/head.go

Lines 707 to 708 in a1c1313

mod := sam.Ref % uint64(n)
shards[mod] = append(shards[mod], sam)

With 4 cores we use only 1 shard, because every multiple of 100 is divisible by 4; with 8 cores only 2 (remainders 0 and 4), and so on.
Real Prometheus assigns series IDs sequentially, so they tend to spread evenly across the shards.
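
The shard counts above can be checked with a small standalone sketch. It only mimics the `Ref % n` step quoted from head.go; the helper name `shardsUsed` and the series count of 1000 are assumptions made for illustration.

```go
package main

import "fmt"

// shardsUsed counts how many of n shards receive at least one series when
// refs are generated as k*step for k in [0, numSeries).
// Hypothetical helper, not part of Prometheus.
func shardsUsed(numSeries int, step, n uint64) int {
	seen := make(map[uint64]struct{})
	for k := 0; k < numSeries; k++ {
		ref := uint64(k) * step
		seen[ref%n] = struct{}{}
	}
	return len(seen)
}

func main() {
	for _, n := range []uint64{4, 8, 16} {
		fmt.Printf("n=%d: refs k*100 hit %d shards, sequential refs hit %d\n",
			n, shardsUsed(1000, 100, n), shardsUsed(1000, 1, n))
	}
}
```

With a step of 100 the number of shards in use is n divided by gcd(100, n), so it stays far below n for the power-of-two core counts that are common in practice, while sequential refs reach every shard.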

4. BenchmarkLoadWAL leaks goroutines.

This is because the WAL is never closed, so its background goroutine is still running when the benchmark finishes:

...
BenchmarkLoadWAL/batches=10,seriesPerBatch=1000,samplesPerSeries=480,exemplarsPerSeries=24-4                   1         736243863 ns/op
PASS
goleak: Errors on successful test run: found unexpected goroutines:
[Goroutine 8 in state select, with github.com/prometheus/prometheus/tsdb/wal.(*WAL).run on top of the stack:
goroutine 8 [select]:
github.com/prometheus/prometheus/tsdb/wal.(*WAL).run(0xc00025a000)
        /home/vagrant/src/github.com/prometheus/prometheus/tsdb/wal/wal.go:332 +0xbc
created by github.com/prometheus/prometheus/tsdb/wal.NewSize
        /home/vagrant/src/github.com/prometheus/prometheus/tsdb/wal/wal.go:301 +0x325
...
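
The leak pattern can be reproduced in miniature. The sketch below is not the Prometheus WAL; `fakeWAL` and its methods are hypothetical stand-ins for any type that starts a background goroutine in its constructor and only stops it in `Close`.

```go
package main

import "fmt"

// fakeWAL is a hypothetical stand-in for a type that owns a background
// goroutine, like the run loop in the trace above.
type fakeWAL struct {
	stopc chan struct{} // closed by Close to ask run to exit
	donec chan struct{} // closed by run once it has exited
}

func newFakeWAL() *fakeWAL {
	w := &fakeWAL{stopc: make(chan struct{}), donec: make(chan struct{})}
	go w.run()
	return w
}

// run blocks until Close is called; without Close it never returns,
// which is what a leak detector reports.
func (w *fakeWAL) run() {
	defer close(w.donec)
	<-w.stopc
}

// Close stops the background goroutine and waits for it to exit.
func (w *fakeWAL) Close() {
	close(w.stopc)
	<-w.donec
}

// running reports whether the background goroutine is still alive.
func (w *fakeWAL) running() bool {
	select {
	case <-w.donec:
		return false
	default:
		return true
	}
}

func main() {
	w := newFakeWAL()
	fmt.Println("before Close, running:", w.running())
	w.Close()
	fmt.Println("after Close, running:", w.running())
}
```

In the benchmark the analogous fix would presumably be to close the WAL created for each run before the benchmark function returns.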
