
perf: investigate slow performance degradation #14108


Description

@petermattis

Running ycsb --concurrency 600 --splits 5000 against denim (a 6-node cluster) shows the following throughput:

[screenshot: throughput (ops/sec) over time, 2017-03-13]

Similarly, latencies slowly climb:

[screenshot: latencies over time, 2017-03-13]

Pre-splitting the ycsb table into 5000 ranges means the number of ranges stays constant over the lifetime of the test. The most interesting metrics that show an increase and could account for this performance decline are the disk metrics:

[screenshots: disk read/write metrics over time, 2017-03-13]

Each node is configured with the default cache size of 1/4 of physical memory, which in this case is 7GB. Each ycsb write is ~1KB in size. Writing at 2K ops/sec should generate ~7GB/hour, and the graphs show we had generated ~31GB when disk reads started. Are reads starting to miss in the cache? That's somewhat surprising given the skewed distribution for reads. Perhaps the system just reached a point where it is doing a significant number of background compactions continuously, and those compactions are impacting foreground work.
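A back-of-envelope check of the numbers above (a sketch in Python; the rates and sizes are taken straight from this issue, and it assumes every write is to a distinct ~1KB key):

```python
# Values from the issue: 2K writes/sec at ~1KB each, 7GB block cache per node.
write_ops_per_sec = 2000
op_size_bytes = 1024
cache_bytes = 7 * 2**30

bytes_per_hour = write_ops_per_sec * op_size_bytes * 3600
gb_per_hour = bytes_per_hour / 2**30          # ~6.9 GB/hour, i.e. the "~7GB/hour" estimate
hours_to_fill_cache = cache_bytes / bytes_per_hour   # ~1 hour until writes alone exceed cache
hours_until_disk_reads = 31 * 2**30 / bytes_per_hour # ~4.5 hours of writes behind the ~31GB mark
```

So the data set passes the cache size after roughly an hour of writing, yet disk reads only appear after several times that much data has accumulated, which is why cache misses alone look like an incomplete explanation and background compactions are a plausible suspect.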

Metadata

Labels

C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
