kv: kvnemesis can thrash and livelock on multiple concurrent Range merges #46639

@nvb

Description

When stressing TestKVNemesisSingleNode with a short timeout (2-5 minutes), it's easy to see the test time out. This appears to be due to thrashing within the Range merge transaction.

The Range merge transaction does not use the standard transaction retry mechanism (i.e. epochs). Instead, it uses a completely separate transaction when restarting due to retry errors. This leaves room for thrashing and livelock if multiple transactions keep stepping on each other's toes.
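To make the distinction concrete, here is a toy sketch of the two restart styles. It is not CockroachDB code: `toyTxn`, `errRetry`, `txnFactory`, `failNTimes`, `runWithEpochs`, and `runWithFreshTxns` are all hypothetical names invented for illustration. It only shows the structural difference: an epoch-style restart keeps the same transaction and bumps a counter, while the fresh-transaction style carries nothing over between attempts.

```go
package main

import (
	"errors"
	"fmt"
)

// errRetry stands in for a retryable transaction error (hypothetical).
var errRetry = errors.New("retry")

// toyTxn is a toy stand-in for a transaction, not a CockroachDB type.
type toyTxn struct {
	id    int // identity; a new id means a completely separate transaction
	epoch int // bumped each time the same transaction is restarted
}

// txnFactory returns a constructor that hands out transactions with increasing ids.
func txnFactory() func() *toyTxn {
	next := 0
	return func() *toyTxn {
		next++
		return &toyTxn{id: next}
	}
}

// failNTimes returns an operation that fails with errRetry n times, then succeeds.
func failNTimes(n int) func(*toyTxn) error {
	return func(*toyTxn) error {
		if n > 0 {
			n--
			return errRetry
		}
		return nil
	}
}

// runWithEpochs retries inside one transaction, bumping its epoch on each retry error.
func runWithEpochs(newTxn func() *toyTxn, op func(*toyTxn) error, maxAttempts int) (*toyTxn, error) {
	txn := newTxn()
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err := op(txn)
		if err == nil {
			return txn, nil
		}
		if !errors.Is(err, errRetry) {
			return txn, err
		}
		txn.epoch++ // same transaction, next epoch
	}
	return txn, errors.New("gave up after max attempts")
}

// runWithFreshTxns retries by discarding the transaction and creating a new one.
func runWithFreshTxns(newTxn func() *toyTxn, op func(*toyTxn) error, maxAttempts int) (*toyTxn, error) {
	var txn *toyTxn
	for attempt := 0; attempt < maxAttempts; attempt++ {
		txn = newTxn() // completely separate transaction on every attempt
		err := op(txn)
		if err == nil {
			return txn, nil
		}
		if !errors.Is(err, errRetry) {
			return txn, err
		}
	}
	return txn, errors.New("gave up after max attempts")
}

func main() {
	a, _ := runWithEpochs(txnFactory(), failNTimes(2), 5)
	b, _ := runWithFreshTxns(txnFactory(), failNTimes(2), 5)
	fmt.Printf("epochs: id=%d epoch=%d\n", a.id, a.epoch) // epochs: id=1 epoch=2
	fmt.Printf("fresh:  id=%d epoch=%d\n", b.id, b.epoch) // fresh:  id=3 epoch=0
}
```

After two retry errors, the epoch-style loop still holds transaction 1 (now at epoch 2), whereas the fresh-transaction loop is on its third unrelated transaction. In this toy model, that is the shape of the merge behavior described above: every retry starts over as a brand-new transaction.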

To reproduce:

make roachprod-stress PKG=./pkg/kv/kvnemesis TESTS=TestKVNemesisSingleNode TESTTIMEOUT=2m TESTFLAGS='-v -show-logs' STRESSFLAGS='-stderr -maxfails 1' CLUSTER=<cluster-name>

Metadata

Labels

A-kv-distribution: Relating to rebalancing and leasing.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
