roachtest: stability problems during tpcc-{5,10,20}k on {6,12,24} nodes #31409

@awoods187

Describe the problem

One node is dead and about 9k ranges are under-replicated while running tpcc.

To Reproduce

Used roachtest from 2 days ago, with the test modified to use partitioning and 6 nodes:

@@ -675,11 +675,12 @@ func registerTPCCBench(r *registry) {
                        // StoreDirVersion: "2.0-5",
                },
                {
-                       Nodes: 3,
+                       Nodes: 6,
                        CPUs:  16,
 
-                       LoadWarehouses: 2000,
-                       EstimatedMax:   1300,
+                       LoadWarehouses: 5000,
+                       EstimatedMax:   3000,
+                       LoadConfig:     singlePartitionedLoadgen,
                },

Ran:

bin/roachtest bench '^tpccbench/nodes=6/cpu=16/partition$$' --wipe=false --user=andy
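
For readers matching the command to the diff, here is a rough sketch of how a spec with `Nodes: 6`, `CPUs: 16`, and `LoadConfig: singlePartitionedLoadgen` could map to the name the filter selects. The `benchSpec` and `loadConfig` types and the `benchName` helper are hypothetical stand-ins that reuse the field names from the diff; the rule that a partitioned load config adds a `/partition` suffix is an assumption inferred from the command, not taken from roachtest's source. The sketch anchors the pattern with a single `$`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// loadConfig and benchSpec are hypothetical stand-ins that reuse the field
// names visible in the diff; they are not roachtest's real types.
type loadConfig int

const (
	singleLoadgen loadConfig = iota
	singlePartitionedLoadgen
)

type benchSpec struct {
	Nodes          int
	CPUs           int
	LoadWarehouses int
	EstimatedMax   int
	LoadConfig     loadConfig
}

// benchName builds a name in the shape the command filters on. The
// "/partition" suffix rule is an assumption inferred from that command.
func benchName(s benchSpec) string {
	parts := []string{
		"tpccbench",
		fmt.Sprintf("nodes=%d", s.Nodes),
		fmt.Sprintf("cpu=%d", s.CPUs),
	}
	if s.LoadConfig == singlePartitionedLoadgen {
		parts = append(parts, "partition")
	}
	return strings.Join(parts, "/")
}

func main() {
	// Values copied from the "+" lines of the diff.
	spec := benchSpec{
		Nodes:          6,
		CPUs:           16,
		LoadWarehouses: 5000,
		EstimatedMax:   3000,
		LoadConfig:     singlePartitionedLoadgen,
	}
	name := benchName(spec)

	// Same pattern as the command's filter, anchored with a single "$".
	filter := regexp.MustCompile(`^tpccbench/nodes=6/cpu=16/partition$`)
	fmt.Println(name, filter.MatchString(name))
}
```

Under these assumptions, the anchored pattern selects only the 6-node, 16-CPU partitioned variant and skips the unpartitioned spec with the same node and CPU counts.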

Expected behavior
No dead nodes

Additional data / screenshots
image

Environment:

  • CockroachDB version 2.1 Beta 1008

Dead node logs:
cockroach.log

Labels

  • C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
  • C-investigation: Further steps needed to qualify. C-label will change.
  • S-1-stability: Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
