On rho, upgrading from 5bc7bf1 to 7915024 (race-enabled builds) caused performance to plummet, from ~130 inserts per second to ~6. Raft elections are twice as frequent as before (about 10 MsgVote and 10 MsgVoteResp messages per second), and MsgProp is 5 times more common (~35 vs ~7).
Perhaps the biggest problem is cmdQMu contention. Messages like this are not uncommon:
cockroach@104.196.147.189: W161102 09:06:12.118239 147887 storage/replica.go:551 [n3,s3,r5973/2:/System/tsd/cr.node.txn.resta…] cmdQMu: mutex held by github.com/cockroachdb/cockroach/pkg/storage.(*Replica).beginCmds for 982.044622ms (>500ms):
and the 95th percentile mutex duration is 900µs, up from 20µs (other mutex metrics haven't changed significantly).