storage: SysBytes incorrect #20180

@tbg

Found investigating #20159.

It looks like our SysBytes are incorrect on every range I've checked (and I've pointed this at a few data directories from various clusters):

$ ./cockroach start --insecure --store freshdir
^CTRL+C
$ ./cockroach debug check-store freshdir

checking MVCC stats

r1:/{Min-System/} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 175238 != 512
SysCount: 403 != 9

r2:/System/{-NodeLiveness} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 459
SysCount: 0 != 8

r3:/System/NodeLiveness{-Max} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 490
SysCount: 0 != 8

r4:/System/{NodeLivenessMax-tsd} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 483
SysCount: 0 != 8

r5:/System/ts{d-e} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 461
SysCount: 0 != 8

r6:/{System/tse-Table/SystemConfigSpan/Start} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 658
SysCount: 0 != 11

r7:/Table/{SystemConfigSpan/Start-11} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 622
SysCount: 0 != 10

r8:/Table/1{1-2} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r9:/Table/1{2-3} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r10:/Table/1{3-4} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r11:/Table/1{4-5} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r12:/Table/1{5-6} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 582
SysCount: 0 != 10

r13:/Table/1{6-7} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 582
SysCount: 0 != 10

r14:/Table/1{7-8} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r15:/Table/1{8-9} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 515
SysCount: 0 != 9

r16:/Table/{19-20} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 582
SysCount: 0 != 10

r17:/{Table/20-Max} [(n1,s1):1, next=2]: diff(actual, claimed): SysBytes: 0 != 480
SysCount: 0 != 9

Error: check failed
Failed running "debug"

... or my corresponding code in #20181 is incorrect.

@cuongdo for triage.

While fixing this, we should incorporate this check into the consistency checker (which already reads all the data required to compute the stats).
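As a rough illustration of what such a check could look like, here is a minimal sketch in Go. It assumes the checker already has every key/value pair of the range in hand, recomputes the system-key stats from them, and diffs the result against the persisted ("claimed") stats in the same `actual != claimed` form as the output above. The `Stats` and `KV` types and both functions are hypothetical stand-ins, not CockroachDB's actual API, and the byte accounting is deliberately simplified (real MVCC stats also account for metadata and timestamps).

```go
package main

import "fmt"

// Stats is a hypothetical stand-in for the system-key part of a range's stats.
type Stats struct {
	SysBytes int64
	SysCount int64
}

// KV is a hypothetical key/value pair as seen by a full scan of the range.
type KV struct {
	Key   string
	Value []byte
	Sys   bool // assumption: the caller already knows which keys count as system keys
}

// recomputeSysStats rebuilds the system-key stats from scanned data.
// Simplified accounting: bytes = len(key) + len(value).
func recomputeSysStats(kvs []KV) Stats {
	var s Stats
	for _, kv := range kvs {
		if !kv.Sys {
			continue
		}
		s.SysBytes += int64(len(kv.Key) + len(kv.Value))
		s.SysCount++
	}
	return s
}

// diffStats returns one line per mismatching field, "actual != claimed".
func diffStats(actual, claimed Stats) []string {
	var out []string
	if actual.SysBytes != claimed.SysBytes {
		out = append(out, fmt.Sprintf("SysBytes: %d != %d", actual.SysBytes, claimed.SysBytes))
	}
	if actual.SysCount != claimed.SysCount {
		out = append(out, fmt.Sprintf("SysCount: %d != %d", actual.SysCount, claimed.SysCount))
	}
	return out
}

func main() {
	// Made-up data purely for illustration.
	kvs := []KV{
		{Key: "sys/a", Value: []byte("abc"), Sys: true},
		{Key: "user/b", Value: []byte("xyz"), Sys: false},
	}
	claimed := Stats{SysBytes: 512, SysCount: 9}
	for _, line := range diffStats(recomputeSysStats(kvs), claimed) {
		fmt.Println(line)
	}
}
```

Because the consistency checker already reads all of the range's data, a comparison along these lines would need no extra scan; an empty diff means the persisted stats match.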

Labels

S-1-stability: Severe stability issues that can be fixed by upgrading, but usually don’t resolve by restarting
