storage: fix (the semantics of) MVCCStats.GCBytesAge#21407
storage: fix (the semantics of) MVCCStats.GCBytesAge#21407tbg merged 1 commit intocockroachdb:masterfrom
Conversation
|
Reviewed 3 of 3 files at r1. pkg/storage/engine/mvcc.go, line 234 at r1 (raw file):
Remove this before merging? pkg/storage/engine/mvcc.go, line 255 at r1 (raw file):
If there are transactions and intents (which we handle below), there are age-able quantities. Specifically, the range-local copy of the range descriptor has historical revisions that need to be GC'd. Maybe it's small enough to ignore that for the purposes of GCBytesAge, but the comment should indicate that that's what it's doing. pkg/storage/engine/mvcc.go, line 413 at r1 (raw file):
Either delete this or put in a comment about why this disabled code is here. pkg/storage/engine/mvcc.go, line 446 at r1 (raw file):
Nice hyperlink :) pkg/storage/engine/mvcc.go, line 2666 at r1 (raw file):
s/deletionNanos/overwriteNanos/? This could also use a comment about how we're iterating in reverse chronological order, so this tracks the timestamp of a value for use in computing the GC timestamp of its predecessor. pkg/storage/engine/mvcc.go, line 2702 at r1 (raw file):
What's up with this comment? Comments from Reviewable |
|
Reviewed 1 of 3 files at r1. pkg/storage/engine/mvcc.go, line 227 at r1 (raw file):
It doesn't look like we need pkg/storage/engine/mvcc.go, line 405 at r1 (raw file):
Ditto Comments from Reviewable |
|
Review status: 0 of 4 files reviewed at latest revision, 8 unresolved discussions. pkg/storage/engine/mvcc.go, line 227 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We use prevValSize for arithmetic so I wasn't able to turn it into a bool, but we were not using pkg/storage/engine/mvcc.go, line 255 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I clarified that we just don't track them but could. pkg/storage/engine/mvcc.go, line 405 at r1 (raw file): Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done. pkg/storage/engine/mvcc.go, line 413 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done, 🤞 that we won't need it any more. pkg/storage/engine/mvcc.go, line 2666 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Added comment. Renamed to pkg/storage/engine/mvcc.go, line 2702 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
I toyed around with using Comments from Reviewable |
|
Review status: 0 of 4 files reviewed at latest revision, 7 unresolved discussions. pkg/storage/engine/mvcc.go, line 234 at r1 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. Comments from Reviewable |
a9aff1f to
e3977cb
Compare
|
Reviewed 4 of 4 files at r2. c-deps/libroach/db.cc, line 2361 at r2 (raw file):
The above line is c-deps/libroach/db.cc, line 2456 at r2 (raw file):
Maybe assert here that Comments from Reviewable |
The semantics for computing GCBytesAge were incorrect and are fixed in this commit. Prior to this commit, a non-live write would accrue GCBytesAge from its own timestamp on. That is, if you wrote two versions of a key at 1s and 2s, then when the older version is replaced (at 2s) it would start out with one second of age (from 1s to 2s). However, the key *really* became non-live at 2s, and should have had an age of zero. By extension, imagine a table with lots of writes all dating back to early 2017, and assume that today (early 2018) all these writes are deleted (i.e. a tombstone placed on top of them). Prior to this commit, each key would immediately get assigned an age of `(early 2018) - early 2017)`, i.e. a very large number. Yet, the GC queue could only garbage collect them after (early 2018) + TTL`, so by default 25 hours after the deletion. We use GCBytesAge to trigger the GC queue, so that would cause the GC queue to run without ever getting to remove anything, for the TTL. This was a big problem bound to be noticed by users. This commit changes the semantics to what the GCQueue (and the layman) expects: 1. when a version is shadowed, it becomes non-live at that point and also starts accruing GCBytesAge from that point on. 2. deletion tombstones are an exception: They accrue age from their own timestamp on. This makes sense because a tombstone can be deleted whenever it's older than the TTL (as opposed to a value, which can only be deleted once it's been *shadowed* for longer than the TTL). This work started out by updating `ComputeStatsGo` to have the desired semantics, fixing up existing tests, and then stress testing `TestMVCCStatsRandomized` with short history lengths to discover failure modes which were then transcribed into small unit tests. When no more such failures were discoverable, the resulting logic in the various incremental MVCCStats update helpers was simplified and documented, and `ComputeStats` updated accordingly. In turn, `TestMVCCStatsBasic` was removed: it was notoriously hard to read and maintain, and does not add any coverage at this point. The recomputation of the stats in existing clusters is addressed in cockroachdb#21345. Fixes cockroachdb#20554. Release note (bug fix): fix a problem that could cause spurious GC activity, in particular after dropping a table.
|
Review status: 3 of 4 files reviewed at latest revision, 4 unresolved discussions. c-deps/libroach/db.cc, line 2361 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Nope, just wanted to write c-deps/libroach/db.cc, line 2456 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. Comments from Reviewable |
The semantics for computing GCBytesAge were incorrect and are fixed in
this commit. Prior to this commit, a non-live write would accrue
GCBytesAge from its own timestamp on. That is, if you wrote two versions
of a key at 1s and 2s, then when the older version is replaced (at 2s)
it would start out with one second of age (from 1s to 2s). However, the
key really became non-live at 2s, and should have had an age of zero.
By extension, imagine a table with lots of writes all dating back to
early 2017, and assume that today (early 2018) all these writes are
deleted (i.e. a tombstone placed on top of them). Prior to this commit,
each key would immediately get assigned an age of
(early 2018) - early 2017), i.e. a very large number. Yet, the GC queue could only garbagecollect them after (early 2018) + TTL`, so by default 25 hours after
the deletion. We use GCBytesAge to trigger the GC queue, so that would
cause the GC queue to run without ever getting to remove anything, for
the TTL. This was a big problem bound to be noticed by users.
This commit changes the semantics to what the GCQueue (and the layman)
expects:
also starts accruing GCBytesAge from that point on.
own timestamp on. This makes sense because a tombstone can be
deleted whenever it's older than the TTL (as opposed to a value,
which can only be deleted once it's been shadowed for longer
than the TTL).
This work started out by updating
ComputeStatsGoto have the desiredsemantics, fixing up existing tests, and then stress testing
TestMVCCStatsRandomizedwith short history lengths to discover failuremodes which were then transcribed into small unit tests. When no more
such failures were discoverable, the resulting logic in the various
incremental MVCCStats update helpers was simplified and documented, and
ComputeStatsupdated accordingly. In turn,TestMVCCStatsBasicwasremoved: it was notoriously hard to read and maintain, and does not add
any coverage at this point.
The recomputation of the stats in existing clusters is addressed in
#21345.
Fixes #20554.