cli: check MVCCStats in debug check-store#20181
Merged
tbg merged 1 commit intocockroachdb:masterfrom Jan 3, 2018
Merged
Conversation
Member
Collaborator
|
I don't see anything obviously wrong with the code, but I'm very suspicious there is something. Don't we check on disk stats vs computed stats elsewhere (e.g. during splits)? Review status: 0 of 1 files reviewed at latest revision, all discussions resolved, all commit checks successful. Comments from Reviewable |
Member
Author
|
@petermattis no. I hoped that this would be fixed by #21070 but it's still happening. I'm working on making the replica GC queue check this (which I need anyway to trigger repair now that the stats are at least somewhat fixed, and more fixes are planned). |
d33d739 to
1225a59
Compare
Unfortunately, this discovered cockroachdb#20180: The stats essentially never match (or this PR has a bug). Release note: None
1225a59 to
86faeea
Compare
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Dec 29, 2017
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration that rewrites all legacy keys early in the node start sequence; as a result, a node running this commit or later will never see any legacy keys, except when they come in through a snapshot - when applying a snapshot, forcibly delete any legacy tombstone contained within. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.0, as at that point all peers have version at least v2.0 (which has this commit and will never send legacy tombstones in snapshots). Since sending legacy keys in snapshots was never intended (and isn't relied upon), none of this needs a cluster version migration. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Dec 29, 2017
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration that rewrites all legacy keys early in the node start sequence; as a result, a node running this commit or later will never see any legacy keys, except when they come in through a snapshot - when applying a snapshot, forcibly delete any legacy tombstone contained within. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.0, as at that point all peers have version at least v2.0 (which has this commit and will never send legacy tombstones in snapshots). Since sending legacy keys in snapshots was never intended (and isn't relied upon), none of this needs a cluster version migration. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Dec 29, 2017
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration that rewrites all legacy keys early in the node start sequence; as a result, a node running this commit or later will never see any legacy keys, except when they come in through a snapshot - when applying a snapshot, forcibly delete any legacy tombstone contained within. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.0, as at that point all peers have version at least v2.0 (which has this commit and will never send legacy tombstones in snapshots). Since sending legacy keys in snapshots was never intended (and isn't relied upon), none of this needs a cluster version migration. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Dec 30, 2017
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration that rewrites all legacy keys early in the node start sequence; as a result, a node running this commit or later will never see any legacy keys, except when they come in through a snapshot - when applying a snapshot, forcibly delete any legacy tombstone contained within. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.0, as at that point all peers have version at least v2.0 (which has this commit and will never send legacy tombstones in snapshots). Since sending legacy keys in snapshots was never intended (and isn't relied upon), none of this needs a cluster version migration. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jan 3, 2018
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration that rewrites all legacy keys, to be activated when the cluster version surpasses the newly introduced version. Until then, both tombstones are checked, but only the old one written. - when applying a snapshot, forcibly delete any legacy tombstone contained within (even before the cluster version is bumped). `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.1, as at that point all peers have booted with a version that runs the migration (it's morally true in `v2.0` as well, but it's possible that the cluster version update would interleave with a write of a legacy tombstone). In v2.0, there may be both legacy and old-style tombstones, but they'll never proliferate through snapshots. This implies that if a tombstone is left around from before the `v2.0` migration runs, then it is in a replica that since hasn't been recreated (for if it had, the tombstone would have been wiped), and thus it can't trip up the consistency checker. Thus, post v2.0, the replica consistency checker can stop skipping the legacy tombstone key. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jan 8, 2018
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration. Once activated, only the new tombstone is written, but both tombstones are checked. Additionally, as the node restarts, all legacy tombstones are replaced by correct non-legacy ones. - when applying a snapshot, forcibly delete any legacy tombstone contained within (even before the cluster version is bumped). This prevents new legacy tombstones to trickle on from other nodes. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.1, as at that point all peers have booted with a version that runs the migration. Thus, post v2.1, the replica consistency checker can stop skipping the legacy tombstone key. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
tbg
added a commit
to tbg/cockroach
that referenced
this pull request
Jan 9, 2018
RaftTombstoneKey was accidentally made a replicated key when it was first introduced, a problem we first realized existed when it was [included in snapshots]. At the time, we included workarounds to skip this key in various places (snapshot application, consistency checker) but of course we have failed to insert further hacks of the same kind elsewhere since (the one that prompting this PR being the stats recomputation on splits, which I'm looking into as part of cockroachdb#20181 -- unfortunately this commit doesn't seem to pertain to that problem) It feels sloppy that we didn't follow through back then, but luckily the damage appears to be limited; it is likely that the replicated existence of this key results in MVCCStats SysBytes inconsistencies, but as it happens, these stats are [already] [very] [inconsistent]. This commit does a few things: - renames the old tombstone key to `RaftIncorrectLegacyTombstoneKey` - introduces a (correctly unreplicated) `RaftTombstoneKey` - introduces a migration. Once activated, only the new tombstone is written, but both tombstones are checked. Additionally, as the node restarts, all legacy tombstones are replaced by correct non-legacy ones. - when applying a snapshot, forcibly delete any legacy tombstone contained within (even before the cluster version is bumped). This prevents new legacy tombstones to trickle on from other nodes. `RaftIncorrectLegacyTombstoneKey` can be purged from the codebase in binaries post v2.1, as at that point all peers have booted with a version that runs the migration. Thus, post v2.1, the replica consistency checker can stop skipping the legacy tombstone key. Fixes cockroachdb#12154. Release note: None [included in snapshots]: cockroachdb#12131 [already]: cockroachdb#20554 [very]: cockroachdb#20996 [inconsistent]: cockroachdb#21070
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Unfortunately, this discovered #20180: The stats essentially never match (or this PR has a bug).