kvprober: crdb_internal.probe_ranges writes to global keyspace, will corrupt tables #101549

@nvb

kvprober derives target probe keys from range split points by scanning meta2. It is careful to wrap the split points using keys.RangeProbeKey. This constructs a probe key that is unique (only used by kvprober) and within the local keyspace, ensuring isolation from other workloads.

	if rangeDesc.RangeID == 1 {
		plans[i].Key = keys.RangeProbeKey(keys.MustAddr(keys.LocalMax))
	} else {
		plans[i].Key = keys.RangeProbeKey(rangeDesc.StartKey)
	}

crdb_internal.probe_ranges() is a builtin function that manually performs kvprobes across the cluster. It was intended to work the same way. However, it is missing the keys.RangeProbeKey wrapping over range split points. As a result, it probes directly into the global keyspace.

	key := desc.StartKey.AsRawKey()
	if desc.RangeID == 1 {
		// The first range starts at KeyMin, but the replicated keyspace starts only at keys.LocalMax,
		// so there is a special case here.
		key = keys.LocalMax
	}
	return p.rangeProber.RunProbe(ctx, key, p.isWrite)

	op := r.ops.Read
	if isWrite {
		op = r.ops.Write
	}
	// NB: intentionally using a separate txn per probe to avoid undesirable cross-probe effects.
	return r.db.Txn(ctx, op(key))
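
To make the difference between the two code paths concrete, here is a minimal, self-contained sketch. It does not use the real keys package: the byte values for the local prefix and LocalMax, the "prbe" suffix, the helper names toyRangeProbeKey and isGlobal, and the example start key are all assumptions chosen for illustration, and the real key encoding is more involved.

```go
package main

import (
	"bytes"
	"fmt"
)

// Toy stand-ins for the keyspace layout. The byte values and the "prbe"
// suffix are assumptions for illustration, not the real encoding.
var (
	localPrefix = []byte{0x01}   // assumed prefix of the local keyspace
	localMax    = []byte{0x02}   // assumed first key of the global keyspace
	probeSuffix = []byte("prbe") // hypothetical probe-key suffix
)

// toyRangeProbeKey plays the role of keys.RangeProbeKey in this sketch: it
// maps a range start key to a dedicated key inside the local keyspace.
func toyRangeProbeKey(startKey []byte) []byte {
	k := append([]byte{}, localPrefix...)
	k = append(k, startKey...)
	return append(k, probeSuffix...)
}

// isGlobal reports whether k sorts at or above localMax.
func isGlobal(k []byte) bool {
	return bytes.Compare(k, localMax) >= 0
}

func main() {
	startKey := []byte("/Table/100/1") // hypothetical range start key

	raw := startKey                       // what the builtin probes today
	wrapped := toyRangeProbeKey(startKey) // what kvprober probes

	fmt.Printf("raw key is global:     %t\n", isGlobal(raw))
	fmt.Printf("wrapped key is global: %t\n", isGlobal(wrapped))
}
```

In this toy model the raw start key sorts into the global keyspace while the wrapped key stays below localMax, which is the isolation property the builtin is missing.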

The consequence of this is that write probes issued through crdb_internal.probe_ranges will place MVCC deletion tombstones at the start key of each range in the system. In some ranges (tsdb), this corruption will trigger assertion failures later on (#101440). In others (user tables), this will silently corrupt user data.

We need to fix this. When doing so, we should also add guardrails to prevent the kvprober module from touching the global keyspace.

Jira issue: CRDB-27004

Metadata

Labels

A-sql-builtins: SQL built-in functions and semantics thereof.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-0-corruption-or-data-loss: Unrecoverable corruption, data loss, or other catastrophic issues that can’t be fixed by upgrading.
branch-master: Failures and bugs on the master branch.
branch-release-22.1: Used to mark GA and release blockers, technical advisories, and bugs for 22.1.
branch-release-22.2: Used to mark GA and release blockers, technical advisories, and bugs for 22.2.
branch-release-23.1: Used to mark GA and release blockers, technical advisories, and bugs for 23.1.
release-blocker: Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked.
v23.1.2
