kvprober: implement "shadow write" probes

**Is your feature request related to a problem? Please describe.**
We have a `kvprober` that sends point read requests to "random" ranges. We should extend that prober to also test whether a range is available for writes. We can call this a "shadow write".

**Describe the solution you'd like**
Strawman proposal:

1. Implement a raft command called `Probe` / `ShadowWrite` and make it available via the `kvclient` public API.
2. The MVP implementation of the command does nothing.
3. Extend `kvprober` to make `Probe` / `ShadowWrite` requests to "random" ranges.

This tests `kv` reasonably well: the `Probe` / `ShadowWrite` command needs to get proposed, agreed upon, applied, etc. (Am I using these words correctly?) A write to the raft log will happen, so the availability of the disk is checked.

The test of `pebble` is minimal, as no actual write happens at `Probe` command apply time. Note though that we could change this in future CRDB versions. One can imagine writing to pebble in a way that doesn't lead to user-visible side effects, in order to make the probe more realistic (that is, to match the actual CRDB write codepath more closely).
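
The "logged but no-op at apply time" idea can be illustrated with a toy model. This is a sketch only: `toyReplica`, `Command`, and `CommandKind` are hypothetical names, there is a single node and no consensus, and an in-memory slice and map stand in for the raft log and for pebble.

```go
package main

import "fmt"

// CommandKind distinguishes a regular write from the proposed probe.
type CommandKind int

const (
	Put CommandKind = iota
	Probe
)

// Command is a toy log entry.
type Command struct {
	Kind       CommandKind
	Key, Value string
}

// toyReplica is a single-node stand-in for a replica: an append-only log
// plus a key-value state machine. It does not implement consensus.
type toyReplica struct {
	log     []Command
	applied int // index of the next log entry to apply
	state   map[string]string
}

func newToyReplica() *toyReplica {
	return &toyReplica{state: map[string]string{}}
}

// propose appends a command to the log. In the real system this is the step
// that would exercise the raft log and therefore the disk.
func (r *toyReplica) propose(c Command) {
	r.log = append(r.log, c)
}

// applyAll applies every unapplied entry. A Probe advances the applied
// index like any other entry but leaves the state machine untouched.
func (r *toyReplica) applyAll() {
	for ; r.applied < len(r.log); r.applied++ {
		c := r.log[r.applied]
		switch c.Kind {
		case Put:
			r.state[c.Key] = c.Value
		case Probe:
			// Intentionally a no-op: no user-visible side effects.
		}
	}
}

func main() {
	r := newToyReplica()
	r.propose(Command{Kind: Put, Key: "a", Value: "1"})
	r.propose(Command{Kind: Probe})
	r.applyAll()
	fmt.Println(len(r.log), r.applied, len(r.state)) // prints: 2 2 1
}
```

The probe occupies a log slot and moves the applied index forward, so it exercises the log path, but the state map only reflects the `Put`.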

CC @tbg @andreimatei @knz @bdarnell @jreut @logston for review of the strawman proposal. I hope for a naming bikeshed.

Also, KV folks: How hard of a time do you think I will have implementing this? It's hard for me to scope the part that adds the `Probe` / `ShadowWrite` command. My sense from talking with Ben a while back is that it's not technically hard, but it involves lots of boilerplate, and since a new command hasn't been added in a while, it may be tricky to figure out all the places to make changes.

**Describe alternatives you've considered**
- We should also implement the stuck applied index + failing probe alert, which has a faster mean time to detect, so long as the symptom experienced is a stuck applied index. I can't find a link to an issue for that, but it has been discussed.
- We should consider other approaches similar to the above, where some suspect internal detail (such as a stuck applied index) leads us to probe a specific range, leading to a faster mean time to detect.

These aren't really alternatives, though. Blackbox approaches like this one are complemented by whitebox approaches.

**Additional context**
https://github.com/cockroachdb/cockroach/issues/61074
https://github.com/cockroachdb/cockroach/blob/master/pkg/kv/kvprober/kvprober.go


Epic CC-4054
