c2c: add fingerprinting for internal testing #89336

@adityamaru

Description

The following technical approach was replaced with:

...but the goal is the same.

Previous plan

crdb_internal.scan(crdb_internal.tenant_span($1)) returns the raw keys and values from the specified span. We would like to extend this generator to return timestamp-ordered revisions of each key. While this is useful as a standalone tool, the motivation behind this change is to drive the on-demand fingerprinting that we are developing for c2c replication. At a high level, each processor will execute this primitive on a pre-defined chunk of spans, feed the output to a checksum algorithm, and send the resulting checksum downstream in the DistSQL flow.
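To make the per-chunk step concrete, here is a minimal sketch of fingerprinting a chunk of revisions. It is an illustration only, not the planned implementation: the `Revision` type, the `FingerprintChunk` function, the choice of FNV-1a, and the use of a plain `int64` in place of an HLC timestamp are all assumptions made for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sort"
)

// Revision is a hypothetical stand-in for one MVCC revision of a key.
// Timestamp is a plain int64 here; a real HLC timestamp has more fields.
type Revision struct {
	Key       []byte
	Timestamp int64
	Value     []byte
}

// FingerprintChunk returns a checksum over all revisions in a chunk.
// Revisions are ordered by key ascending, then timestamp descending
// (newest first), so the result does not depend on the input order.
// FNV-1a is an arbitrary choice for this sketch.
func FingerprintChunk(revs []Revision) uint64 {
	// Sort a copy so the caller's slice is left untouched.
	ordered := append([]Revision(nil), revs...)
	sort.SliceStable(ordered, func(i, j int) bool {
		if c := bytes.Compare(ordered[i].Key, ordered[j].Key); c != 0 {
			return c < 0
		}
		return ordered[i].Timestamp > ordered[j].Timestamp
	})

	h := fnv.New64a()
	var buf [8]byte
	// Length-prefix variable-length fields so that ("ab","c") and
	// ("a","bc") do not hash to the same byte stream.
	writeField := func(b []byte) {
		binary.BigEndian.PutUint64(buf[:], uint64(len(b)))
		h.Write(buf[:])
		h.Write(b)
	}
	for _, r := range ordered {
		writeField(r.Key)
		binary.BigEndian.PutUint64(buf[:], uint64(r.Timestamp))
		h.Write(buf[:])
		writeField(r.Value)
	}
	return h.Sum64()
}

func main() {
	chunk := []Revision{
		{Key: []byte("a"), Timestamp: 2, Value: []byte("v2")},
		{Key: []byte("a"), Timestamp: 1, Value: []byte("v1")},
		{Key: []byte("b"), Timestamp: 1, Value: []byte("x")},
	}
	fmt.Printf("fingerprint: %016x\n", FingerprintChunk(chunk))
}
```

In this sketch each processor would call `FingerprintChunk` on its own chunk and emit only the resulting `uint64` downstream; how per-chunk checksums are combined is left open here.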

As part of this issue, we should investigate whether we can add a mode to ExportRequest (a KV request that is already capable of reading and returning timestamp-ordered revisions of keys) that does not write these keys to an SST but instead returns them in a kvBuf slice that we can read from. We should benchmark and trace ExportRequest to identify which parts of the path used during backup are "slow" or "unnecessary" when the goal is simply to read and return the revisions of all rows within a given time bound.
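One way to picture such a mode is to put the destination of the exported revisions behind an interface, so the same read loop can feed either an SST writer or an in-memory buffer. The sketch below is hypothetical throughout: `MVCCKeyValue`, `RevisionSink`, `MemBuf`, and `ExportRevisions` are names invented for illustration and are not the existing ExportRequest or kvBuf APIs. It also assumes the time bound is exclusive at the start and inclusive at the end.

```go
package main

import "fmt"

// MVCCKeyValue is a hypothetical, simplified revision of a key.
type MVCCKeyValue struct {
	Key       string
	Timestamp int64
	Value     []byte
}

// RevisionSink is a hypothetical abstraction over where exported
// revisions go: an SST writer during backup, or memory for fingerprinting.
type RevisionSink interface {
	Add(kv MVCCKeyValue) error
}

// MemBuf is an in-memory sink that keeps revisions in arrival order.
type MemBuf struct {
	KVs   []MVCCKeyValue
	Bytes int // approximate payload size of buffered keys and values
}

// Add appends the revision and updates the size estimate.
func (b *MemBuf) Add(kv MVCCKeyValue) error {
	b.KVs = append(b.KVs, kv)
	b.Bytes += len(kv.Key) + len(kv.Value)
	return nil
}

// ExportRevisions copies every revision with start < Timestamp <= end
// from src into sink. src is assumed to already be in timestamp order
// per key, so the sink receives revisions in that same order.
func ExportRevisions(src []MVCCKeyValue, start, end int64, sink RevisionSink) error {
	for _, kv := range src {
		if kv.Timestamp <= start || kv.Timestamp > end {
			continue
		}
		if err := sink.Add(kv); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	src := []MVCCKeyValue{
		{Key: "a", Timestamp: 5, Value: []byte("y")},
		{Key: "a", Timestamp: 1, Value: []byte("x")},
		{Key: "b", Timestamp: 10, Value: []byte("z")},
	}
	buf := &MemBuf{}
	if err := ExportRevisions(src, 1, 10, buf); err != nil {
		panic(err)
	}
	fmt.Println(len(buf.KVs), "revisions buffered,", buf.Bytes, "bytes")
}
```

With this shape, benchmarking the in-memory sink against an SST-writing sink on the same input would help isolate how much of the cost comes from SST construction rather than from reading the revisions.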

Epic: CRDB-21075

Jira issue: CRDB-20208
