replay: add facilities to replay a workload #2056

@jbowens

As a part of #1865, once we've collected a workload (#2050) we need facilities for replaying the workload in Pebble.

Replay should begin from the initial database state recorded by the workload. The workload's collected MANIFESTs provide a timeline of the changes to the database over the course of the original workload's execution. The replay tool should step through all of the manifests' version edits in order. Some prefix of the collected MANIFESTs may have already been applied in the initial database checkpoint. The replay tool should skip this prefix and begin replaying after the last version edit recorded in the initial database state.
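A minimal sketch of the prefix-skipping step, not tied to Pebble's actual manifest types. `VersionEdit`, its `Index` field, and `pendingEdits` are hypothetical names, and identifying the checkpoint's last applied edit by a position in the timeline is an assumption made only for illustration.

```go
package main

import "fmt"

// VersionEdit is a hypothetical, simplified stand-in for a MANIFEST version
// edit. Index is the edit's position in the collected MANIFEST timeline.
type VersionEdit struct {
	Index int
	Desc  string
}

// pendingEdits returns the suffix of edits that still needs replaying.
// lastApplied is the index of the last edit already reflected in the initial
// checkpoint; pass -1 if the checkpoint predates every collected edit.
// edits is assumed to be sorted by Index.
func pendingEdits(edits []VersionEdit, lastApplied int) []VersionEdit {
	for i, e := range edits {
		if e.Index > lastApplied {
			return edits[i:]
		}
	}
	return nil
}

func main() {
	edits := []VersionEdit{
		{Index: 0, Desc: "flush"},
		{Index: 1, Desc: "compaction"},
		{Index: 2, Desc: "ingest"},
		{Index: 3, Desc: "flush"},
	}
	// The checkpoint already contains edits 0 and 1, so replay starts at 2.
	for _, e := range pendingEdits(edits, 1) {
		fmt.Println(e.Index, e.Desc)
	}
}
```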

During replay, any version edits that don't correspond to a flush or ingest can be skipped.

Ingests: To replay an ingest, the replay tool should re-ingest the referenced sstables, which must have been collected as a part of workload capture.
Flushes: To replay a flush, the replay tool should read all of the flushed sstables, copy their keys into a new batch, commit the batch, and force a flush with DB.Flush.

Pacing should be generalized so that we can implement several pacing strategies.
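One possible shape for generalized pacing is a small interface that the replay loop consults before each step. This is only a sketch under assumptions: `Pacer`, `Unpaced`, `FixedInterval`, `run`, and `noSleep` are hypothetical names, and a realistic strategy would probably need more inputs than the step number (for example, the state of the database).

```go
package main

import (
	"fmt"
	"time"
)

// Pacer decides how long to wait before applying the given replay step.
// The interface and its signature are assumptions of this sketch.
type Pacer interface {
	Delay(step int) time.Duration
}

// Unpaced applies steps as fast as possible.
type Unpaced struct{}

func (Unpaced) Delay(int) time.Duration { return 0 }

// FixedInterval waits a constant duration before every step.
type FixedInterval struct{ Interval time.Duration }

func (f FixedInterval) Delay(int) time.Duration { return f.Interval }

// noSleep is a stand-in sleeper for dry runs that should not block.
func noSleep(time.Duration) {}

// run drives steps replay steps, pausing as directed by p, and returns the
// total requested delay. sleep is injected so callers can avoid real sleeping.
func run(steps int, p Pacer, sleep func(time.Duration)) time.Duration {
	var total time.Duration
	for i := 0; i < steps; i++ {
		if d := p.Delay(i); d > 0 {
			sleep(d)
			total += d
		}
		// Apply replay step i here (flush or ingest).
	}
	return total
}

func main() {
	total := run(3, FixedInterval{Interval: 10 * time.Millisecond}, time.Sleep)
	fmt.Println(total)
}
```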
