[C#] Checkpoint/Commit Critical Path Optimization #584

@tli2

Currently, for both FasterLog and Faster-KV checkpoints, the system flushes all content first and only then writes commit metadata to a separate file. The source of truth for a commit is the existence of written metadata, so metadata can only be written after all committed data is flushed. This takes two I/O roundtrips that cannot be parallelized; on high-latency storage (e.g., Azure Blobs) this is a big issue.

To avoid this, the proposal is to write special commit records into the log itself to mark committed content. A commit is declared once all data and the commit record are on disk, and the commit metadata is written off the critical path. On recovery, we find the most recent persisted commit metadata and scan forward from it to see whether the log contains a more recent commit record. This approach eliminates the second critical-path I/O roundtrip; recovery is slower only when a crash happens to occur between the two flushes of the commit being recovered, i.e., after the commit record is on disk but before its metadata is written.
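A minimal sketch of the proposed protocol, as a toy in-memory model in Python rather than FASTER's actual C# code. All names here (`Device`, `Log`, `write_metadata`, `recover`) are hypothetical and chosen for illustration. It assumes a commit record stores the log length it covers and that any record found on the device is intact; a real implementation would presumably also need a checksum or similar check to reject torn commit records.

```python
from dataclasses import dataclass, field

DATA, COMMIT = "data", "commit"


@dataclass
class Device:
    """Toy storage: holds only what has actually been flushed."""
    log: list = field(default_factory=list)  # persisted log records
    metadata: int = 0                         # persisted commit address (0 = none)
    critical_path_flushes: int = 0


class Log:
    def __init__(self, device):
        self.device = device
        self.buffer = []  # appended but not yet flushed

    def append(self, payload):
        self.buffer.append((DATA, payload))

    def commit(self):
        # The commit record covers every record up to and including itself.
        commit_address = len(self.device.log) + len(self.buffer) + 1
        self.buffer.append((COMMIT, commit_address))
        # One flush carries the data and the commit record together,
        # so the commit is durable after a single roundtrip.
        self.device.log.extend(self.buffer)
        self.buffer.clear()
        self.device.critical_path_flushes += 1
        return commit_address

    def write_metadata(self, commit_address):
        # Off the critical path: may lag behind, or be lost in a crash.
        self.device.metadata = max(self.device.metadata, commit_address)


def recover(device):
    """Start at the persisted metadata, then scan forward for newer commit records."""
    committed = device.metadata
    for address in range(committed, len(device.log)):
        kind, value = device.log[address]
        if kind == COMMIT:
            committed = value
    return committed  # records at or beyond this address are uncommitted
```

If the process crashes after `commit()` returns but before `write_metadata()` runs, `recover()` still finds the newer commit by scanning the log tail past the stale metadata. That scan is the extra recovery cost mentioned above.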
