Description
Currently, for both FasterLog and FasterKV checkpoints, the system flushes all content first and only then writes commit metadata to a separate file. The source of truth for a commit is the existence of written metadata, so metadata can only be written after all committed data is flushed. This takes two I/O roundtrips that cannot be parallelized; on high-latency storage (e.g., Azure Blobs) this is a big issue.
To avoid this, the proposal is to use special commit records in the log itself to denote committed content. A commit is declared once all data plus the commit record is on disk, and commit metadata is written off the critical path. On recovery, we find the most recent persisted commit metadata and scan forward in the log to see if we can find a more recent commit record. This approach eliminates the second critical-path I/O roundtrip. It lowers recovery performance only if a crash happens to occur between the two flushes of the commit being recovered (the data plus commit record, and the off-path metadata write); in that case recovery has to scan forward from older metadata to reach the latest commit record.
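The recovery scan described above can be sketched roughly as follows. This is only an illustration in Python, not FASTER code: the `Record` type, the `is_commit` flag, and the `recover_commit_point` function are all hypothetical names, and the log is modeled as an in-memory list where an index stands in for a log address.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    payload: bytes
    # Hypothetical flag: True for a special in-log commit record.
    is_commit: bool = False


def recover_commit_point(log: List[Record], metadata_address: int) -> int:
    """Return the exclusive log address up to which content is committed.

    `log` holds the records that actually reached disk before the crash.
    `metadata_address` is the commit point stored in the most recent
    persisted commit metadata (assumed to lag behind the log).
    """
    committed_until = metadata_address
    # Scan forward from the metadata's commit point, looking for newer
    # commit records that were flushed together with their data.
    for address in range(metadata_address, len(log)):
        if log[address].is_commit:
            # Everything up to and including this record is committed.
            committed_until = address + 1
    # Records after the last commit record are uncommitted and ignored.
    return committed_until
```

For a log with commit records at addresses 2 and 4 and metadata that only covers the first commit (address 3), the scan returns 5; a trailing data record after the last commit record is treated as uncommitted.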