Translog Optimisation: Upload .ckp file data as object metadata of .tlog file #13022

@skumawat2025

Description

Is your feature request related to a problem? Please describe

In the current implementation of the remote translog store, we upload three files to the remote store for every translog generation: the translog checkpoint (.ckp) file, the translog (.tlog) file, and the translog metadata file.

  1. translog.tlog - Contains the actual data and operations.
  2. translog.ckp - Contains the checkpoint information.
  3. translog metadata file - Contains generation metadata (readers that have not been committed) and the generation-to-primary-term mapping.

At each configured buffer interval we upload two files, translog-.tlog and translog-.ckp, in parallel. After both uploads succeed, we prepare and upload the latest metadata file. In total, we upload 3 files to the remote store per transfer for each shard. This increases the request count and rate during upload and download flows such as translog sync, failover, and peer recovery, which contributes to a higher TPS load on the remote store.

<Base Path>/
    <Index UUID>/
        <ShardID>/
            metadata/
                metadata__<primaryTerm>__<generation>__<createdTimestamp>__<nodeId>__<CURRENT_VERSION>
            data/
                <primaryTerm>/
                    translog-<gen>.ckp
                    translog-<gen>.tlog
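
The layout above can be sketched as a small helper that lists the objects written for one generation. This is only an illustration: the function name, its parameters, and the sample values are hypothetical, and the metadata file name is passed in as an opaque string rather than reconstructed here.

```python
def current_layout_keys(base_path, index_uuid, shard_id,
                        primary_term, generation, metadata_file_name):
    """Return the remote object keys written for one translog generation
    under the current layout (hypothetical helper, not OpenSearch code)."""
    shard_root = f"{base_path}/{index_uuid}/{shard_id}"
    data_dir = f"{shard_root}/data/{primary_term}"
    return [
        f"{data_dir}/translog-{generation}.tlog",       # operations
        f"{data_dir}/translog-{generation}.ckp",        # checkpoint
        f"{shard_root}/metadata/{metadata_file_name}",  # translog metadata
    ]


# Sample values are made up for illustration.
keys = current_layout_keys("base", "idx-uuid", 0, 1, 42, "metadata-file")
for key in keys:
    print(key)
```

Each key corresponds to one upload request, so a single transfer costs three requests per shard.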

Describe the solution you'd like

We can store the checkpoint file's contents as object metadata on the .tlog blob uploaded to the remote store. Repository providers such as S3, Azure, and GCP support object metadata up to a size limit. As long as the checkpoint file stays under that limit, we can store the .ckp data in the .tlog blob's object metadata. With this approach we reduce the number of upload requests to the remote store by 33% (from three files per transfer to two).

  • Azure - The metadata consists of name/value pairs. The total size of all metadata pairs can be up to 8KB in size. [REF]
  • S3 - Within the PUT request header, the user-defined metadata is limited to 2 KB in size. The size of user-defined metadata is measured by taking the sum of the number of bytes in the UTF-8 encoding of each key and value. [REF]
  • Google Cloud - The metadata size limit for Google Cloud Storage is 8KB. [REF]
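
A minimal sketch of the size check implied by these limits follows. It makes several assumptions that are not in the proposal: the checkpoint bytes are base64-encoded so they are safe to send as a header value, the metadata key `ckp-data` is a hypothetical name, "KB" is read as 1024 bytes, and the S3 way of measuring size (sum of UTF-8 bytes of keys and values) is applied to all three providers as an approximation.

```python
import base64

# Limits as quoted in the list above, reading "KB" as 1024 bytes (assumption).
PROVIDER_LIMITS_BYTES = {"s3": 2 * 1024, "azure": 8 * 1024, "gcs": 8 * 1024}

CKP_METADATA_KEY = "ckp-data"  # hypothetical metadata key name


def checkpoint_to_metadata(ckp_bytes):
    """Encode raw checkpoint bytes as a single ASCII metadata pair."""
    return {CKP_METADATA_KEY: base64.b64encode(ckp_bytes).decode("ascii")}


def metadata_size(metadata):
    """Sum of the UTF-8 byte lengths of every key and value."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in metadata.items())


def fits(ckp_bytes, provider):
    """True if the encoded checkpoint stays within the provider's limit."""
    size = metadata_size(checkpoint_to_metadata(ckp_bytes))
    return size <= PROVIDER_LIMITS_BYTES[provider]


# Example with placeholder bytes; real checkpoint contents are not modeled.
print(fits(b"\x00" * 100, "s3"))
```

Base64 inflates the payload by roughly a third, so under these assumptions the strictest limit (S3) leaves room for about 1.5 KB of raw checkpoint data.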

<Base Path>/
    <Index UUID>/
        <ShardID>/
            metadata/
                metadata__<primaryTerm>__<generation>__<createdTimestamp>__<nodeId>__<CURRENT_VERSION>
            data/
                <primaryTerm>/
                    translog-<gen>.tlog (with translog.ckp file as object metadata).
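
The proposed flow can be illustrated with an in-memory stand-in for the remote store. Everything here is a sketch under assumptions: `InMemoryBlobStore`, `upload_generation`, `download_generation`, and the `ckp-data` key are hypothetical names, the checkpoint is base64-encoded into the metadata, and the store is assumed to return user metadata together with the object on read.

```python
import base64

CKP_METADATA_KEY = "ckp-data"  # hypothetical metadata key name


class InMemoryBlobStore:
    """Toy stand-in for a remote store that supports object metadata."""

    def __init__(self):
        self.objects = {}
        self.put_requests = 0

    def put(self, key, data, metadata=None):
        self.put_requests += 1
        self.objects[key] = (bytes(data), dict(metadata or {}))

    def get(self, key):
        return self.objects[key]


def upload_generation(store, data_dir, generation,
                      tlog_bytes, ckp_bytes, metadata_key, metadata_bytes):
    """Upload the .tlog with the checkpoint attached, then the metadata file."""
    tlog_key = f"{data_dir}/translog-{generation}.tlog"
    encoded_ckp = base64.b64encode(ckp_bytes).decode("ascii")
    store.put(tlog_key, tlog_bytes, {CKP_METADATA_KEY: encoded_ckp})
    store.put(metadata_key, metadata_bytes)
    return tlog_key


def download_generation(store, tlog_key):
    """Read the .tlog and rebuild the checkpoint bytes from its metadata."""
    tlog_bytes, metadata = store.get(tlog_key)
    return tlog_bytes, base64.b64decode(metadata[CKP_METADATA_KEY])


store = InMemoryBlobStore()
key = upload_generation(store, "base/idx-uuid/0/data/1", 42,
                        b"ops", b"checkpoint",
                        "base/idx-uuid/0/metadata/metadata-file", b"meta")
tlog, ckp = download_generation(store, key)
reduction = (3 - store.put_requests) / 3  # versus three uploads today
print(store.put_requests, f"{reduction:.0%}")
```

The transfer now issues two upload requests instead of three, which is where the 33% figure comes from, and the download path recovers the checkpoint without a separate .ckp read.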

Related component

Storage:Performance

Labels

Storage:Performance
enhancement (Enhancement or improvement to existing feature or request)
v2.15.0 (Issues and PRs related to version 2.15.0)

Status

✅ Done
