
storage: paginate export requests with a size limit #43356

@ajwerner

Description


Is your feature request related to a problem? Please describe.

In order to bound memory usage during backup and, ultimately, during restore, we need to bound the size of exported SSTs. Currently, the export implementation builds each SST in memory and then writes it to external storage (or, in rare cases, returns it to clients).

The logic that creates these SSTs lives here:

data, summary, err := e.ExportToSst(args.Key, args.EndKey, args.StartTime, h.Timestamp, exportAllRevisions, io)

In today's implementation the entire range of [args.Key, args.EndKey) is written to a single SST. Today's ranges are generally bounded to 64 MB (in the happy case), which provides something of an upper bound on the size of SSTs created during a backup. As we look towards moving to larger range sizes (#39717), putting the entire range into a single SST becomes problematic.

With an explicit size limit in place, we can be confident that SST files created for backups are no larger than today's files.
Describe the solution you'd like

The proposal in this issue is to:

  1. Add a cluster setting to dictate the maximum size of an SST for export (there are other options for plumbing in such a number, but a cluster setting seems easiest).
  2. Modify the engine.ExportToSst interface to accept a size limit and return a resume key.
  3. Plumb the maximum size from the cluster setting into the ExportToSst call in evalExport and paginate the export across multiple files.
    • The ExportResponse is already combinable, so there should be no need to change the API whatsoever.

Describe alternatives you've considered

The proposal here will ensure that larger ranges do not make the memory situation worse than it is today for BACKUP and RESTORE. There are approaches that could make the situation better. Ideally we'd stream the SST straight to the storage endpoint rather than buffering it entirely in RAM.

Additional context

Once this is in place we'll additionally want to split up spans in the backup at file boundaries; that is a tiny change.

Another user of export requests is CDC, which uses them for backfills. It, too, should avoid buffering too much data in RAM. To achieve that goal it may need to receive the resume key in the response and provide a way to indicate that it does not want the entire response. That can be follow-up work.
