cloud/*: use smaller 5mb chunk size #115194
Closed
Labels
A-disaster-recovery · C-performance (Perf of queries or internals. Solution not expected to change functional behavior.) · O-support (Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs) · P-1 (Issues/test failures with a fix SLA of 1 month) · T-disaster-recovery
Currently we default to 8MiB chunks in our uploads to cloud storage. When running many workers (e.g. 6) per backup, and potentially concurrent backups, with each worker sitting on top of an SDK that may have chunks in progress or queued, these larger chunks use more memory and can take longer to process, for example when hashing a single chunk (see #115192 and #115189).
A smaller chunk size would slightly increase the number of parts we upload, but for most of our ~128MiB files, going to 5MiB would mean something like 26 part uploads instead of 16. This seems minor compared to the memory and hashing benefits.
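As a rough sketch of the part-count arithmetic (the ~128MiB file size is illustrative, taken from the estimate above):

```go
package main

import "fmt"

// partCount returns the number of multipart-upload parts needed for a
// file of fileSize bytes when uploading in partSize-byte chunks
// (ceiling division, since the final part may be short).
func partCount(fileSize, partSize int64) int64 {
	return (fileSize + partSize - 1) / partSize
}

func main() {
	const mib = int64(1) << 20
	fileSize := 128 * mib
	fmt.Println(partCount(fileSize, 8*mib)) // 16 parts at 8 MiB
	fmt.Println(partCount(fileSize, 5*mib)) // 26 parts at 5 MiB
}
```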
Note: S3 has a minimum part size of 5MiB, so we only have room to cut the current 8MiB default by 3/8. But that may still be worth doing.
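If the chunk size is ever made configurable, any requested value would need to respect that S3 floor. A minimal sketch, assuming a hypothetical `clampPartSize` helper (not existing CockroachDB code):

```go
package main

import "fmt"

// minS3PartSize is S3's documented minimum part size for multipart
// uploads (5 MiB); only the final part of an upload may be smaller.
const minS3PartSize int64 = 5 << 20

// clampPartSize is a hypothetical helper: it raises any requested
// chunk size below the S3 minimum up to 5 MiB and passes larger
// values through unchanged.
func clampPartSize(requested int64) int64 {
	if requested < minS3PartSize {
		return minS3PartSize
	}
	return requested
}

func main() {
	fmt.Println(clampPartSize(1 << 20)) // too small: clamped up to 5 MiB
	fmt.Println(clampPartSize(8 << 20)) // current default: unchanged
}
```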
Jira issue: CRDB-33927