azure: use PutBlob API for uploads instead of PutBlock API + PutBlock List API #5544

Merged
MichaelEischer merged 6 commits into restic:master from zmanda:fix-gh-5531-azure-backend-upgrade-service-version on Oct 12, 2025

Conversation

@konidev20

@konidev20 konidev20 commented Oct 2, 2025

Contributor

What does this PR change? What problem does it solve?

The Azure Storage backend now uploads pack files in a single API call using the PutBlob API. Azure charges per operation, so this reduces the cost by 50% compared with the current approach, which uses a PutBlock call followed by a PutBlockList call.
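
To make the 50% figure concrete, here is a minimal sketch that only counts requests. The function names are hypothetical and not part of restic; the 100 MiB block size and the 16 MiB pack size are illustrative assumptions.

```go
package main

import "fmt"

const mib = 1024 * 1024

// blockListOps returns the number of requests needed to upload size bytes
// in blocks of blockSize bytes: one PutBlock per block plus one final
// PutBlockList. Hypothetical helper, for illustration only.
func blockListOps(size, blockSize int64) int64 {
	blocks := (size + blockSize - 1) / blockSize // round up
	return blocks + 1
}

// putBlobOps returns the number of requests for a single PutBlob upload.
func putBlobOps() int64 { return 1 }

func main() {
	packSize := int64(16 * mib) // assumed pack file size
	fmt.Println("PutBlock + PutBlockList:", blockListOps(packSize, 100*mib))
	fmt.Println("PutBlob:", putBlobOps())
}
```

For any file that fits into one block, the old path needs two requests and the new path needs one.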

Was the change previously discussed in an issue or on the forum?

Closes #5531

Checklist

  • [ ] I have added tests for all code changes. Existing test cases cover the changes.
  • [ ] I have added documentation for relevant changes (in the manual). The changes are not user facing. It's an internal optimization.
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I'm done! This pull request is ready for review.

@MichaelEischer MichaelEischer left a comment

Member

I don't see a reason to keep any method other than auto. So let's get rid of the old upload path and just use the auto variant. More variants just mean more code that can cause problems.

@R23-Git

R23-Git commented Oct 3, 2025


Hi @konidev20,

Thanks again for your time. This new method is working.

After some analysis, please consider this:
I think the original saveSmall and saveLarge functions already exist to differentiate between PutBlob and PutBlock, given the limits of service versions 2016-05-31 through 2019-07-07. However, saveSmall does not use PutBlob in this implementation (maybe it was not available in the Azure SDK back then).

So maybe you just need to make some changes in azure.go:

  • Change const saveLargeSize = 256 * 1024 * 1024 to const saveLargeSize = 5000 * 1024 * 1024 (line 44)
  • Adapt saveSmall to use the correct upload function of the Azure SDK (for the PutBlob API)
  • Change buf := make([]byte, 100*1024*1024) to buf := make([]byte, 4000*1024*1024) in the saveLarge function (line 299)

As you can see, these values match the service versions table:

| Service version | Maximum block size (via Put Block) | Maximum blob size (via Put Block List) | Maximum blob size via single write operation (via Put Blob) |
| --- | --- | --- | --- |
| Version 2019-12-12 and later | 4,000 mebibytes (MiB) | Approximately 190.7 TiB (4,000 MiB × 50,000 blocks) | 5,000 MiB |
| Versions 2016-05-31 through 2019-07-07 | 100 MiB | Approximately 4.75 TiB (100 MiB × 50,000 blocks) | 256 MiB |
| Versions earlier than 2016-05-31 | 4 MiB | Approximately 195 GiB (4 MiB × 50,000 blocks) | 64 MiB |
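
The table can also be read as a lookup from service version to limits. The sketch below is only an illustration: the type and function names are hypothetical, and it assumes that any version between the listed ranges falls into the older row. Service versions are ISO dates, so a plain string comparison orders them correctly.

```go
package main

import "fmt"

const mib = int64(1024 * 1024)

// maxBlocks is the number of blocks per blob used in the table.
const maxBlocks = 50000

// limits holds the per-version size limits from the table, in bytes.
type limits struct {
	maxBlock   int64 // largest block via Put Block
	maxPutBlob int64 // largest blob via a single Put Blob
}

// limitsFor maps a service version such as "2019-12-12" to its limits.
// Assumption: versions not covered by the table use the older row.
func limitsFor(version string) limits {
	switch {
	case version >= "2019-12-12":
		return limits{maxBlock: 4000 * mib, maxPutBlob: 5000 * mib}
	case version >= "2016-05-31":
		return limits{maxBlock: 100 * mib, maxPutBlob: 256 * mib}
	default:
		return limits{maxBlock: 4 * mib, maxPutBlob: 64 * mib}
	}
}

// maxBlockListBlob is the largest blob that can be built via Put Block List.
func (l limits) maxBlockListBlob() int64 { return l.maxBlock * maxBlocks }

func main() {
	l := limitsFor("2019-12-12")
	tib := float64(l.maxBlockListBlob()) / float64(int64(1)<<40)
	fmt.Printf("block: %d MiB, Put Blob: %d MiB, block list: %.1f TiB\n",
		l.maxBlock/mib, l.maxPutBlob/mib, tib)
}
```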

Also, it would be great to let the end user choose the sizes (within the limits, of course) with two options (it seems Rclone does it like that):

  • A cut size option to choose when to use PutBlob and when to use PutBlock,
    e.g. -o azure.cutsize=512M would force PutBlob if the file is under 512MiB, and PutBlock if it is over.
    This option can't be over 5000MiB (saveLargeSize variable).
  • A chunk size option to choose how to split the file into blocks for the PutBlock API,
    e.g. -o azure.chunksize=16M would force 16MiB blocks via the PutBlock API. Combined with the previous option, a pack file over 512MiB would be split into 16MiB blocks, and then a PutBlockList would be issued to assemble the whole file in the storage.
    This option can't be over 4000MiB.
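
As a rough sketch of how the two proposed options would interact: the code below is hypothetical, since azure.cutsize and azure.chunksize are only a suggestion in this comment and not existing restic options. How a file of exactly the cut size is handled is an assumption here.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	mib          = int64(1024 * 1024)
	maxCutSize   = 5000 * mib // Put Blob limit from the table
	maxChunkSize = 4000 * mib // Put Block limit from the table
	maxBlocks    = 50000      // blocks per blob
)

// plan describes how a file would be uploaded.
type plan struct {
	singlePut bool  // true: one PutBlob request
	blocks    int64 // number of PutBlock requests otherwise
}

// planUpload validates the two (hypothetical) options and decides between
// PutBlob and PutBlock + PutBlockList for a file of the given size.
func planUpload(size, cutSize, chunkSize int64) (plan, error) {
	if cutSize <= 0 || cutSize > maxCutSize {
		return plan{}, errors.New("cut size must be between 1 byte and 5000 MiB")
	}
	if chunkSize <= 0 || chunkSize > maxChunkSize {
		return plan{}, errors.New("chunk size must be between 1 byte and 4000 MiB")
	}
	// Assumption: files strictly under the cut size use PutBlob.
	if size < cutSize {
		return plan{singlePut: true}, nil
	}
	blocks := (size + chunkSize - 1) / chunkSize // round up
	if blocks > maxBlocks {
		return plan{}, errors.New("file needs more than 50000 blocks")
	}
	return plan{blocks: blocks}, nil
}

func main() {
	// Equivalent of cutsize=512M and chunksize=16M for a 600 MiB file.
	p, err := planUpload(600*mib, 512*mib, 16*mib)
	fmt.Println(p.singlePut, p.blocks, err)
}
```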

This would ensure that anyone can set the values according to their needs and/or limits (it seems that old service versions of Azure Storage are still running). Also, the limits are not the same for the emulator:
[image]

@konidev20

Contributor Author

The failing Windows test is unrelated.

> Also, it would be great to let the end user choose the sizes (within the limits, of course) with two options (it seems Rclone does it like that):
> A cut size option to choose when to use PutBlob and when to use PutBlock,
> e.g. -o azure.cutsize=512M would force PutBlob if the file is under 512MiB, and PutBlock if it is over.
> This option can't be over 5000MiB (saveLargeSize variable).
> A chunk size option to choose how to split the file into blocks for the PutBlock API,
> e.g. -o azure.chunksize=16M would force 16MiB blocks via the PutBlock API. Combined with the previous option, a pack file over 512MiB would be split into 16MiB blocks, and then a PutBlockList would be issued to assemble the whole file in the storage.
> This option can't be over 4000MiB.

@R23-Git
I think this would be a neat feature, but maybe it belongs in a separate feature request, to limit the scope of changes in this PR. This PR essentially just replaces the implementation of the saveSmall function.

@MichaelEischer

Member

What would be the actual benefit of adding these options? Users already have the option to set the pack file size target via --pack-size. Almost all files created by restic remain near the pack file size target (currently limited to 128MB). While I plan to increase the pack file size in the future, it will never be larger than either 1 or 2 GB. In a few exceptional cases the pack files can be larger than that limit. But that isn't something we need an extra configuration option for.

According to https://learn.microsoft.com/en-us/rest/api/storageservices/versioning-for-the-azure-storage-services all regions use a 2025 version by now.

Keeping support for the emulator sounds reasonable though, so what about only splitting files that are larger than 2000MB into smaller parts?

> I think this would be a neat feature to implement in a new feature request maybe?

I'm against adding this feature unless someone can provide a reason why it's actually necessary.

Comment thread internal/backend/azure/azure.go Outdated
Comment thread internal/backend/azure/azure.go
@konidev20 konidev20 force-pushed the fix-gh-5531-azure-backend-upgrade-service-version branch from 6275024 to eb72ca5 Compare October 5, 2025 06:29
@konidev20 konidev20 changed the title from "azure: allow ability to change upload method" to "azure: use PutBlob API for uploads instead of PutBlock API + PutBlock List API" Oct 5, 2025
Comment thread internal/backend/azure/azure.go Outdated

@MichaelEischer MichaelEischer left a comment

Member

Let's keep the old blob size of 100MB for large uploads. Other than that the change looks fine.
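
For readers unfamiliar with the large-upload path, the following is a minimal sketch of a fixed-size block upload. It is not restic's actual code: blockStore and memStore are hypothetical stand-ins for a blob client, and the example uses a 100-byte block size in place of 100MB so that it runs instantly. Block IDs are base64 strings of equal length, which Azure requires for all blocks of one blob.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"io"
)

// blockStore is a hypothetical stand-in for the block operations of a client.
type blockStore interface {
	StageBlock(id string, data []byte) error
	CommitBlockList(ids []string) error
}

// memStore is an in-memory blockStore used only for this illustration.
type memStore struct {
	staged map[string][]byte
	blob   []byte
}

func (m *memStore) StageBlock(id string, data []byte) error {
	m.staged[id] = append([]byte(nil), data...) // copy, the buffer is reused
	return nil
}

func (m *memStore) CommitBlockList(ids []string) error {
	var out []byte
	for _, id := range ids {
		b, ok := m.staged[id]
		if !ok {
			return fmt.Errorf("unknown block %q", id)
		}
		out = append(out, b...)
	}
	m.blob = out
	return nil
}

// uploadInBlocks reads rd in blockSize pieces, stages each piece and then
// commits the list of block IDs. It returns the number of staged blocks.
func uploadInBlocks(s blockStore, rd io.Reader, blockSize int) (int, error) {
	buf := make([]byte, blockSize)
	var ids []string
	for {
		n, err := io.ReadFull(rd, buf)
		if err == io.EOF {
			break // nothing left to read
		}
		if err != nil && err != io.ErrUnexpectedEOF {
			return len(ids), err
		}
		raw := make([]byte, 8)
		binary.BigEndian.PutUint64(raw, uint64(len(ids)))
		id := base64.StdEncoding.EncodeToString(raw) // fixed-length ID
		if err := s.StageBlock(id, buf[:n]); err != nil {
			return len(ids), err
		}
		ids = append(ids, id)
	}
	return len(ids), s.CommitBlockList(ids)
}

// uploadBytes is a small convenience wrapper for in-memory data.
func uploadBytes(s blockStore, data []byte, blockSize int) (int, error) {
	return uploadInBlocks(s, bytes.NewReader(data), blockSize)
}

func main() {
	s := &memStore{staged: map[string][]byte{}}
	n, err := uploadBytes(s, make([]byte, 250), 100)
	fmt.Println(n, len(s.blob), err)
}
```

A smaller block size keeps the upload buffer small; the trade-off is one extra request per additional block.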

@konidev20

Copy link
Copy Markdown
Contributor Author

> Let's keep the old blob size of 100MB for large uploads. Other than that the change looks fine.

I will need to update the docs and add a changelog. Will do that and mark this PR ready for review.

Thanks for the review!

@konidev20 konidev20 force-pushed the fix-gh-5531-azure-backend-upgrade-service-version branch from d29e15f to f9ff230 Compare October 5, 2025 16:22
@konidev20 konidev20 marked this pull request as ready for review October 5, 2025 16:22

@MichaelEischer MichaelEischer left a comment

Member

LGTM. Thanks!

@MichaelEischer MichaelEischer merged commit 1ef785d into restic:master Oct 12, 2025
12 checks passed

Development

Successfully merging this pull request may close these issues.

Update AzureBlob upload API to use PutBlob instead of PutBlock+PutBlockList

3 participants