Skip to content

Fix EncryptedBlobStore support for s3 compatible backends without eTag parts number suffix#825

Merged
gaul merged 1 commit intogaul:masterfrom
polarctos:fix-s3-api-encrypted-multipart-parts-number
Jun 12, 2025
Merged

Fix EncryptedBlobStore support for s3 compatible backends without eTag parts number suffix#825
gaul merged 1 commit intogaul:masterfrom
polarctos:fix-s3-api-encrypted-multipart-parts-number

Conversation

@polarctos
Copy link
Contributor

@polarctos polarctos commented Jun 11, 2025

Some mostly S3 REST API compatible storage backends do not return the number of multipart upload parts as a suffix to the eTag as Amazon does and as the previous code expects. An example for this is NetApp StorageGRID S3 REST API.

The old code had a fallback to just assume one encrypted part, but this is actually wrong in the multipart upload case for these backends, thus this problematic behaviour is replaced in this commit.

Example eTag structure from real AWS S3 for multipart uploads with suffix for 2 parts: xyzabc-2

In this case the number of parts is also not present in the object metadata, as metadata can only be set on S3 API when starting the multipart upload, and not when completing it, thus the number of parts is not yet known at this earlier point in time.

This change adds a third fallback option to just read the number of encrypted parts from the final part padding, which is the only place guaranteed to always be present for encrypted blobs.

Only for these backends this adds an additional GET request to the backend, but only for the actual 64 bytes of the padding. Before this change, the returned sizes were not correct for multipart uploads with more than one part on these backends without eTag parts number suffix.

@gaul
Copy link
Owner

gaul commented Jun 11, 2025

I can't look at this right now but does the undocumented S3 API to return the number of parts help here?

https://github.com/gaul/undocumented-s3-apis?tab=readme-ov-file#get-object-by-multipart-number

…g suffix

Some mostly S3 REST API compatible storage backends do not return the number of multipart upload parts as a suffix to the eTag as Amazon does and as the previous code expects. An example for this is NetApp StorageGRID S3 REST API.

The old code had a fallback to just assume one encrypted part, but this is actually wrong in the multipart upload case for these backends, thus this is replace in this commit.

Example eTag structure from real AWS S3 for multipart uploads with 2 parts:`xyzabc-2`

In this case the number of parts is also not present in the object metadata, as metadata is set on S3 API when starting the multipart upload, and not when completing it, thus the number of parts is not yet known at this earlier point in time.

This change adds a third fallback option to just read the number of parts from the final part padding, which is the only place guaranteed to always be present for encrypted blobs.

Only for these backends this adds an additional GET request to the backend, but only for the actual 64 bytes of the padding.
@polarctos polarctos force-pushed the fix-s3-api-encrypted-multipart-parts-number branch from 5f931a3 to b420e92 Compare June 12, 2025 11:42
@polarctos
Copy link
Contributor Author

I can't look at this right now but does the undocumented S3 API to return the number of parts help here?

https://github.com/gaul/undocumented-s3-apis?tab=readme-ov-file#get-object-by-multipart-number

Unfortunately the x-amz-mp-parts-count response header is also not present from thios backend. There is only:

x-amz-request-id: 1749728871401780
x-amz-id-2: 12575733

As neither this additional header nor the eTag suffix are really specified by the S3 API common response headers I guess there can also be more "mostly S3 compatible" storages that don't have these for multipart uploads.
As the number of parts is also in the last part padding too for encrypted blobs, which is just a GET request for the last 64 bytes, I think thats acceptable as an additional GET request for such backends to make the encryption middleware compatible with these too.

I rebased on master, to get the Python yield CI fix, could you please trigger to run the GitHub Actions again?

@gaul gaul merged commit 1003265 into gaul:master Jun 12, 2025
3 checks passed
@gaul
Copy link
Owner

gaul commented Jun 12, 2025

Thank you for your contribution @polarctos!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants