Skip to content

HTTP support for S3, GCS, and Azure.#2703

Merged
dwsteele merged 13 commits intopgbackrest:integrationfrom
wjmorland:http-support-ci
Dec 29, 2025
Merged

HTTP support for S3, GCS, and Azure.#2703
dwsteele merged 13 commits intopgbackrest:integrationfrom
wjmorland:http-support-ci

Conversation

@wjmorland
Copy link
Contributor

Add support for HTTP endpoints for S3, Azure, and GCS. Allow users to specify HTTP in the endpoint, but default to using HTTPS in all other scenarios to preserve existing behavior.

Extend the URL parser with a defaultType parameter to support either:

  • explicitly specifying a protocol via .type and enforcing that protocol is used in the URL
  • allowing protocol to be parsed from URL, but providing default via .defaultType if no protocol is found in the URL

Add partial write handling in fdWrite to support non-blocking socket operations. The write loop now handles EAGAIN errors by waiting for the file descriptor to become writable, and continues writing the remaining bytes when write() returns fewer bytes than requested. This is required for HTTP, which may use non-blocking sockets, but doesn't have built in handling like the TLS client we are using for HTTPS. Also wrap the write() call, add a shim, and additional logging for easier unit testing.

The main motivation for this is to allow pgBackRest to work natively with Ceph HTTP frontends. This was previously brought up in #2340 for a Ceph offshoot called MicroCeph, where it looks like they instead fixed the issue for that specific product on the Ceph side by adding HTTPS support in this PR. However, I think it would still be beneficial to have HTTP support so the problem is solved from both ways (and this would help my company's use case for backing up to our own private Ceph cluster that doesn't have HTTPS enabled). Also saw that there was some work done on this in #2587, but wasn't sure if that was actively being worked on, so wanted to propose another implementation + add support for Azure/Google Cloud.

Looks like everything is passing when I run the CI checks on my end (unit tests, code coverage, code-format) and did some manual testing with backing up one of my dev databases, seemed to be working well.

Add support for HTTP endpoints for S3, Azure, and GCS. Allow users
to specify HTTP in the endpoint, but default to using HTTPS in all
other scenarios to preserve existing behavior.

Extend the URL parser with a defaultType parameter to support either:
- explicitly specifying a protocol via .type and enforcing that
protocol is used in the URL
- allowing protocol to be parsed from URL, but providing default via
.defaultType if no protocol is found in the URL

Add partial write handling in fdWrite to support non-blocking socket
operations. The write loop now handles EAGAIN errors by waiting for
the file descriptor to become writable, and continues writing the
remaining bytes when write() returns fewer bytes than requested. This
is required for HTTP, which may use non-blocking sockets, but doesn't
have built in handling like the TLS client we are using for HTTPS. Also
wrap the write() call, add a shim, and additional logging for easier
unit testing.
@dwsteele dwsteele self-requested a review November 24, 2025 06:54
@dwsteele dwsteele self-assigned this Nov 24, 2025
@dwsteele dwsteele added this to the 2.58 milestone Nov 24, 2025
@dwsteele
Copy link
Member

dwsteele commented Nov 24, 2025

I must say this is a very well thought out and complete patch. Kudos! I see a few minor things I'll want changed but overall I don't see any barriers to getting this committed. Should be no problem for get this in for the January release.

@wjmorland
Copy link
Contributor Author

Thank you! That's great, looking forward to having this included.

Please let me know if you want me to address any of those minor issues, didn't see any comments on the code yet, so wasn't sure if you were planning on just fixing yourself before merging.

@dwsteele
Copy link
Member

I'll post a review soon once I have digested the patch a bit.

@4n3w
Copy link

4n3w commented Dec 3, 2025

Curious! Was this ever addressed in #1590 ? I'm here because we can't get http on a self-hosted minio server to play ball with pgbackrest. The link to the card in the old issue gives me a 404 🤷🏼‍♂️

@dwsteele
Copy link
Member

dwsteele commented Dec 4, 2025

@4n3w It wasn't addressed. That's what the current patch is about. There was also an earlier attempt that was abandoned by the author.

The link to the card in the old issue gives me a 404

At some point GitHub forced a migration to their new project system, which broke all the links.

Copy link
Member

@dwsteele dwsteele left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good overall but I think we need a few changes.

See comment I left about moving the helper function.

I also think that the S3/GCS/Azure tests should at least do a few operations with the test server using HTTP. I think your tests show the configuration is correct, but would still be better I think to have some actual tests.

@wjmorland
Copy link
Contributor Author

Thanks for the review! Makes sense on testing some actual operations with HTTP, added a few basic cases (read, write, error), and happy to add some of the more complex cases too if you want. Also made the helper function static as suggested.

- Rename fdWrite to ioFdWriteInternal for module naming consistency (lots of cascades from this)
- Minor formatting and code updates
- No need to start a new test -- combine into prior test
- Update comment
- Minor code updates
- Remove redundant tests (endpoint parsing is well covered)
- Move HTTP tests from HTTPS rather than duplicating tests
- Minor code updates
- Remove redundant tests (endpoint parsing is well covered)
- Move HTTP tests from HTTPS rather than duplicating tests
- Minor code updates
- Remove redundant tests (endpoint parsing is well covered)
- Move HTTP tests from HTTPS rather than duplicating tests

[also reordered some tests in S3 and Azure to the original order of where they were copied from]
@dwsteele dwsteele changed the title Add HTTP support and partial write handling. Add HTTP support for S3, GCS, and Azure. Dec 22, 2025
@dwsteele dwsteele linked an issue Dec 22, 2025 that may be closed by this pull request
Copy link
Member

@dwsteele dwsteele left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now. See commits for my changes and reasons.

I know it looks like a lot but mostly it is renaming/reorganizing. The S3/GCS/Azure tests have a fair amount of churn so it is important to keep them as coherent as possible.

@wjmorland
Copy link
Contributor Author

Sounds good, I see how that makes the S3/GCS/Azure tests much cleaner, thanks for making those changes

@dwsteele dwsteele changed the title Add HTTP support for S3, GCS, and Azure. HTTP support for S3, GCS, and Azure. Dec 29, 2025
@dwsteele dwsteele merged commit 2398dd4 into pgbackrest:integration Dec 29, 2025
19 checks passed
@dwsteele
Copy link
Member

Committed. Thank you!

@wjmorland
Copy link
Contributor Author

Awesome, thanks for including this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

About pgbackrest supports the http protocol

3 participants