Enable cross-region writes in the S3 sink. #6323

Merged
dlvenable merged 1 commit into opensearch-project:main from dlvenable:s3-sink-cross-region
Dec 10, 2025

dlvenable commented Dec 3, 2025

Description

This enables cross-region writes in the S3 sink. This is similar to what we did for the S3 source in #6083, but now works for the sink.

Additionally, we only use the S3AsyncClient, so I removed the old code for the sync client and merged the tests. I also updated the copyright headers since I was modifying this file.

Issues Resolved

N/A

Check List

  • New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: David Venable <dlv@amazon.com>
- S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
+ final S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
          .region(s3SinkConfig.getAwsAuthenticationOptions().getAwsRegion())
+         .crossRegionAccessEnabled(true)
Member

Is it the intended behavior to choose the correct region for the bucket even when the customer has configured the wrong region?

When crossRegionAccessEnabled is set, the SDK automatically redirects requests to the correct bucket region. If a bucket doesn't exist in the configured region, the SDK uses the error response to identify the actual region and retries the request there.

Member Author

Yes, this is the intended behavior. We started doing this in the source in #6083.

The region that the pipeline author configured is the first region attempted. If the bucket is not in that region and the IAM role has permission for s3:GetBucketLocation, then S3 provides the bucket's region in a redirect. The SDK then re-signs the request for that region and sends it to that regional endpoint. It is that last step that this change provides.
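
As a rough illustration of the flow described above, the sketch below models "try the configured region first, then retry in the region named by the redirect" using only the Java standard library. It is not the SDK's implementation: `CrossRegionSketch`, `BUCKET_REGIONS`, `putObject`, and `write` are hypothetical names, and the lookup table stands in for S3 itself. In the real sink, the SDK handles this internally once `crossRegionAccessEnabled(true)` is set on the client builder.

```java
import java.util.Map;
import java.util.Optional;

class CrossRegionSketch {
    // Hypothetical stand-in for S3: bucket name -> region that owns the bucket.
    static final Map<String, String> BUCKET_REGIONS = Map.of(
            "mycompany-us-east-1", "us-east-1",
            "mycompany-us-west-2", "us-west-2");

    // Simulates one regional endpoint. Returns empty when the write succeeds,
    // or the owning region as a redirect hint when the bucket lives elsewhere.
    static Optional<String> putObject(String endpointRegion, String bucket) {
        String actualRegion = BUCKET_REGIONS.get(bucket);
        if (actualRegion == null) {
            throw new IllegalArgumentException("No such bucket: " + bucket);
        }
        return actualRegion.equals(endpointRegion)
                ? Optional.empty()
                : Optional.of(actualRegion);
    }

    // Tries the configured region first; on a redirect, retries once against
    // the hinted region. Returns the region that accepted the write.
    static String write(String configuredRegion, String bucket) {
        Optional<String> redirect = putObject(configuredRegion, bucket);
        if (!redirect.isPresent()) {
            return configuredRegion;
        }
        String hintedRegion = redirect.get();
        if (putObject(hintedRegion, bucket).isPresent()) {
            throw new IllegalStateException("Redirect hint did not resolve: " + bucket);
        }
        return hintedRegion;
    }

    public static void main(String[] args) {
        // Bucket is in the configured region: no redirect is needed.
        System.out.println(write("us-west-2", "mycompany-us-west-2"));
        // Bucket is elsewhere: the write lands in the region from the redirect.
        System.out.println(write("us-west-2", "mycompany-us-east-1"));
    }
}
```

In this model the configured region only decides where the first attempt goes; the write still succeeds when the bucket is in another region, at the cost of one extra round trip.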

Because the S3 sink supports dynamic bucket names, you might have a configuration that looks like this:

bucket: mycompany-${/aws/region}

This won't work with a single region.
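
To make the problem concrete, here is a minimal sketch of how a template like `mycompany-${/aws/region}` could resolve to different buckets for different events. It is a simplified model and not Data Prepper's actual expression handling: `DynamicBucketSketch`, `resolve`, and `lookup` are hypothetical names, and events are represented as nested maps purely for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DynamicBucketSketch {
    // Matches placeholders of the form ${/some/key/path}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{/([^}]+)\\}");

    // Replaces each placeholder with the value found at that path in the event.
    static String resolve(String template, Map<String, Object> event) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            Object value = lookup(event, matcher.group(1).split("/"));
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(String.valueOf(value)));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    // Walks nested maps one key at a time; fails if any key is missing.
    static Object lookup(Map<String, Object> event, String[] path) {
        Object current = event;
        for (String key : path) {
            if (!(current instanceof Map)) {
                throw new IllegalArgumentException("Missing key: " + key);
            }
            current = ((Map<?, ?>) current).get(key);
            if (current == null) {
                throw new IllegalArgumentException("Missing key: " + key);
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> eastEvent = Map.of("aws", Map.of("region", "us-east-1"));
        Map<String, Object> westEvent = Map.of("aws", Map.of("region", "us-west-2"));
        // The same sink configuration yields two buckets, one per region.
        System.out.println(resolve("mycompany-${/aws/region}", eastEvent));
        System.out.println(resolve("mycompany-${/aws/region}", westEvent));
    }
}
```

Because one pipeline can produce bucket names that live in different regions, a client pinned to a single region cannot reach all of them without cross-region access.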


- S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
+ final S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
          .region(s3SinkConfig.getAwsAuthenticationOptions().getAwsRegion())
Collaborator

This one should be bucket's region - right?

Member Author

This is the region that is configured in the pipeline configuration. It should be the bucket's region, or a preferred region if the buckets are all scattered when using pipeline configurations. I suppose if you were running in us-west-2 and writing to buckets in us-east-1 and us-west-2, you'd probably want to start with the first.

dlvenable merged commit 66a6191 into opensearch-project:main Dec 10, 2025
46 of 47 checks passed
eatulban pushed a commit to eatulban/data-prepper that referenced this pull request Dec 11, 2025
wandna-amazon pushed a commit to wandna-amazon/data-prepper that referenced this pull request Jan 8, 2026
Signed-off-by: David Venable <dlv@amazon.com>
Signed-off-by: Nathan Wand <wandna@amazon.com>

4 participants