Enable cross-region writes in the S3 sink. #6323
dlvenable merged 1 commit into opensearch-project:main
Signed-off-by: David Venable <dlv@amazon.com>
- S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
+ final S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
          .region(s3SinkConfig.getAwsAuthenticationOptions().getAwsRegion())
+         .crossRegionAccessEnabled(true)
Is it the intended behavior to choose the correct region for the bucket even though the customer has configured the wrong region?
When crossRegionAccessEnabled is set, the SDK automatically redirects requests to the correct bucket region. If a bucket doesn't exist in the configured region, the SDK uses the error response to identify the actual region and retries the request there.
Yes, this is the intended behavior. We started doing this in the S3 source in #6083.
The region that the pipeline author configured is the first region attempted. If the bucket is not in that region and the IAM role has permissions for s3:GetBucketLocation, then S3 provides the region in a redirect. The SDK will then re-sign the request for that region and send it to that regional endpoint. It is that last step that this change provides.
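To make the sequence above concrete, here is a rough, SDK-free sketch of the two-step flow. It is only a model: the class `RegionFallbackSketch`, its methods, and the in-memory bucket table are hypothetical stand-ins, not the AWS SDK's implementation and not code from this PR.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model of "try the configured region, then follow the redirect".
// The map stands in for S3's knowledge of which region owns each bucket.
final class RegionFallbackSketch {
    private final Map<String, String> bucketRegions;

    RegionFallbackSketch(Map<String, String> bucketRegions) {
        this.bucketRegions = bucketRegions;
    }

    // Simulates one request. Returns empty on success, or the bucket's
    // actual region when the request was sent to the wrong region.
    Optional<String> tryPut(String region, String bucket) {
        String actual = bucketRegions.get(bucket);
        if (actual == null) {
            throw new IllegalArgumentException("unknown bucket: " + bucket);
        }
        return actual.equals(region) ? Optional.empty() : Optional.of(actual);
    }

    // Returns the region that finally accepted the write.
    String put(String configuredRegion, String bucket) {
        // Step 1: the configured region is always attempted first.
        Optional<String> redirect = tryPut(configuredRegion, bucket);
        if (redirect.isEmpty()) {
            return configuredRegion;
        }
        // Step 2: retry against the region named in the redirect.
        // In the real client this is where the request is re-signed.
        String redirected = redirect.get();
        if (tryPut(redirected, bucket).isPresent()) {
            throw new IllegalStateException("redirect did not resolve for " + bucket);
        }
        return redirected;
    }

    public static void main(String[] args) {
        RegionFallbackSketch s3 = new RegionFallbackSketch(Map.of(
                "mycompany-us-east-1", "us-east-1",
                "mycompany-us-west-2", "us-west-2"));
        System.out.println(s3.put("us-west-2", "mycompany-us-west-2"));
        System.out.println(s3.put("us-west-2", "mycompany-us-east-1"));
    }
}
```

In this model, a write to a bucket in the configured region costs one attempt and any other bucket costs two.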
Because the S3 sink supports dynamic bucket names, you might have a configuration that looks like this:
bucket: mycompany-${/aws/region}
This won't work with a single region.
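As an illustration of why one client can end up writing to several regions, here is a minimal sketch of how a template such as `mycompany-${/aws/region}` could expand per event. The class `DynamicBucketNameSketch` and its `resolve` method are hypothetical and assume events are nested maps; Data Prepper's actual expression handling may differ.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical resolver for ${/path/to/key} placeholders in a bucket name.
final class DynamicBucketNameSketch {
    // Matches ${/a/b} and captures the slash-separated path.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(/[^}]+)\\}");

    static String resolve(String template, Map<String, Object> event) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Object value = lookup(event, m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("no value for " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Walks nested maps one path segment at a time; returns null if a segment is missing.
    private static Object lookup(Map<String, Object> event, String path) {
        Object current = event;
        for (String key : path.substring(1).split("/")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        Map<String, Object> east = Map.of("aws", Map.of("region", "us-east-1"));
        Map<String, Object> west = Map.of("aws", Map.of("region", "us-west-2"));
        // Two events from the same pipeline resolve to buckets in different regions.
        System.out.println(resolve("mycompany-${/aws/region}", east));
        System.out.println(resolve("mycompany-${/aws/region}", west));
    }
}
```

Because the bucket name is only known per event, a client pinned to one region cannot reach every resolved bucket, which is the case the cross-region setting addresses.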
- S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
+ final S3AsyncClientBuilder s3AsyncClientBuilder = S3AsyncClient.builder()
          .region(s3SinkConfig.getAwsAuthenticationOptions().getAwsRegion())
This one should be the bucket's region, right?
This is the region that is configured in the pipeline configuration. It should be the bucket's region, or a preferred region if the buckets are all scattered when using pipeline configurations. I suppose if you were running in us-west-2 and writing to buckets in us-east-1 and us-west-2, you'd probably want to start with the first.
Signed-off-by: David Venable <dlv@amazon.com> Signed-off-by: Nathan Wand <wandna@amazon.com>
Description
This enables cross-region writes in the S3 sink. This is similar to what we did for the S3 source in #6083, but now works for the sink.
Additionally, we only use the S3AsyncClient, so I removed the old code for the sync client and merged the tests. I also updated the copyright headers since I was modifying this file.
Issues Resolved
N/A
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.