Skip to content

filesink: add IAM support for S3#3280

Merged
danielnelson merged 7 commits intomainfrom
dbn/materialize-s3-iam
Sep 15, 2025
Merged

filesink: add IAM support for S3#3280
danielnelson merged 7 commits intomainfrom
dbn/materialize-s3-iam

Conversation

@danielnelson
Copy link
Copy Markdown
Contributor

@danielnelson danielnelson commented Sep 11, 2025

Description:

This adds support for AWS IAM to materialize-s3-parquet and materialize-s3-csv. Users who wish to use IAM authentication will need to configure a Role according to our documentation.

related: estuary/flow#2123

Workflow steps:

NA

Documentation links affected:

Notes for reviewers:

AWSAccessKeyID string `json:"awsAccessKeyId" jsonschema:"title=AWS Access Key ID,description=Access Key ID for writing data to the bucket." jsonschema_extras:"order=1"`
AWSSecretAccessKey string `json:"awsSecretAccessKey" jsonschema:"title=AWS Secret Access key,description=Secret Access Key for writing data to the bucket." jsonschema_extras:"secret=true,order=2"`
Region string `json:"region" jsonschema:"title=Region,description=Region of the bucket to write to." jsonschema_extras:"order=3"`
AWSRegion string `json:"awsRegion" jsonschema:"title=Region,description=Region of the bucket to write to." jsonschema_extras:"order=3"`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat longer term, I think this S3Store should be updated to use our blob.Bucket implementation, which has automatic detection of the bucket's region, using this.

Thinking about this more, in such a future we'd still want a user to be able to override this automatically detected region for the bucket. There's no concrete reason I can think of for doing this right now, but it may be applicable if we want to support non-AWS endpoints, or more exotic types of AWS endpoints.

So I think we'll eventually make the region input for the "bucket" optional, and I'm thinking it would be best to leave it in the top-level configuration, since it is specific to the bucket itself rather than the type of authentication. This would leave us with there being a region input at both the top-level and for the IAM authentication (if configured), which is kind of annoying, but I think we can live with that if the ultimate goal is make the region in the top-level optional anyway.

One more thing that I'm not 100% sure of is if the bucket can have a different region than the IAM authentication?

I think this is a little different than my previous advice but after giving it more thought this is where I'm at; apologies for the waffling around.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I don't think the double entry of the region will be too awkward as its only for IAM users.

One more thing that I'm not 100% sure of is if the bucket can have a different region than the IAM authentication?

I believe this would work, as according to Amazon:

Session tokens from Regional AWS STS endpoints are valid in all AWS Regions.

@danielnelson
Copy link
Copy Markdown
Contributor Author

danielnelson commented Sep 12, 2025

I assume due to embedding of config objects, it is important not to use the same string in the credentials as is used in the IAMConfig. Otherwise the session strings aren't injected. I was hoping to use a consistent naming style for all values in the CredentialsConfig, but ended up unwinding this in 78d075a.

Copy link
Copy Markdown
Member

@williamhbaker williamhbaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

if c.AWSSecretAccessKey == "" {
return errors.New("missing 'awsSecretAccessKey'")
}
} else {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Could simplify this a tiny bit, using else if err := c.Credentials.Validate(); err != nil { ...}

This adds support for AWS IAM to materialize-s3-parquet and
materialize-s3-csv.  Users who wish to use IAM authentication will need
to configure a Role according to our documentation[1].

[1]: https://docs.estuary.dev/guides/iam-auth/aws/
I need to use the full IAMConfig so that there are locations to receive
these values.  Instead of assuming the role in the connector, we just
use these as static credentials.
This is an experiment to see if there is an conflict between these
fields and the ones sent by the runtime.
@danielnelson danielnelson merged commit 95a05b0 into main Sep 15, 2025
60 of 64 checks passed
@danielnelson danielnelson deleted the dbn/materialize-s3-iam branch September 15, 2025 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants