[C++] Use custom url for s3 using AWS_ENDPOINT_URL #36770

@Kotwic4

Describe the enhancement requested

AWS now supports the AWS_ENDPOINT_URL environment variable for pointing clients at a custom endpoint URL (for example, localhost).
More information is available in the docs and the original GitHub issue. Support was merged into botocore in this PR.

What I can do with boto3:

import os
import boto3

os.environ["AWS_ACCESS_KEY_ID"] = "ACCESS_KEY"
os.environ["AWS_SECRET_ACCESS_KEY"] = "SECRET_KEY"
os.environ["AWS_ENDPOINT_URL"] = "http://localhost:9000"

session = boto3.session.Session()
s3_client = session.client(
    service_name="s3",
)
print(s3_client.list_buckets()["Buckets"])

What I have to do in pyarrow:

import os
from pyarrow import fs

os.environ["AWS_ACCESS_KEY_ID"] = "ACCESS_KEY"
os.environ["AWS_SECRET_ACCESS_KEY"] = "SECRET_KEY"
os.environ["AWS_ENDPOINT_URL"] = "http://localhost:9000"

s3 = fs.S3FileSystem(endpoint_override=os.environ["AWS_ENDPOINT_URL"])
print(s3.get_file_info(fs.FileSelector("", recursive=False)))
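
Until the variable is picked up automatically, the workaround above can be wrapped in a small helper. This is only a sketch: `s3_kwargs_from_env` and `make_s3_filesystem` are hypothetical names, not part of pyarrow. It assumes that `S3FileSystem` accepts the `scheme` and `endpoint_override` keyword arguments and that splitting a full URL into those two values is acceptable for your endpoint.

```python
import os
from urllib.parse import urlsplit


def s3_kwargs_from_env(environ=None):
    """Build S3FileSystem keyword arguments from AWS_ENDPOINT_URL, if set.

    Hypothetical helper, not a pyarrow API.
    """
    environ = os.environ if environ is None else environ
    url = environ.get("AWS_ENDPOINT_URL")
    if not url:
        # Variable unset or empty: use the default AWS endpoint.
        return {}
    parts = urlsplit(url)
    if parts.scheme and parts.netloc:
        # "http://localhost:9000" -> scheme "http", host "localhost:9000"
        return {"scheme": parts.scheme, "endpoint_override": parts.netloc}
    # No "scheme://" prefix (e.g. "localhost:9000"): pass the value through.
    return {"endpoint_override": url}


def make_s3_filesystem(environ=None):
    """Create an S3FileSystem that honours AWS_ENDPOINT_URL (sketch)."""
    # Imported lazily so s3_kwargs_from_env works without pyarrow installed.
    from pyarrow import fs

    return fs.S3FileSystem(**s3_kwargs_from_env(environ))
```

With this in place, `make_s3_filesystem()` can stand in for the explicit `fs.S3FileSystem(endpoint_override=...)` call, and it falls back to the default endpoint when the variable is not set.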

What I would like to do in pyarrow:

import os
from pyarrow import fs

os.environ["AWS_ACCESS_KEY_ID"] = "ACCESS_KEY"
os.environ["AWS_SECRET_ACCESS_KEY"] = "SECRET_KEY"
os.environ["AWS_ENDPOINT_URL"] = "http://localhost:9000"

s3 = fs.S3FileSystem()
print(s3.get_file_info(fs.FileSelector("", recursive=False)))

This would allow me to use s3:// URIs directly instead of creating a filesystem object first:

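import os
import pyarrow.parquet as pq
from pyarrow import fs

# Current way: build the filesystem explicitly and pass it in
endpoint_url = os.environ["AWS_ENDPOINT_URL"]
s3 = fs.S3FileSystem(endpoint_override=endpoint_url)
file = pq.ParquetFile('mybucket/my_file.parquet', filesystem=s3)

# Possible future: the endpoint is resolved from AWS_ENDPOINT_URL
file = pq.ParquetFile('s3://mybucket/my_file.parquet')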
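
In the meantime, one way to keep using `s3://` URIs is to carry the endpoint in the URI itself. The sketch below assumes that Arrow's S3 URI parsing understands the `scheme` and `endpoint_override` query parameters, which should be checked against the installed version. It also assumes that AWS_ENDPOINT_URL holds a full URL including the scheme. `s3_uri_with_endpoint` is a hypothetical helper name.

```python
import os
from urllib.parse import urlencode, urlsplit


def s3_uri_with_endpoint(uri, environ=None):
    """Append endpoint query parameters taken from AWS_ENDPOINT_URL (sketch)."""
    environ = os.environ if environ is None else environ
    endpoint = environ.get("AWS_ENDPOINT_URL")
    if not endpoint:
        # Nothing to add: leave the URI untouched.
        return uri
    parts = urlsplit(endpoint)
    if not (parts.scheme and parts.netloc):
        raise ValueError("AWS_ENDPOINT_URL must look like 'http://host:port'")
    params = {"scheme": parts.scheme, "endpoint_override": parts.netloc}
    # Keep ':' unescaped so the host:port pair stays readable.
    query = urlencode(params, safe=":")
    separator = "&" if "?" in uri else "?"
    return uri + separator + query
```

The resulting string can then be handed to `fs.FileSystem.from_uri(...)`, which returns a filesystem and a path. Whether other pyarrow entry points accept such a URI directly depends on the version in use.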

Component(s)

Python
