Description
For remote stores, we currently support Azure, AWS, and GCP, which use the following URI schemes:
- AWS:
s3://<bucket>/path/to/table
- GCP:
gs://<bucket>/path/to/table
- Azure:
adls2://<account>/<container>/path/to/table
The main source of difference is that, to the best of my knowledge, the concept of an account does not exist for s3/gs. Essentially, bucket names are unique on their own (S3 additionally ties each bucket to a region), whereas container names only need to be unique within an account. However, regions also exist in Azure. On the other hand, the root of an object store is the bucket / container, and judging from how URLs / paths are constructed, bucket and container are more or less the same thing. It seems others (see adlfs) felt that the container is the appropriate top level in the path / URI, with the account (much like the region in S3) being configuration of the store.
Thus I propose to "drop" the account from our Azure paths. While this is certainly a major breaking change, my hope is that users will appreciate the consistency with e.g. fsspec. Given that we aim to integrate closely with (py)arrow, it seems to me that this would be more consistent on that level as well.
From an implementation standpoint, we are already picking up the account from configuration, so the path segment is effectively unused.
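To make the difference concrete, here is a minimal Python sketch of how the current and the proposed adls2 URIs would break down. The names TableLocation, parse_current and parse_proposed are hypothetical and only illustrate the idea; this is not the actual parsing code in this project, and it assumes the account would be supplied through store configuration rather than the URI.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class TableLocation(NamedTuple):
    """Hypothetical container for the parts of a table URI."""
    scheme: str
    root: str                      # bucket (s3/gs) or container (adls2)
    path: str                      # path inside the bucket / container
    account: Optional[str] = None  # only populated by the current adls2 form


def parse_current(uri: str) -> TableLocation:
    """Parse URIs in the current layout, where adls2 carries the account."""
    parsed = urlparse(uri)
    path = parsed.path.lstrip("/")
    if parsed.scheme == "adls2":
        # adls2://<account>/<container>/path/to/table
        container, _, rest = path.partition("/")
        return TableLocation(parsed.scheme, container, rest, account=parsed.netloc)
    # s3://<bucket>/path/to/table, gs://<bucket>/path/to/table
    return TableLocation(parsed.scheme, parsed.netloc, path)


def parse_proposed(uri: str) -> TableLocation:
    """Parse URIs in the proposed layout: every scheme is <scheme>://<root>/<path>."""
    parsed = urlparse(uri)
    return TableLocation(parsed.scheme, parsed.netloc, parsed.path.lstrip("/"))
```

With the proposed layout the Azure special case disappears, so all three schemes go through the same code path, and the account would have to come from configuration instead.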
As a side note - this would also be consistent with how object_store treats paths ...
cc @thovoll, @wjones127 @houqp
Use Case
Have a nicer user-facing API.
Related Issue(s)