Adding tolerant reader pattern feature #1564

@janwesterkamp

Bucket names follow a naming convention, which is good.
In some cases, however, object store implementations do not enforce this convention; Ceph, for example, offers a flag that relaxes it.
As a user, I would still like to be able to access such a (misconfigured) object store, at least with read access, meaning the reader should be tolerant of these names.

In Detail:

The naming convention for AWS S3 compatible APIs does not allow uppercase letters in bucket names:

https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html
https://docs.ceph.com/en/latest/radosgw/s3/bucketops/#constraints

To make such names possible in Ceph, a separate flag has to be set (see the note about the rgw_relaxed_s3_bucket_names flag in the link above). It allows names that violate this convention, e.g. longer names or uppercase letters in bucket names.
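
To illustrate the difference, here is a minimal sketch of a strict check next to a relaxed one. The class name `BucketNameRules` and both patterns are assumptions made for illustration. The strict pattern covers only part of the AWS rules (length, character set, first and last character, no adjacent periods). The relaxed pattern is loosely modeled on the Ceph note (letters, digits, periods, dashes and underscores, up to 255 characters). This is not the actual validation code of the MinIO SDK.

```java
// Hypothetical helper, for illustration only.
final class BucketNameRules {
    // Simplified strict rule: 3-63 characters, lowercase letters, digits,
    // '.' and '-', starting and ending with a letter or digit.
    private static final String STRICT = "[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]";

    // Assumed relaxed rule: 3-255 characters, letters of either case,
    // digits, '.', '-' and '_'.
    private static final String RELAXED = "[A-Za-z0-9._-]{3,255}";

    private BucketNameRules() {}

    static boolean isStrictlyValid(String name) {
        // String.matches() tests the whole string against the pattern.
        return name != null && name.matches(STRICT) && !name.contains("..");
    }

    static boolean isRelaxedValid(String name) {
        return name != null && name.matches(RELAXED);
    }
}
```

Under these assumptions, `isStrictlyValid("EODATA")` and `isStrictlyValid("DIAS")` return false, while `isRelaxedValid` accepts both names.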

In my concrete example, I would like to access the Copernicus Data Space Environment (CDSE) S3 Access service, which offers public access to earth observation data: https://documentation.dataspace.copernicus.eu/APIs/S3.html

Unfortunately, it looks like CloudFerro, the provider of this service, set the Ceph flag mentioned above and used uppercase letters in the bucket names, and it is unclear if or when this will be fixed.

The S3 Access API offers these publicly available buckets:

EODATA
DIAS

At the moment, the MinIO Java SDK and the MinIO Client (mc) are not able to access this object store instance (one of the biggest Ceph instances in the world, according to CloudFerro).
While there are alternatives on the CLI (s3cmd, AWS CLI) that already implement the tolerant reader pattern, for Java there is only the AWS Java SDK, as far as I know. It accepts the uppercase letters, but comes with other inconveniences, for example when accessing non-AWS resources.
Listing the buckets that violate the naming convention is possible with the MinIO tooling, but accessing the objects inside them is not.

So I would appreciate having the tolerant reader pattern implemented in the MinIO tooling, to get more lightweight access to data in environments like this.

A safe way to achieve this might be to keep the strict setting by default and to add a flag (tolerantReader) that has to be set in order to access data in buckets that violate the naming convention.
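
A rough sketch of how such an opt-in flag could behave, strict by default and tolerant only for read operations. Everything here is hypothetical: `ReaderOptions`, `checkBucketName` and the simplified strict pattern are made up for illustration and are not part of the MinIO Java SDK.

```java
// Hypothetical options holder, for illustration only.
final class ReaderOptions {
    // Hypothetical tolerantReader flag; false means strict behaviour.
    private final boolean tolerantReader;

    private ReaderOptions(boolean tolerantReader) {
        this.tolerantReader = tolerantReader;
    }

    // Default: bucket names must follow the naming convention.
    static ReaderOptions strict() {
        return new ReaderOptions(false);
    }

    // Opt-in: nonconforming bucket names are accepted for reads.
    static ReaderOptions tolerant() {
        return new ReaderOptions(true);
    }

    // Throws IllegalArgumentException if the name is rejected.
    void checkBucketName(String name, boolean isWrite) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("bucket name must not be empty");
        }
        // Simplified strict rule (assumption): 3-63 characters, lowercase
        // letters, digits, '.' and '-', no adjacent periods.
        boolean conforms = name.matches("[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
                && !name.contains("..");
        if (conforms) {
            return;
        }
        // Tolerance is limited to read access; writes stay strict.
        if (tolerantReader && !isWrite) {
            return;
        }
        throw new IllegalArgumentException(
                "bucket name violates the naming convention: " + name);
    }
}
```

With this sketch, `ReaderOptions.tolerant().checkBucketName("EODATA", false)` passes, while the same call on `ReaderOptions.strict()` throws. Limiting the tolerance to reads keeps the tooling from creating or writing to nonconforming buckets.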
