[Bug]: Protocol lost using partitioning mechanism whilst uploading to S3 #1750

@kane-menicou

What happened?

When attempting to upload to S3 with partitions, I am currently getting the error: Scheme "file://" is not supported by this protocol. Expected scheme is "aws-s3://". @norberttech mentioned that it 'seems that there might be a bug in partitioning mechanism when uploading to S3 (seems that it doesn't keep protocol)'.

Edit: Discord thread for context: https://discord.com/channels/1263821077558333451/1389631343305949336
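
To illustrate the suspected failure mode, here is a minimal plain-PHP sketch of rebuilding a partitioned destination while keeping the original scheme. This is not Flow's code: the helper name `partitioned_uri`, the Hive-style `name=value` directory layout, and the `file` fallback are all assumptions made for illustration. If the scheme were not carried over when the path is rebuilt, the result would fall back to a default such as file://, which would match the error above.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Flow): inserts Hive-style partition
 * directories before the file name while preserving the URI scheme.
 *
 * @param array<string, string> $partitions partition name => value
 */
function partitioned_uri(string $uri, array $partitions): string
{
    $pos = strpos($uri, '://');
    // Assumption: a URI without an explicit scheme is treated as local.
    $scheme = $pos === false ? 'file' : substr($uri, 0, $pos);
    $rest = $pos === false ? $uri : substr($uri, $pos + 3);

    // Keep the original directory (if any), then add one segment per partition.
    $dir = dirname($rest);
    $segments = $dir === '.' ? [] : [rtrim($dir, '/')];
    foreach ($partitions as $name => $value) {
        $segments[] = $name . '=' . $value;
    }
    $segments[] = basename($rest);

    // Re-attach the scheme taken from the input instead of a default one.
    return $scheme . '://' . implode('/', $segments);
}

echo partitioned_uri(
    'aws-s3://test.parquet',
    ['year' => '2000', 'month' => '01', 'day' => '13'],
), PHP_EOL;
// prints "aws-s3://year=2000/month=01/day=13/test.parquet"
```

The point of the sketch is only that the scheme is read from the input URI and written back unchanged; the actual partition layout produced by the library may differ.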

How to reproduce?

$config = config_builder()
    ->mount(
        aws_s3_filesystem(
            $this->bucketName,
            aws_s3_client(
                [
                    'region' => $this->region,
                    'accessKeyId' => $this->accessKeyId,
                    'accessKeySecret' => $this->accessKeySecret,
                ],
            ),
        ),
    );

$frame = data_frame($config)
    ->extract(from_array($dataAsArray))
    ->batchSize(10)
    ->mode(overwrite())
    ->withEntry(
        'year',
        ref('createdAt')->cast('date')->dateFormat('Y'),
    )
    ->withEntry(
        'month',
        ref('createdAt')->cast('date')->dateFormat('m'),
    )
    ->withEntry(
        'day',
        ref('createdAt')->cast('date')->dateFormat('d'),
    )
    ->partitionBy(
        ref('year'),
        ref('month'),
        ref('day'),
    )
    ->map(
        fn(Row $row): Row => $row->remove(ref('year'), ref('month'), ref('day')),
    )
    ->load(to_parquet(path("aws-s3://test.parquet")))
    ->run()
;

Data required to reproduce bug locally

[
    ['createdAt' => '2000-01-13'],
    ['createdAt' => '2012-01-13'],
]

Version

0.19.*

Relevant error output

In Protocol.php line 36:
                                                                                      
  Scheme "file://" is not supported by this protocol. Expected scheme is "aws-s3://"
