Closed
Labels
bug (Something isn't working)
Description
Describe the bug
I am trying to query Parquet files stored in S3 from the DataFusion CLI. Some files work, and some do not.
To Reproduce
DataFusion CLI v12.0.0
❯ create external table test stored as parquet location 's3://nyc-tlc/trip data/yellow_tripdata_2022-06.parquet';
ObjectStore(Generic { store: "S3", source: MissingLastModified })
However, if I download the file locally it works.
$ aws s3 cp "s3://nyc-tlc/trip data/yellow_tripdata_2022-06.parquet" /tmp/yellow_tripdata_2022-06.parquet
download: s3://nyc-tlc/trip data/yellow_tripdata_2022-06.parquet to ../../../../../../tmp/yellow_tripdata_2022-06.parquet
DataFusion CLI v12.0.0
❯ create external table test stored as parquet location '/tmp/yellow_tripdata_2022-06.parquet';
0 rows in set. Query took 0.006 seconds.
❯ select * from test limit 10;
+----------+----------------------+-----------------------+-----------------+---------------+------------+--------------------+--------------+--------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+----------------------+-------------+
| VendorID | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | RatecodeID | store_and_fwd_flag | PULocationID | DOLocationID | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_surcharge | total_amount | congestion_surcharge | airport_fee |
+----------+----------------------+-----------------------+-----------------+---------------+------------+--------------------+--------------+--------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+----------------------+-------------+
| 1 | 2022-06-01 00:25:41 | 2022-06-01 00:48:22 | 1 | 11 | 1 | N | 70 | 48 | 1 | 32 | 3 | 0.5 | 2 | 6.55 | 0.3 | 44.35 | 2.5 | 0 |
| 1 | 2022-06-01 00:44:40 | 2022-06-01 01:01:48 | 1 | 4.2 | 1 | N | 170 | 226 | 1 | 14 | 3 | 0.5 | 0 | 0 | 0.3 | 17.8 | 2.5 | 0 |
| 2 | 2022-06-01 00:23:07 | 2022-06-01 00:39:50 | 1 | 9.49 | 1 | N | 264 | 113 | 1 | 26 | 0.5 | 0.5 | 5 | 6.55 | 0.3 | 42.6 | 2.5 | 1.25 |
| 1 | 2022-06-01 00:25:53 | 2022-06-01 00:57:06 | 2 | 12.1 | 1 | N | 132 | 17 | 2 | 37 | 1.75 | 0.5 | 0 | 0 | 0.3 | 39.55 | 0 | 1.25 |
| 1 | 2022-06-01 00:23:58 | 2022-06-01 00:33:43 | 0 | 1.8 | 1 | N | 140 | 163 | 1 | 9 | 3 | 0.5 | 2.55 | 0 | 0.3 | 15.35 | 2.5 | 0 |
| 2 | 2022-06-01 00:01:27 | 2022-06-01 00:10:53 | 1 | 2.02 | 1 | N | 148 | 158 | 1 | 9 | 0.5 | 0.5 | 0.64 | 0 | 0.3 | 13.44 | 2.5 | 0 |
| 2 | 2022-06-01 00:16:25 | 2022-06-01 00:40:45 | 1 | 8.08 | 1 | N | 158 | 116 | 1 | 26.5 | 0.5 | 0.5 | 7.58 | 0 | 0.3 | 37.88 | 2.5 | 0 |
| 1 | 2022-06-01 00:11:08 | 2022-06-01 00:27:02 | 1 | 4.3 | 1 | N | 246 | 262 | 1 | 15 | 3 | 0.5 | 3.75 | 0 | 0.3 | 22.55 | 2.5 | 0 |
| 2 | 2022-06-01 00:21:42 | 2022-06-01 00:42:01 | 1 | 8.78 | 1 | N | 197 | 191 | 1 | 26.5 | 0.5 | 0.5 | 5.56 | 0 | 0.3 | 33.36 | 0 | 0 |
| 2 | 2022-06-01 00:23:05 | 2022-06-01 00:30:45 | 1 | 1.76 | 1 | N | 48 | 186 | 1 | 7.5 | 0.5 | 0.5 | 2.26 | 0 | 0.3 | 13.56 | 2.5 | 0 |
+----------+----------------------+-----------------------+-----------------+---------------+------------+--------------------+--------------+--------------+--------------+-------------+-------+---------+------------+--------------+-----------------------+--------------+----------------------+-------------+
10 rows in set. Query took 1.792 seconds.
Expected behavior
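Creating the external table from the S3 location should succeed, just as it does for the local copy of the same file.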
Additional context
None