-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Description
Is your feature request related to a problem or challenge?
There is a noticeable inconsistency between the syntax of COPY TO and CREATE EXTERNAL TABLE commands, particularly in how they handle the specification of data format and partitioning. The COPY TO command allows for these details to be included as part of a flexible options list, while CREATE EXTERNAL TABLE parses these details directly from the SQL statement.
Detailed Description:
-
COPY TO Command Syntax:
COPY (VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'c', 'z')) TO 'test_files/scratch/copy/partitioned_table2/' (FORMAT PARQUET, PARTITION_BY 'column2, column3');
In this command, the format and partitioning are specified as options within the command itself.
-
CREATE EXTERNAL TABLE Command Syntax:
CREATE EXTERNAL TABLE validate_partitioned_parquet STORED AS PARQUET LOCATION 'test_files/scratch/copy/partitioned_table1/' PARTITIONED BY (col2);
Here, the format and partitioning details are part of the SQL command's structure, without the flexibility offered by an options list.
This syntactical inconsistency creates a learning curve and potential for confusion, as users must adapt to two different methods for specifying critical information like data format and partitioning, depending on the command they are using.
Describe the solution you'd like
I propose that a unified syntax approach be adopted for both commands. This could mean revising COPY TO to align with the direct parsing method of CREATE EXTERNAL TABLE. Additionally, updating the parser to support this unified syntax and providing detailed documentation for the same would be beneficial.
Describe alternatives you've considered
No response
Additional context
No response