Error writing STRUCT to parquet in parallel: internal error: entered unreachable code: cannot downcast Int64 to byte array #8853

@alamb

Describe the bug

I can't write a struct column to Parquet when writing in parallel. Instead, I get the following error:

internal error: entered unreachable code: cannot downcast Int64 to byte array

To Reproduce

$ datafusion-cli
DataFusion CLI v34.0.0
❯ create table t as values (struct ('foo', 1)), (struct('bar', 2));
0 rows in set. Query took 0.004 seconds.

❯ select * from t;
+------------------+
| column1          |
+------------------+
| {c0: foo, c1: 1} |
| {c0: bar, c1: 2} |
+------------------+
2 rows in set. Query took 0.001 seconds.

❯ copy (select * from t) to '/tmp/foo.parquet';
thread 'tokio-runtime-worker' panicked at /Users/andrewlamb/.cargo/registry/src/index.crates.io-6f17d22bba15001f/parquet-49.0.0/src/arrow/arrow_writer/byte_array.rs:441:9:
internal error: entered unreachable code: cannot downcast Int64 to byte array
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
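
To check whether the parallel write path is what triggers the panic, the same COPY can be retried with single-file parallel Parquet writing turned off. This is a sketch, not a confirmed fix: it assumes the `datafusion.execution.parquet.allow_single_file_parallelism` setting is available in the version being run, and the output path `/tmp/foo_serial.parquet` is only an illustrative name.

```sql
-- Assumption: this setting exists in the running DataFusion version.
-- When false, a single Parquet file is not serialized in parallel.
set datafusion.execution.parquet.allow_single_file_parallelism = false;

-- Same statement as the failing one, with a hypothetical output path.
copy (select * from t) to '/tmp/foo_serial.parquet';
```

If the write succeeds with the setting disabled, that would point at the parallel writer's handling of nested columns rather than at Parquet STRUCT support in general.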

Expected behavior

I expect the parquet file to be written successfully. This works fine with JSON:

$ datafusion-cli
DataFusion CLI v34.0.0
❯ create table t as values (struct ('foo', 1)), (struct('bar', 2));
0 rows in set. Query took 0.003 seconds.

❯ select * from t;
+------------------+
| column1          |
+------------------+
| {c0: foo, c1: 1} |
| {c0: bar, c1: 2} |
+------------------+
2 rows in set. Query took 0.001 seconds.

❯ copy (select * from t) to '/tmp/foo.json';
+-------+
| count |
+-------+
| 2     |
+-------+
1 row in set. Query took 0.010 seconds.

❯ \q
$ cat /tmp/foo.json
{"column1":{"c0":"foo","c1":1}}
{"column1":{"c0":"bar","c1":2}}

Additional context

No response

Labels: bug, help wanted