GH-33500: [Python] add Table.to/from_struct_array #38520
jorisvandenbossche merged 12 commits into apache:main
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title to follow the project's title format?
@jorisvandenbossche I think this is ready for a review whenever you have the time.

@kou Do you know if there is someone else that I could request a review from? It'd be good to get this merged if possible.
    def test_table_from_struct_array():
        struct_array = pa.array(
            [{"ints": 1}, {"floats": 1.0}],
            type=pa.struct([("ints", pa.int32()), ("floats", pa.float32())]),
        )
        result = pa.Table.from_struct_array(struct_array)
        assert result.equals(pa.Table.from_arrays(
            [
                pa.array([1, None], type=pa.int32()),
                pa.array([None, 1.0], type=pa.float32()),
            ], ["ints", "floats"]
        ))
Can we add test_table_from_struct_array_chunked_array() like this?
    def test_table_to_struct_array():
        table = pa.Table.from_arrays(
            [
                pa.array([1, None], type=pa.int32()),
                pa.array([None, 1.0], type=pa.float32()),
            ], ["ints", "floats"]
        )
        result = table.to_struct_array()
        assert result.equals(pa.chunked_array(
            pa.array(
                [{"ints": 1}, {"floats": 1.0}],
                type=pa.struct([("ints", pa.int32()), ("floats", pa.float32())]),
            ),
        ))
Can we also add a test with max_chunksize?
AlenkaF left a comment
Thanks for adding this!
Just one nit comment from me, otherwise it looks good to me 👍
python/pyarrow/table.pxi (outdated)

    def to_struct_array(self, max_chunksize=None):
        """
        Convert to a struct array.
Maybe we can be more explicit that it will be a chunked array of struct type?
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Thanks @judahrand!

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 60b89ff. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 3 possible false positives for unstable benchmarks that are known to sometimes produce them.
### Rationale for this change

### What changes are included in this PR?

### Are these changes tested?

### Are there any user-facing changes?

* Closes: apache#33500

Lead-authored-by: Judah Rand <17158624+judahrand@users.noreply.github.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>