GH-43454: [C++][Python] Add Opaque canonical extension type#43458
GH-43454: [C++][Python] Add Opaque canonical extension type#43458lidavidm merged 3 commits intoapache:mainfrom
Conversation
|
|
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Copying my review comments from #41823 (review)
For the Python bindings, looks good as well, just a few minor comments:
- We probably want to add
OpaqueScalartopyarrow/__init__.pyas well? (although I see we forgot this for the tensor type, which I assume you have been imitating) - Can you add the three new classes to
test_extension_type_constructor_errorsintest_misc.py? - Can you list the new type in the python API docs (docs/source/python/api/datatypes.rst, docs/source/python/api/arrays.rst
)? See eg #39652 for a previous PR adding a new type where this is added to the docs.
|
I believe I've fixed everything here, thanks for the reviews! |
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Haven't looked in detail at the C++ side again, but the Python changes look good!
|
Any further comments here? |
None for me. I based some of my work on Bool8 on the implementation here, so ideally we could get another C++ reviewer to approve as a third party. |
There was a problem hiding this comment.
Perhaps also compare with an extension type that is not Opaque?
Co-authored-by: Weston Pace <weston.pace@gmail.com>
|
I'm going to merge this in a bit given the radio silence |
|
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 6e7125b. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 38 possible false positives for unstable benchmarks that are known to sometimes produce them. |
### Rationale for this change C++ and Python implementations of #43234 ### What changes are included in this PR? - Implement C++ `Bool8Type`, `Bool8Array`, `Bool8Scalar`, and tests - Implement Python bindings to C++, as well as zero-copy numpy conversion methods - TODO: docs waiting for rebase on #43458 ### Are these changes tested? Yes ### Are there any user-facing changes? Bool8 extension type will be available in C++ and Python libraries * GitHub Issue: #17682 Authored-by: Joel Lubinitsky <joellubi@gmail.com> Signed-off-by: Felipe Oliveira Carvalho <felipekde@gmail.com>
|
Apologies for the late question. Is opaque type intentionally not registered? arrow/cpp/src/arrow/extension_type.cc Lines 145 to 157 in f078942 If it's an oversight I can add registration in #37298. |
|
It was an oversight, thanks for spotting it |
Let's do that as a separate PR (it should also require to update the tests slightly, and then it doesn't depend on the timeline of merging the other PR) |
|
Opened #43787 to keep track of it. |
…3788) This is to resolve #43787 > The Opaque extension type implementation for C++ (plus python bindings) was added in #43458, but it was not registered by default, which we should do for canonical extension types (see #43458 (comment)) Additionally, this adds `bool8` extension type builds with `ARROW_JSON=false` as discussed [here](5258819#r145613657) ### Rationale for this change Canonical types should be registered by default if possible (except e.g. if they can't be compiled due to `ARROW_JSON=false`). ### What changes are included in this PR? This adds default registration for `opaque`, changes when `bool8` is built and moves all canonical tests under the same test target. ### Are these changes tested? Changes are tested by previously existing tests. ### Are there any user-facing changes? `opaue` will now be registered by default and `bool8` will be present in case `ARROW_JSON=false` at build time. * GitHub Issue: #43787 Authored-by: Rok Mihevc <rok@mihevc.org> Signed-off-by: Rok Mihevc <rok@mihevc.org>
…3788) This is to resolve #43787 > The Opaque extension type implementation for C++ (plus python bindings) was added in apache/arrow#43458, but it was not registered by default, which we should do for canonical extension types (see apache/arrow#43458 (comment)) Additionally, this adds `bool8` extension type builds with `ARROW_JSON=false` as discussed [here](apache/arrow@441ecb0#r145613657) ### Rationale for this change Canonical types should be registered by default if possible (except e.g. if they can't be compiled due to `ARROW_JSON=false`). ### What changes are included in this PR? This adds default registration for `opaque`, changes when `bool8` is built and moves all canonical tests under the same test target. ### Are these changes tested? Changes are tested by previously existing tests. ### Are there any user-facing changes? `opaue` will now be registered by default and `bool8` will be present in case `ARROW_JSON=false` at build time. * GitHub Issue: #43787 Authored-by: Rok Mihevc <rok@mihevc.org> Signed-off-by: Rok Mihevc <rok@mihevc.org>
Rationale for this change
Add the newly ratified extension type.
What changes are included in this PR?
The C++/Python implementation only.
Are these changes tested?
Yes
Are there any user-facing changes?
No.