-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-12060: [Python] Enable calling compute functions on Expressions #11918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-12060: [Python] Enable calling compute functions on Expressions #11918
Conversation
|
|
|
This makes the following already possible (to use in filter or projection expression in dataset): |
9b8a103 to
3b4b0fc
Compare
python/pyarrow/_compute.pyx
Outdated
| c_arguments.push_back((<Expression> argument).expr) | ||
|
|
||
| # if options is not None: | ||
| # c_options = make_shared[CFunctionOptions](options.get_options()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given a python FunctionOptions class object, I need to get a shared pointer to the C++ object (get_options() returns a const CFunctionOptions*, but CmakeCallExpressions needs a shared_ptr[CFunctionOptions]).
But I haven't directly found how to do that conversion (cc @pitrou)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should probably replace the unique_ptr in the FunctionOptions declaration with a shared_ptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DeserializeFunctionOptions in C++ (in function_internal.h, and several other methods there) returns a unique_ptr(from #10511). Can that be changed there as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can, but I don't think it's necessary on the C++ side. On the Python side, you can convert the unique_ptr into a shared_ptr (by moving it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's what I did in the meantime. The only difficulty I ran into is that moving a uniqe_ptr to shared_ptr doesn't seem to work in cython, but with our helper to_shared it worked.
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat!
python/pyarrow/_compute.pyx
Outdated
| cdef: | ||
| vector[CExpression] c_arguments | ||
| shared_ptr[CFunctionOptions] c_options=( | ||
| <shared_ptr[CFunctionOptions]> nullptr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to explicitly initialize this, do you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, passing an unitialized c_options to CMakeCallExpression seems to work as well
|
Benchmark runs are scheduled for baseline = 18bedd3 and contender = 17bf54a. 17bf54a is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
No description provided.