-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-9967: [Python] Add compute module docs + expose more option classes #8145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? Then could you also rename pull request title in the following format? See also: |
edb57cc to
3f99ae8
Compare
docs/source/cpp/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed that this was missing on the c++ page
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thank you! However, can you leave the table alphabetically-sorted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually my bad, you already had it there. Reverted
docs/source/python/api/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not really a transform?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fair. Would Structural Transforms be a good place for it?
docs/source/python/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this array is not used? (maybe show pc.equals(a, b) ?)
e79559c to
d117ed1
Compare
python/pyarrow/compute.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we can't have methods named and and or in Python. For now I went with and_ and or_ here, for consistency with and_kleene and or_kleene, but for example numpy uses np.logical_and and np.logical_or
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a function named "and":
>>> from pyarrow import compute
>>> getattr(compute, "and")
<function pyarrow.compute.and(left, right, *, memory_pool=None)>So you should just fix compute.py so that it's exported as and_.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Fixed now
docs/source/python/api/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this render the docstrings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. It's what's done on the other pages:
https://github.com/apache/arrow/blob/master/docs/source/python/api/datatypes.rst
I'm having trouble building the docs locally (related to ARROW-10018 I think) so haven't been able to check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't render the docstring here, but it will create a separate page for each of those functions and link to that from this table.
(the separate pages might be a bit overkill since the docstrings here are often not that informative yet, but it's indeed how we do it for other functions as well)
c2445c1 to
c1c5846
Compare
|
Question: in 952649d I'm attempting to expose This is needed to make the |
|
Cython generally doesn't convert implicitly between arbitrary Python objects and C/C++ types, you have to type your code explicitly. Also, it is better to use def __cinit__(self, int64_t pivot):
self.partition_nth_options.reset(new CPartitionNthOptions(pivot)) |
Thanks @pitrou! That was the problem |
|
This is ready for re-review. I believe that I've addressed the feedback from previous reviews. I've also now exposed all the option classes so that all the kernels listed in the docs are usable from Python. |
docs/source/python/compute.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jorisvandenbossche Will this work? I'm not 100% sure as I haven't yet compiled the docs to explicitly check
python/pyarrow/_compute.pyx
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why unique_ptr is used in some of the other classes, but it shouldn't be necessary. See MinMaxOptions for an example without.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PartitionNthOptions we need it because it doesn't have a null constructor so it can't be allocated on the stack. That's not the case for VarianceOptions, though, so I'll take a look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've rewritten VarianceOptions without unique_ptr.
I think that with the other three options classes exposed in this PR (SetLookupOptions, PartitionNthOptions and StrptimeOptions) it's necessary to use unique_ptr or, alternatively, we would need to define default constructors
|
@github-actions crossbow submit test-ubuntu-18.04-docs |
|
Revision: 73272bf38f2d8406b60a6ebaa20c5f4e58dff075 Submitted crossbow builds: ursa-labs/crossbow @ actions-615
|
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Will merge after a couple changes. Thank you @arw2019 !
|
CI failures are unrelated. |
|
Thanks @pitrou @jorisvandenbossche for reviewing! |
#8163 exposes
pyarrow.computekernels and generates their docstrings. This PR adds documentation for the module in the User Guide and the Python API reference.