Proposed correction for category_values() #1476
|
Hi @AlenkaF! Thanks for catching this! I had a quick look and it seems like there are a few things wrong here (around the I have to say - I am not sure what the I also noticed that
You are welcome to have a look at any of these points! |
|
Hi @JovanVeljanoski! Thank you for your quick reply.
In the case of a categorical column it makes sense IMO to have
I would be happy to try and fix One more question: |
|
Ah yeah, I forgot to mention.. all the bugs you have found should be exposed as unit-tests before they are fixed :) (so we can develop forward without fear). There is a method to
Regarding this, I'd like the opinion of @maartenbreddels since I am not 100% sure how or if these methods are used internally. Of course you can check this yourself (I'm worried about breaking changes). Yeah, if you keep to this topic around categorization, I think a single PR is fine; all these things are related. |
Oh, sorry about that. Donno how I missed it. Will do.
If I understand correctly
I will check for internal use of the method and see if I find anything. Thank you for your help! |
|
In vaex (unlike in, say, pandas and other dataframe libraries), an expression (or a column) is always coupled to a particular dataframe. You can't have "free floating" expressions that are independent of any dataframe. For example this code works:
import vaex
df = vaex.example()
x = df.x
print(x.df.is_category('x'))
This is useful if, for example, you'd like to have a function that only takes expression(s) as an argument and you don't want to pass the dataframe around explicitly. Feel free to ask any questions! (here or in slack). |
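To illustrate the coupling described above without depending on an installed vaex, here is a minimal pure-Python sketch. `ToyDataFrame`, `ToyExpression` and `is_categorical` are hypothetical names made up for this example and are not vaex classes; only the idea that an expression keeps a reference to its dataframe comes from the comment above.

```python
class ToyExpression:
    """Hypothetical stand-in for an expression; NOT a vaex class."""

    def __init__(self, df, name):
        self.df = df      # back-reference to the owning dataframe
        self.name = name  # column name this expression refers to


class ToyDataFrame:
    """Hypothetical stand-in for a dataframe; NOT a vaex class."""

    def __init__(self, columns, categorical=()):
        self.columns = dict(columns)
        self._categorical = set(categorical)

    def is_category(self, name):
        # True if the column was registered as categorical.
        return name in self._categorical

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so df.x
        # can hand out an expression bound to this dataframe.
        columns = self.__dict__.get("columns", {})
        if name in columns:
            return ToyExpression(self, name)
        raise AttributeError(name)


def is_categorical(expression):
    # The function takes only the expression; the dataframe is
    # reached through the expression's back-reference.
    return expression.df.is_category(expression.name)


df = ToyDataFrame({"x": [1.0, 2.0], "kind": [0, 1]}, categorical=["kind"])
print(is_categorical(df.kind), is_categorical(df.x))
```

Because every expression carries its dataframe, a helper like `is_categorical` does not need a second `df` argument.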
|
Deleted the branch by accident and the PR got closed. Reopening it now. |
369423b to 6927b35
This PR is a proposed correction for the category_values() function on the dataframe class. When using df.category_values(column) an error occurs while accessing a list item that doesn't exist (list _category, item 'values').
gives the list as an output
and calling the category_values() function results in an error.
I suggest using self.columns[column] instead of self._categories[column]['values'] in the category_values function.
Versions used: Python 3.9.5, vaex-core: 4.3.0.post1, vaex-viz: 0.5.0, vaex-hdf5: 0.8.0, vaex-server: 0.5.0, vaex-astro: 0.8.2, vaex-jupyter: 0.6.0, vaex-ml: 0.12.0
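The failure mode and the suggested change can be sketched with a toy model. This is not vaex's implementation: `ToyFrame`, the contents of `_categories` (a per-column dict with a `'labels'` entry but no `'values'` entry) and the `KeyError` are assumptions chosen to mirror the description above. Only the two lookups, `self._categories[column]['values']` and `self.columns[column]`, are taken from the PR text.

```python
class ToyFrame:
    """Hypothetical stand-in for the dataframe class; NOT vaex code."""

    def __init__(self, columns, categories):
        self.columns = columns          # column name -> stored data
        self._categories = categories   # column name -> category metadata

    def category_values_old(self, column):
        # Lookup described as failing: the 'values' item does not exist.
        return self._categories[column]["values"]

    def category_values_proposed(self, column):
        # Lookup suggested in this PR: read from the stored column.
        return self.columns[column]


frame = ToyFrame(
    columns={"color": [0, 1, 2, 0]},
    # Assumed layout: metadata without a 'values' entry.
    categories={"color": {"labels": ["red", "green", "blue"]}},
)

try:
    frame.category_values_old("color")
    old_lookup_failed = False
except KeyError:
    old_lookup_failed = True

proposed = frame.category_values_proposed("color")
print(old_lookup_failed, proposed)
```

A check like this could also serve as the starting point for a unit test that exposes the bug before the fix, as requested in the review comments.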