fix: copy sparse matrices #615

Merged
maartenbreddels merged 1 commit into master from fix_sparse_copy on Feb 25, 2020
@maartenbreddels
Member

Closes #556

@xdssio
Collaborator

xdssio commented Feb 26, 2020

This is the use case I am looking at:

This example fails because some of the tokens are the words "is" and "try", which cannot be used as column names.

from sklearn.feature_extraction.text import TfidfVectorizer
import vaex
ds = vaex.from_arrays(x=['I try this','but this is hard','I should buy a cat'])
tdfidf = TfidfVectorizer()
s = tdfidf.fit_transform(ds.x)
ds.add_columns(tdfidf.get_feature_names(), s)

Another example, which also crashes because column names can't start with a digit:

ds = vaex.from_arrays(x=['I try this','this 1and and 2and','I should buy a cat'])
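As a possible workaround on the user side, the feature names could be sanitized before they are used as column names. The following is only a minimal sketch: `make_valid_column_names` and its `prefix` parameter are hypothetical names made up for illustration and are not part of vaex or scikit-learn. It relies only on the standard library (`keyword.iskeyword` and `str.isidentifier`) and does not try to cover every Unicode edge case.

```python
import keyword


def make_valid_column_names(names, prefix="col_"):
    """Hypothetical helper: map tokens to unique names usable as identifiers."""
    seen = set()
    result = []
    for name in names:
        # Replace characters that cannot appear in an identifier.
        candidate = "".join(ch if (ch.isalnum() or ch == "_") else "_" for ch in name)
        # Keywords ("is", "try") and names starting with a digit ("1and")
        # are not valid identifiers, so they get a prefix.
        if not candidate or keyword.iskeyword(candidate) or not candidate.isidentifier():
            candidate = prefix + candidate
        # Keep names unique by appending a numeric suffix on collisions.
        unique = candidate
        suffix = 1
        while unique in seen:
            unique = f"{candidate}_{suffix}"
            suffix += 1
        seen.add(unique)
        result.append(unique)
    return result


features = ["try", "is", "1and", "cat"]
safe_names = make_valid_column_names(features)
print(safe_names)  # ['col_try', 'col_is', 'col_1and', 'cat']
```

The sanitized list could then be passed in place of the raw feature names in the `ds.add_columns(...)` call from the first example.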

@maartenbreddels
Member Author

Yes, those are new bugs; I have fixes for them.

ds = vaex.from_arrays(x=['I try this','this 1and and 2and','I should buy a cat'])

This just works, right?

@maartenbreddels maartenbreddels deleted the fix_sparse_copy branch May 11, 2020 09:12

Development

Successfully merging this pull request may close these issues.

Support sparse data
