
[MRG] Fix 'SparseSeries deprecated: scipy-dev failing on travis' #14002 #14005

Closed
wenhaoz-fengcai wants to merge 3 commits into scikit-learn:master from wenhaoz-fengcai:fix_SparseSeries_deprecated

Conversation

@wenhaoz-fengcai

Reference Issues/PRs

Fixes #14002
Issue: SparseSeries deprecated: scipy-dev failing on travis

What does this implement/fix? Explain your changes.

Use a Series with sparse values instead of SparseSeries.

Any other comments?

@wenhaoz-fengcai wenhaoz-fengcai changed the title Fix 'SparseSeries deprecated: scipy-dev failing on travis' #14002 [WIP]Fix 'SparseSeries deprecated: scipy-dev failing on travis' #14002 May 31, 2019
@wenhaoz-fengcai wenhaoz-fengcai changed the title [WIP]Fix 'SparseSeries deprecated: scipy-dev failing on travis' #14002 [MRG] Fix 'SparseSeries deprecated: scipy-dev failing on travis' #14002 Jun 1, 2019
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
@wenhaoz-fengcai
Author

I'm not sure why codecov/patch failed on this commit

@glemaitre
Member

> I'm not sure why codecov/patch failed on this commit

The build that uses pandas is failing on Azure. You should check whether the new code changes behaviour (maybe we need to change the error message). The codecov failure is a consequence of the Azure failure.

@thomasjpfan
Member

We originally did not support pd.SparseArray because of #7352 (comment). But it looks like it's been fixed in pandas (pandas-dev/pandas#22325), and the original issue with pd.SparseSeries is gone:

import pandas as pd
import numpy as np

pd.__version__
# 0.24.2

ss1 = pd.Series(pd.SparseArray([1, 0, 2, 1, 0]))
ss2 = pd.SparseSeries([1, 0, 2, 1, 0])

np.asarray(ss1)
# array([1, 0, 2, 1, 0])

np.asarray(ss2)
# array([1, 0, 2, 1, 0])

This was fixed in pandas version 0.24.

@wenhaoz-fengcai
Author

Ok, I’ll close this PR

@rth
Member

rth commented Jun 11, 2019

Cron is still failing on master. I think this should be re-opened if only to ignore the future warning in test_type_of_target.
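
For reference, ignoring a FutureWarning inside a single test could look roughly like the sketch below. It uses only the standard-library `warnings` module; `build_legacy_sparse_input` is a hypothetical stand-in for code that constructs a deprecated `SparseSeries`, not the actual body of test_type_of_target.

```python
import warnings


def build_legacy_sparse_input():
    # Hypothetical stand-in for constructing a deprecated pd.SparseSeries:
    # it emits a FutureWarning and returns some label data.
    warnings.warn("SparseSeries is deprecated", FutureWarning)
    return [1, 0, 0, 1]


def test_legacy_input_without_warning_noise():
    # Suppress only FutureWarning, and only inside this block, so other
    # warning categories still surface in the rest of the test suite.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", FutureWarning)
        y = build_legacy_sparse_input()
    assert y == [1, 0, 0, 1]
```

With pytest, the `@pytest.mark.filterwarnings("ignore::FutureWarning")` decorator is another way to get a similar effect for a whole test function.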

@jnothman jnothman reopened this Jun 13, 2019
@thomasjpfan
Member

thomasjpfan commented Jun 13, 2019

We can support pandas sparse arrays as of pandas 0.24. This means type_of_target does not need to raise an error on sparse arrays for pandas >= 0.24. But technically we still need to raise for pandas <= 0.23. One way to do this is to check the pandas version and raise accordingly.
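
A minimal sketch of such a version-gated check, under some assumptions: `parse_major_minor` and `check_sparse_target` are hypothetical helper names, the error message is illustrative, and the pandas version is passed in as a string (for example `pd.__version__`) so that the sketch does not need pandas installed. Sparse inputs are detected by class name only.

```python
import re


def parse_major_minor(version):
    # Extract (major, minor) from a version string such as "0.24.2"
    # or "0.25.0.dev0+123". Hypothetical helper, not a scikit-learn API.
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


def check_sparse_target(y, pandas_version):
    # Raise only for pandas <= 0.23, where converting SparseSeries or
    # SparseArray to a dense array could silently drop the fill values.
    is_sparse_pandas = type(y).__name__ in ("SparseSeries", "SparseArray")
    if is_sparse_pandas and parse_major_minor(pandas_version) < (0, 24):
        raise ValueError("y cannot be class 'SparseSeries' or 'SparseArray'")
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings directly, where `"0.9" > "0.24"` lexicographically.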

@jorisvandenbossche
Member

@thomasjpfan be careful with the example, because the default fill value in pandas is np.nan and not 0 (for better or worse ...). So the correct example would be with nans (or by specifying 0 as the fill value):

with pandas 0.22

a = pd.SparseArray([1, np.nan, 2, 1, np.nan])

np.array(a)
# array([1., 2., 1.])

np.array(pd.SparseSeries(a))
# array([1., 2., 1.])

np.array(pd.Series(a))
# array([ 1., nan,  2.,  1., nan])

with pandas 0.24

np.array(a)
# array([ 1., nan,  2.,  1., nan])

np.array(pd.SparseSeries(a))
# array([ 1., nan,  2.,  1., nan])

np.array(pd.Series(a))
# array([ 1., nan,  2.,  1., nan])

(so apparently even before 0.24, a Series (not SparseSeries) had the correct behaviour)
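
To illustrate why the fill value matters here, the following is a toy model in plain Python. It is not how pandas implements sparse arrays; `ToySparse` and its methods are hypothetical names. It stores only the entries that differ from the fill value, so returning just the stored entries (roughly what the pandas 0.22 output above looks like) shortens the result, while a proper densification restores the fill values.

```python
import math


class ToySparse:
    """Toy sparse container: keeps only values that differ from fill_value."""

    def __init__(self, data, fill_value=float("nan")):
        self.length = len(data)
        self.fill_value = fill_value
        self.positions = [i for i, v in enumerate(data) if not self._is_fill(v)]
        self.stored = [data[i] for i in self.positions]

    def _is_fill(self, value):
        # NaN never compares equal to itself, so it needs a dedicated check.
        if isinstance(self.fill_value, float) and math.isnan(self.fill_value):
            return isinstance(value, float) and math.isnan(value)
        return value == self.fill_value

    def stored_only(self):
        # Lossy conversion: the fill values are dropped, so the length changes.
        return list(self.stored)

    def to_dense(self):
        # Correct conversion: fill values are put back at their positions.
        dense = [self.fill_value] * self.length
        for pos, value in zip(self.positions, self.stored):
            dense[pos] = value
        return dense


nan = float("nan")
with_nan_fill = ToySparse([1.0, nan, 2.0, 1.0, nan])          # default NaN fill
zeros_default_fill = ToySparse([1, 0, 2, 1, 0])               # zeros are stored
zeros_zero_fill = ToySparse([1, 0, 2, 1, 0], fill_value=0)    # zeros are fill
```

In this model, `zeros_default_fill.stored_only()` still has five elements because 0 is not the fill value, which is why an example built from zeros does not expose the problem unless 0 is given as the fill value.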

@jorisvandenbossche
Member

I suppose the original check for SparseSeries was there to give a more informative error message (I can imagine that it would have been confusing if the y labels suddenly came back with a different length). If that is the case, I would indeed keep the check as is but only apply it for pandas <= 0.23, as @thomasjpfan suggests.
