Closed
Description
`Series.corr` seems to break with unknown divisions (although perhaps this is appropriate):
>>> df.tip_amount.corr(df.payment_type == 2).compute()
/opt/anaconda/lib/python2.7/site-packages/dask/dataframe/core.pyc in corr(self, other, method, min_periods)
1235 raise NotImplementedError("Only Pearson correlation has been "
1236 "implemented")
-> 1237 df = concat([self, other], axis=1)
1238 return cov_corr(df, min_periods, corr=True, scalar=True)
1239
/opt/anaconda/lib/python2.7/site-packages/dask/dataframe/multi.pyc in concat(dfs, axis, join, interleave_partitions)
552 else:
553 if axis == 1:
--> 554 raise ValueError('Unable to concatenate DataFrame with unknown '
555 'division specifying axis=1')
556 else:
ValueError: Unable to concatenate DataFrame with unknown division specifying axis=1

Also I seem to be getting wrong results:
>>> df[['tip_amount', 'payment_type']].corr().compute()
              tip_amount  payment_type
tip_amount      0.999169     -0.033852
payment_type   -0.033852      8.420302
>>> df.head(1000)[['tip_amount', 'payment_type']].corr()
              tip_amount  payment_type
tip_amount      1.000000     -0.559584
payment_type   -0.559584      1.000000
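Note that the two results above are not directly comparable (one covers the full dataset, the other only the first 1000 rows), but the diagonal of a Pearson correlation matrix should be 1 in both cases, so 0.999169 and 8.420302 look wrong regardless. As a sanity check, here is a minimal plain-Python sketch of how a correlation can be computed per chunk and then combined. It is not dask's actual implementation; the helper names (`chunk_stats`, `combine`, `pearson`) and the sample values are made up for illustration.

```python
import math
from typing import Iterable, Sequence, Tuple

# Sufficient statistics for one chunk: (n, sum_x, sum_y, sum_xx, sum_yy, sum_xy)
Stats = Tuple[float, float, float, float, float, float]


def chunk_stats(xs: Sequence[float], ys: Sequence[float]) -> Stats:
    """Reduce one chunk (a stand-in for a partition) to its sums."""
    return (
        float(len(xs)),
        float(sum(xs)),
        float(sum(ys)),
        float(sum(x * x for x in xs)),
        float(sum(y * y for y in ys)),
        float(sum(x * y for x, y in zip(xs, ys))),
    )


def combine(stats: Iterable[Stats]) -> Stats:
    """Add the per-chunk sums element-wise."""
    total = [0.0] * 6
    for s in stats:
        for i, value in enumerate(s):
            total[i] += value
    return (total[0], total[1], total[2], total[3], total[4], total[5])


def pearson(stats: Stats) -> float:
    """Pearson correlation from the combined sums."""
    n, sx, sy, sxx, syy, sxy = stats
    cov = n * sxy - sx * sy
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    return cov / math.sqrt(var_x * var_y)


# Made-up sample values, NOT the data from this issue.
tips = [0.0, 1.5, 2.0, 0.0, 3.25, 0.0, 4.0, 1.0]
ptype = [2.0, 1.0, 1.0, 2.0, 1.0, 2.0, 1.0, 1.0]

# Uneven chunk boundaries, to mimic partitions of different sizes.
bounds = [(0, 3), (3, 4), (4, 8)]

r_whole = pearson(chunk_stats(tips, ptype))
r_chunked = pearson(combine(chunk_stats(tips[a:b], ptype[a:b]) for a, b in bounds))
r_self = pearson(combine(chunk_stats(tips[a:b], tips[a:b]) for a, b in bounds))

print(r_whole, r_chunked, r_self)
```

With this approach the chunked result agrees with the single-pass result and a column's correlation with itself comes out as 1, which is the property the `.compute()` output above violates.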