dataframe support for dot, cov and corr #1000

@DSLituiev

Running the matrix operations named in the title (dot, cov, corr) on a dask.dataframe of floats raises errors. For example, calling dot gives:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/home/dima/data/external/pandas/pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3869)()
    156         try:
--> 157             return self.mapping.get_item(val)
    158         except TypeError:

/home/dima/data/external/pandas/pandas/hashtable.pyx in pandas.hashtable.Int64HashTable.get_item (pandas/hashtable.c:6503)()
    302 
--> 303     cpdef get_item(self, int64_t val):
    304         cdef khiter_t k

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.4/dist-packages/dask/dataframe/core.py in __getattr__(self, key)
   1309         try:
-> 1310             return self[key]
   1311         except KeyError as e:

/usr/local/lib/python3.4/dist-packages/dask/dataframe/core.py in __getitem__(self, key)
   1274             # error is raised from pandas
-> 1275             dummy = self._pd[_extract_pd(key)]
   1276 

/home/dima/data/external/pandas/pandas/core/frame.py in __getitem__(self, key)
   1979         else:
-> 1980             return self._getitem_column(key)
   1981 

/home/dima/data/external/pandas/pandas/core/frame.py in _getitem_column(self, key)
   1986         if self.columns.is_unique:
-> 1987             return self._get_item_cache(key)
   1988 

/home/dima/data/external/pandas/pandas/core/generic.py in _get_item_cache(self, item)
   1090         if res is None:
-> 1091             values = self._data.get(item)
   1092             res = self._box_item_values(item, values)

/home/dima/data/external/pandas/pandas/core/internals.py in get(self, item, fastpath)
   3193             if not isnull(item):
-> 3194                 loc = self.items.get_loc(item)
   3195             else:

/home/dima/data/external/pandas/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1863             key = _values_from_object(key)
-> 1864             return self._engine.get_loc(key)
   1865 

/home/dima/data/external/pandas/pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4005)()
    136 
--> 137     cpdef get_loc(self, object val):
    138         if is_definitely_invalid_key(val):

/home/dima/data/external/pandas/pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3934)()
    158         except TypeError:
--> 159             raise KeyError(val)
    160 

KeyError: 'dot'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-59-2983061be677> in <module>()
----> 1 gtcov = gtdf.dot( gtdf.T)

/usr/local/lib/python3.4/dist-packages/dask/dataframe/core.py in __getattr__(self, key)
   1310             return self[key]
   1311         except KeyError as e:
-> 1312             raise AttributeError(e)
   1313 
   1314     def __dir__(self):

AttributeError: 'dot'
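
For cov in particular, a possible stopgap is to accumulate per-partition sums and combine them at the end. The sketch below uses plain pandas frames as stand-ins for partitions. `chunked_cov` is a hypothetical helper name, not part of dask or pandas, and the one-pass formula it uses can lose precision when the column means are large relative to the spread.

```python
import numpy as np
import pandas as pd


def chunked_cov(partitions):
    """Sample covariance of the columns, accumulated one partition at a time.

    `partitions` is assumed to be an iterable of pandas DataFrames that
    share the same float columns (a stand-in for dask partitions).
    """
    n = 0
    total = None    # running column sums
    cross = None    # running X^T X
    columns = None
    for part in partitions:
        x = part.to_numpy(dtype=float)
        n += x.shape[0]
        s = x.sum(axis=0)
        c = x.T @ x
        total = s if total is None else total + s
        cross = c if cross is None else cross + c
        columns = part.columns
    mean = total / n
    # cov = (X^T X - n * mean mean^T) / (n - 1), matching pandas' ddof=1
    cov = (cross - n * np.outer(mean, mean)) / (n - 1)
    return pd.DataFrame(cov, index=columns, columns=columns)


# Usage: split an in-memory frame into four chunks and compare with pandas.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=list("abc"))
parts = [df.iloc[i:i + 25] for i in range(0, 100, 25)]
result = chunked_cov(parts)
```

The same accumulators would give corr by dividing by the outer product of the column standard deviations. Note that this does not cover the gtdf.dot(gtdf.T) call in the traceback, which is a row-by-row product and does not reduce to per-partition sums in this way.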
