[DataFrame] ValueError while indexing a dataframe #1826
Description
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Sierra (10.12.6)
- Ray installed from (source or binary): source
- Ray version: 0.4.0
- Python version: Python 3.6.4
- Exact command to reproduce:
noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]
Describe the problem
Indexing a Ray DataFrame with a boolean mask raises a ValueError, whereas the same command works with pandas.
For reference: noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]
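For comparison, here is a minimal sketch of the same boolean-mask selection in plain pandas. The four-row frame and its `Borough` column are made-up stand-ins for the notebook's data, which is not reproduced here.

```python
import pandas as pd

# Hypothetical toy data; the real notebook loads a much larger CSV.
complaints = pd.DataFrame({
    "Complaint Type": [
        "Noise - Street/Sidewalk",
        "Illegal Parking",
        "Noise - Street/Sidewalk",
        "Blocked Driveway",
    ],
    "Borough": ["MANHATTAN", "BROOKLYN", "QUEENS", "BRONX"],
})

# Build a boolean Series, then use it to select the matching rows.
mask = complaints["Complaint Type"] == "Noise - Street/Sidewalk"
noise_complaints = complaints[mask]

# The result keeps only the matching rows and their original index labels.
print(len(noise_complaints))
print(list(noise_complaints.index))
```

With pandas, the result's index shrinks along with the rows, so no length mismatch can occur.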
Source code / logs
To reproduce the error, follow these steps:
- Open this notebook: http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.1/cookbook/Chapter%203%20-%20Which%20borough%20has%20the%20most%20noise%20complaints%3F%20%28or%2C%20more%20selecting%20data%29.ipynb
- Replace pandas with ray in Cell 1
- Execute Cell 4
ValueError Traceback (most recent call last)
in ()
----> 1 noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]
/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in __getitem__(self, key)
2744 """
2745 result_column_chunks = self._map_partitions(
-> 2746 lambda df: df.__getitem__(key))
2747 return to_pandas(result_column_chunks)
2748
/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in _map_partitions(self, func, index)
234 index = self.index
235
--> 236 return DataFrame(new_df, self.columns, index=index)
237
238 def _update_inplace(self, df=None, columns=None, index=None):
/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in __init__(self, df, columns, index)
51
52 if index is not None:
---> 53 self.index = index
54
55 def __str__(self):
/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in _set_index(self, new_index)
82 new_index: The new index to set this
83 """
---> 84 self._index.index = new_index
85
86 index = property(_get_index, _set_index)
/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in __setattr__(self, name, value)
3625 try:
3626 object.__getattribute__(self, name)
-> 3627 return object.__setattr__(self, name, value)
3628 except AttributeError:
3629 pass
pandas/_libs/properties.pyx in pandas._libs.properties.AxisProperty.__set__()
/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
557
558 def _set_axis(self, axis, labels):
--> 559 self._data.set_axis(axis, labels)
560 self._clear_item_cache()
561
/usr/local/lib/python3.6/site-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
3072 raise ValueError('Length mismatch: Expected axis has %d elements, '
3073 'new values have %d elements' %
-> 3074 (old_len, new_len))
3075
3076 self.axes[axis] = new_labels
ValueError: Length mismatch: Expected axis has 1380 elements, new values have 111069 elements
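The last frame shows pandas rejecting an index assignment because the number of new labels (111069) does not match the length of the existing axis (1380). The traceback does not show how Ray arrives at those two lengths, so the sketch below only reproduces the pandas-level error. It assumes an unfiltered index is assigned to a shorter, filtered frame, and the names `full` and `filtered` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins: a 4-row frame and a filtered 2-row view of it.
full = pd.DataFrame({"x": range(4)})
filtered = full[full["x"] % 2 == 0]

message = None
try:
    # Assigning 4 labels to a 2-row frame fails inside pandas' set_axis,
    # the same call that raises at the bottom of the traceback.
    filtered.index = full.index
except ValueError as exc:
    message = str(exc)

print(message)
```

If this assumption matches what happens in `_map_partitions`, the index passed to the new DataFrame would need to be recomputed from the filtered partitions rather than reused from the original frame.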