[DataFrame] ValueError while indexing a dataframe #1826

@tanaysd

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Sierra (10.12.6)
  • Ray installed from (source or binary): source
  • Ray version: 0.4.0
  • Python version: Python 3.6.4
  • Exact command to reproduce:
    noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]

Describe the problem

Indexing a Ray DataFrame with a boolean mask raises a ValueError, whereas the same command works with pandas.

For reference, the failing command is: noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]
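
For comparison, here is a minimal sketch of the same boolean-mask selection in plain pandas. The small `complaints` frame is a hypothetical stand-in for the 311 data loaded in the notebook, not the real dataset. The selection returns only the matching rows and keeps their original index labels.

```python
import pandas as pd

# Hypothetical stand-in for the notebook's 311 service request data.
complaints = pd.DataFrame({
    "Complaint Type": [
        "Noise - Street/Sidewalk",
        "Illegal Parking",
        "Noise - Street/Sidewalk",
        "Blocked Driveway",
    ],
    "Borough": ["MANHATTAN", "QUEENS", "BROOKLYN", "BRONX"],
})

# Element-wise comparison yields a boolean Series aligned with the frame.
mask = complaints["Complaint Type"] == "Noise - Street/Sidewalk"

# Indexing with the mask keeps only the rows where it is True.
noise_complaints = complaints[mask]

print(len(noise_complaints))
print(list(noise_complaints.index))
```

Note that the result has fewer rows than the input, so its index is shorter than the original one.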

Source code / logs

To reproduce the error:

  1. Open this notebook: http://nbviewer.jupyter.org/github/jvns/pandas-cookbook/blob/v0.1/cookbook/Chapter%203%20-%20Which%20borough%20has%20the%20most%20noise%20complaints%3F%20%28or%2C%20more%20selecting%20data%29.ipynb
  2. In Cell 1, replace pandas with Ray's DataFrame module (ray.dataframe).
  3. Execute Cell 4.

ValueError Traceback (most recent call last)
in ()
----> 1 noise_complaints = complaints[complaints['Complaint Type'] == "Noise - Street/Sidewalk"]

/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in __getitem__(self, key)
2744 """
2745 result_column_chunks = self._map_partitions(
-> 2746 lambda df: df.__getitem__(key))
2747 return to_pandas(result_column_chunks)
2748

/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in _map_partitions(self, func, index)
234 index = self.index
235
--> 236 return DataFrame(new_df, self.columns, index=index)
237
238 def _update_inplace(self, df=None, columns=None, index=None):

/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in __init__(self, df, columns, index)
51
52 if index is not None:
---> 53 self.index = index
54
55 def __str__(self):

/usr/local/lib/python3.6/site-packages/ray/dataframe/dataframe.py in _set_index(self, new_index)
82 new_index: The new index to set this
83 """
---> 84 self._index.index = new_index
85
86 index = property(_get_index, _set_index)

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in __setattr__(self, name, value)
3625 try:
3626 object.__getattribute__(self, name)
-> 3627 return object.__setattr__(self, name, value)
3628 except AttributeError:
3629 pass

pandas/_libs/properties.pyx in pandas._libs.properties.AxisProperty.__set__()

/usr/local/lib/python3.6/site-packages/pandas/core/generic.py in _set_axis(self, axis, labels)
557
558 def _set_axis(self, axis, labels):
--> 559 self._data.set_axis(axis, labels)
560 self._clear_item_cache()
561

/usr/local/lib/python3.6/site-packages/pandas/core/internals.py in set_axis(self, axis, new_labels)
3072 raise ValueError('Length mismatch: Expected axis has %d elements, '
3073 'new values have %d elements' %
-> 3074 (old_len, new_len))
3075
3076 self.axes[axis] = new_labels

ValueError: Length mismatch: Expected axis has 1380 elements, new values have 111069 elements
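
The traceback suggests that `_map_partitions` passes the original, full-length index (111069 labels) to a result that has only 1380 rows after filtering. This has not been confirmed as the root cause. Below is a minimal sketch, in plain pandas rather than Ray, of how this kind of mismatch produces the same error. The frame and its sizes are hypothetical.

```python
import pandas as pd

# Hypothetical frame standing in for the unfiltered data.
full = pd.DataFrame({"x": range(10)})

# Filtering keeps 2 of the 10 rows (x == 0 and x == 5).
filtered = full[full["x"] % 5 == 0].copy()

message = None
try:
    # Assigning the unfiltered index to the filtered frame fails,
    # because the number of labels does not match the number of rows.
    filtered.index = full.index
except ValueError as exc:
    message = str(exc)

print(message)
```

If this reading is right, the index assigned to the result would need to be the filtered index rather than `self.index`.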
