[DataFrame] Implement IO for ray_df #1599

Merged
devin-petersohn merged 29 commits into ray-project:master from simon-mo:df_io
Feb 27, 2018

Contributor

@simon-mo commented Feb 24, 2018

What do these changes do?

This PR adds support for

  • read_csv
  • read_parquet
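
As a rough illustration of what a partitioned `read_csv` involves, here is a minimal sketch using plain pandas only. It is not the implementation in this PR: the helper name `read_csv_in_partitions`, the `rows_per_partition` parameter, and the sample data are all assumptions made for the example.

```python
import io

import pandas as pd

# Made-up sample data for the example.
CSV_TEXT = "x,y\n1,a\n2,b\n3,c\n4,d\n5,e\n"


def read_csv_in_partitions(buffer, rows_per_partition):
    """Hypothetical helper: read a CSV into a list of pandas DataFrames."""
    # With chunksize set, pd.read_csv yields DataFrames of at most
    # rows_per_partition rows instead of one large frame.
    chunks = pd.read_csv(buffer, chunksize=rows_per_partition)
    # Give every partition its own 0-based pd.RangeIndex.
    return [chunk.reset_index(drop=True) for chunk in chunks]


partitions = read_csv_in_partitions(io.StringIO(CSV_TEXT), rows_per_partition=2)

# Concatenating the partitions in order recovers the full table.
combined = pd.concat(partitions, ignore_index=True)
print(combined)
```

A distributed DataFrame would hold each partition in a remote object rather than a local list, but the row-splitting idea is the same.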

Related issue number

None

Note

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3930/
Test PASSed.

@simon-mo
Contributor Author

This PR is ready for review now.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3947/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3972/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3973/
Test PASSed.

Member

@devin-petersohn left a comment

Looks great, just a couple of comments.

# reindex here because we want a pd.RangeIndex within the partitions.
# It is smaller and sometimes faster.
t_df.reindex()
# reset_index here because we want a pd.RangeIndex
Member

Great catch! Thanks @simon-mo.
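
For context on why the diff above swaps `reindex()` for `reset_index`, here is a small standalone pandas sketch. The frame and its labels are made up for illustration and are not taken from the PR.

```python
import pandas as pd

# A frame whose labels are not 0..n-1, as a partition might be after slicing.
df = pd.DataFrame({"a": [10, 20, 30]}, index=[5, 6, 7])

# reindex() with no arguments returns a new object carrying the same labels.
# It does not modify df in place, so calling it and discarding the result
# has no effect.
same = df.reindex()

# reset_index(drop=True) returns a new frame with a fresh pd.RangeIndex
# and discards the old labels.
fresh = df.reset_index(drop=True)

print(list(same.index))
print(fresh.index)
```

Note that `reset_index` also returns a new frame by default, so the result has to be assigned (or `inplace=True` passed) for the change to stick.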

level=level,
numeric_only=numeric_only),
index=temp_index)._df))
# collapsed_df.index = self.columns
Member

Does this need to be left in?

Contributor Author

Right. Removing it.

@simon-mo mentioned this pull request Feb 26, 2018
3 tasks
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3974/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3975/
Test PASSed.

@devin-petersohn
Member

Passed private-travis build. OK to merge. Thanks @simon-mo!

@devin-petersohn merged commit d78a22f into ray-project:master Feb 27, 2018
3 participants