[DataFrame] Changing the _default_index fn to a remote function #1617

Merged
devin-petersohn merged 1 commit into ray-project:master from kunalgosar:_index
Feb 27, 2018

@kunalgosar
Contributor

What do these changes do?

Moving _default_index to a remote function speeds up creating a new DataFrame. Because _default_index now returns a future instead of a computed value, the call no longer blocks, and control returns to the user much sooner. The index computation may still be running at that point, but the main thread is free to continue.
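
The idea can be sketched without Ray at all. The snippet below is only an illustration of the pattern, not the code in this PR: it swaps in Python's standard `concurrent.futures` for Ray's remote functions, and the names `_compute_default_index`, `default_index_async`, and `partition_lengths` are hypothetical.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# Stand-in for a pool of remote workers (assumption: Ray is not used here).
_executor = ThreadPoolExecutor(max_workers=2)


def _compute_default_index(partition_lengths):
    """Hypothetical helper: build a 0..n-1 index spanning all partitions."""
    time.sleep(0.2)  # simulate slow work, e.g. collecting partition lengths
    return list(range(sum(partition_lengths)))


def default_index_blocking(partition_lengths):
    # Blocking style: the caller waits until the index is fully built.
    return _compute_default_index(partition_lengths)


def default_index_async(partition_lengths) -> Future:
    # Non-blocking style: hand the work to the pool and return a future.
    return _executor.submit(_compute_default_index, partition_lengths)


# The caller gets the future back without waiting for the index to be built.
index_future = default_index_async([3, 2, 4, 1])

# Other work could run here while the index is computed in the background.

# Block only at the point where the index values are actually needed.
index = index_future.result()
```

The trade-off is the same one described above: construction returns quickly, but the cost of building the index is deferred to the first call that needs the result.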

Updated Performance on Query against Pandas

Data: 76 MB of String Data
Machine: 2-core MacBook
Partitions: 4

Pandas Benchmark:
%timeit pandas_df.query(query_func) # 172 ms

Ray:
%timeit ray_df.query(query_func) # 15.2 ms

@devin-petersohn devin-petersohn changed the title Changing the _default_index fn to a remote function [DataFrame] Changing the _default_index fn to a remote function Feb 27, 2018
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3984/

Member

@devin-petersohn devin-petersohn left a comment

Great optimization. Thanks @kunalgosar

@devin-petersohn
Member

Passed the private-travis. OK to merge.

@devin-petersohn devin-petersohn merged commit f43328f into ray-project:master Feb 27, 2018
@kunalgosar kunalgosar deleted the _index branch March 14, 2018 20:57
3 participants