[DataFrame] Changing the _default_index fn to a remote function by kunalgosar · Pull Request #1617 · ray-project/ray

kunalgosar · 2018-02-27T01:44:07Z

What do these changes do?

Moving _default_index to a remote function speeds up the creation of a new DataFrame. Because _default_index now returns a future instead of a computed value, the call no longer blocks and control returns to the user much sooner. The full computation has not necessarily finished at that point, but the main thread can continue running.
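To illustrate the non-blocking pattern described above, here is a minimal sketch that uses Python's standard-library concurrent.futures as a stand-in for a Ray remote function. It is not the code from this PR: build_default_index, its argument, and the simulated delay are hypothetical, chosen only to show that submitting the work returns a future immediately while the result is awaited later.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def build_default_index(partition_lengths):
    """Hypothetical stand-in for _default_index: map each global row
    number to a (partition, offset-within-partition) pair."""
    time.sleep(0.2)  # simulate a slow index computation
    return [(p, i) for p, n in enumerate(partition_lengths) for i in range(n)]


executor = ThreadPoolExecutor(max_workers=1)

start = time.perf_counter()
# submit() schedules the work and hands back a future without waiting for it.
index_future = executor.submit(build_default_index, [2, 3])
submit_seconds = time.perf_counter() - start

# The caller is free to do other work here; the index may not be ready yet.

# Block only at the point where the value is actually needed.
index = index_future.result()
total_seconds = time.perf_counter() - start
executor.shutdown()

print(f"returned to caller after {submit_seconds:.4f}s, "
      f"index ready after {total_seconds:.4f}s")
```

In Ray the analogous steps would be decorating the function with @ray.remote, calling it with .remote(...), and fetching the value with ray.get(...). One consequence worth keeping in mind when reading timings: measuring only the submitting call captures the time to return the future, not the time to finish the computation.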

Updated Performance on Query against Pandas

Data: 76 MB of String Data
Machine: 2-core MacBook
Partitions: 4

Pandas Benchmark:
%timeit pandas_df.query(query_func) # 172 ms

Ray:
%timeit ray_df.query(query_func) # 15.2 ms

AmplabJenkins · 2018-02-27T02:46:16Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/3984/

devin-petersohn

Great optimization. Thanks @kunalgosar

devin-petersohn · 2018-02-27T05:11:49Z

Passed the private-travis. OK to merge.

moved _default_index to remote fn

d18c00d

kunalgosar mentioned this pull request Feb 27, 2018

Changing the _default_index fn to a remote function #1615

Closed

devin-petersohn changed the title ~~Changing the _default_index fn to a remote function~~ [DataFrame] Changing the _default_index fn to a remote function Feb 27, 2018

devin-petersohn approved these changes Feb 27, 2018

devin-petersohn merged commit f43328f into ray-project:master Feb 27, 2018

kunalgosar deleted the _index branch March 14, 2018 20:57
