[Core] Remove digests in plasma (4x performance improvement)#8980

Merged
suquark merged 2 commits into ray-project:master from suquark:remove_digests
Jun 22, 2020
Conversation

@suquark
Member

@suquark suquark commented Jun 17, 2020

Why are these changes needed?

Digests were used in plasma to distinguish objects in case of ObjectID collisions (although we were not relying on this). The way we now generate ObjectIDs ensures that different ObjectIDs point to different objects.

However, computing the digest of a large object is a major overhead and the main cause of high CPU contention. It also increases the size of IPC calls. This PR removes digests from plasma in exchange for much better performance. The List function call is also removed because it is coupled with digests and we are not using it.

This PR improves large-object put throughput by 4.4x (single client) and 1.55x (multiple clients).
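Conceptually, the change amounts to dropping the hashing step on the put path. A minimal sketch in plain Python, using `hashlib.sha256` as a stand-in for the xxhash digest plasma actually computed; the function names and the `store` dict are hypothetical, not plasma's API:

```python
import hashlib

store = {}

def put_with_digest(object_id, payload):
    """Old behavior (sketch): hash the full payload on every put.

    For multi-gigabyte objects this hash dominates CPU time, and the
    digest also travels with every IPC call.
    """
    digest = hashlib.sha256(payload).digest()  # stand-in for xxhash
    store[object_id] = (payload, digest)

def put_without_digest(object_id, payload):
    """New behavior (sketch): the ObjectID generation scheme already
    guarantees that distinct IDs refer to distinct objects, so no
    digest is computed or stored."""
    store[object_id] = payload
```

Skipping the hash makes a put essentially a memory copy plus bookkeeping, which is why the large-object throughput numbers below change so much while small-object call rates barely move.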

Performance Comparison:

Before:

single client get calls per second 9293.56 +- 26.26
single client put calls per second 5346.88 +- 129.31
**single client put gigabytes per second 5.01 +- 0.68**
multi client put calls per second 10145.61 +- 226.07
**multi client put gigabytes per second 13.55 +- 5.89**
single client tasks sync per second 1103.01 +- 27.79
single client tasks async per second 14209.19 +- 614.51
multi client tasks async per second 40815.41 +- 2308.02
1:1 actor calls sync per second 1674.11 +- 63.74
1:1 actor calls async per second 6616.05 +- 72.97
1:1 actor calls concurrent per second 5667.39 +- 122.4
1:n actor calls async per second 10050.72 +- 91.84
n:n actor calls async per second 33732.59 +- 421.08
n:n actor calls with arg async per second 10781.66 +- 149.54
1:1 async-actor calls sync per second 1183.84 +- 28.45
1:1 async-actor calls async per second 3862.82 +- 119.24
1:1 async-actor calls with args async per second 2241.18 +- 37.79
1:n async-actor calls async per second 8993.51 +- 100.68
n:n async-actor calls async per second 25275.31 +- 223.06

After this PR:

single client get calls per second 9160.33 +- 176.46
single client put calls per second 5436.99 +- 115.41
**single client put gigabytes per second 22.26 +- 6.62**
multi client put calls per second 9908.4 +- 207.41
**multi client put gigabytes per second 20.99 +- 11.95**
single client tasks sync per second 1206.3 +- 38.53
single client tasks async per second 15651.57 +- 363.69
multi client tasks async per second 43883.96 +- 3129.57
1:1 actor calls sync per second 1648.85 +- 17.1
1:1 actor calls async per second 6880.17 +- 153.93
1:1 actor calls concurrent per second 5906.67 +- 153.06
1:n actor calls async per second 9955.94 +- 131.48
n:n actor calls async per second 34647.68 +- 421.49
n:n actor calls with arg async per second 10194.18 +- 169.31
1:1 async-actor calls sync per second 1193.76 +- 52.08
1:1 async-actor calls async per second 3760.37 +- 69.31
1:1 async-actor calls with args async per second 2264.01 +- 15.16
1:n async-actor calls async per second 8688.2 +- 149.09
n:n async-actor calls async per second 25395.45 +- 117.77
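The headline speedups follow directly from the bolded gigabytes-per-second rows:

```python
# Put-throughput ratios taken from the bolded GB/s rows above.
before_single, after_single = 5.01, 22.26   # single client, GB/s
before_multi, after_multi = 13.55, 20.99    # multi client, GB/s

print(f"single client: {after_single / before_single:.2f}x")
print(f"multi client:  {after_multi / before_multi:.2f}x")
```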

Related issue number

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/latest/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failure rates at https://ray-travis-tracker.herokuapp.com/.
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested (please justify below)

@AmplabJenkins

Can one of the admins verify this patch?

@suquark suquark changed the title [WIP][Core] Remove digests in plasma [Core] Remove digests in plasma Jun 17, 2020
@suquark suquark changed the title [Core] Remove digests in plasma [Core] Remove digests in plasma (4x performance improvement) Jun 17, 2020
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/27235/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/27234/

@suquark
Member Author

suquark commented Jun 17, 2020

Ready for review. The CI errors are unrelated.

@suquark suquark requested a review from ericl June 17, 2020 05:49
@robertnishihara
Collaborator

robertnishihara commented Jun 17, 2020

Does this mean we can also get rid of the xxhash code (not necessarily in this PR)? Or is it still used somewhere?

@suquark
Member Author

suquark commented Jun 17, 2020

@robertnishihara I am not sure; if we remove xxhash, then we also have to remove plasma_client.Hash(). Maybe it is still useful to keep the ability to compute the hash?

And yes, we should remove unused plasma code later when we refactor.

@suquark suquark mentioned this pull request Jun 18, 2020
Contributor

@ericl ericl left a comment


Feel free to merge if tests pass.

@suquark
Member Author

suquark commented Jun 22, 2020

I re-triggered all tests to see if they pass against the latest upstream.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/27426/

@suquark
Member Author

suquark commented Jun 22, 2020

The CI errors are unrelated. I'll merge the PR.

@suquark suquark merged commit 7a110b9 into ray-project:master Jun 22, 2020
@suquark suquark deleted the remove_digests branch July 9, 2020 18:30