[DNM] Performance changes to reduce false sharing - for discussion. #9431
joemario wants to merge 1 commit into ceph:hammer from
Conversation
1) Align classes and contended hot data members to cache-line boundaries. 2) Demonstrate how expensive OpTracker::_mark_event can be, and discuss opportunities.
@joemario I would love to resurrect this and get the simple changes (e.g., cacheline alignments) into mergeable patches...
I went back and found the presentation (because Mark indexed all the recordings; thank you, Mark): https://bluejeans.com/playback/s/ha97eh8zOC5OPKy63VBWM1XSIDCAs6UVV8dWWhu04RyvrRjOuEG13JZ6dEpOiczK slides at:
Joe's primary finding was that the C++ iostream class usage in Ceph's logging code was the primary cause of cacheline contention, or "false sharing". Joe's PR was there so he could present an analysis of exactly where the problem was in the Ceph code, and was admittedly not ready to be pulled in as-is. But a huge part of the work of performance optimization is analysis, and IMHO Joe did just about all that was needed there. Is there a Ceph developer who could help? This probably needs to be broken up into commit(s) to remove the iostream problem and commit(s) to add cacheline alignment to Ceph data structures. IMHO optimizations like this are extremely relevant to helping Ceph keep up with faster storage and networks.
Sage:
To follow up on Ben's reply earlier, here are the performance opportunities I saw:
* The biggest slowdown I saw came from using the slow stringstream objects in the default logging path. It showed up mostly under mark_event(). I understand some Ceph benchmarking turns off the default logging in /etc/ceph/ceph.conf. Fixing this stringstream cost should recover most of the performance that you get by disabling that default logging. One of the C++ compiler developers looked at that Ceph code and came up with ideas for how to replace stringstream.
* The next biggest item was fixing the CPU cacheline false sharing. The PoC patches I submitted last year show how easy some of those are to fix. Although I saw only a 1-3% speedup in my small "1-disk/2-osd" setup, the speedup should be larger in a larger, CPU-bound Ceph setup. It will also be larger once the above stringstream problem is fixed.
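To illustrate the kind of change the cacheline-alignment PoC patches make (with invented names here, not the actual Ceph members), a minimal before/after sketch:

```cpp
#include <atomic>
#include <cstdint>

// Before: two hot counters written by different threads usually land in
// the same 64-byte cache line, so each write invalidates the line in the
// other CPU's cache ("false sharing").
struct StatsShared {
  std::atomic<uint64_t> ops{0};    // bumped by thread A
  std::atomic<uint64_t> bytes{0};  // bumped by thread B
};

// After: give each hot member its own cache line so the writers no
// longer interfere with each other.
struct StatsPadded {
  alignas(64) std::atomic<uint64_t> ops{0};
  alignas(64) std::atomic<uint64_t> bytes{0};
};

static_assert(sizeof(StatsShared) <= 64, "both counters share one line");
static_assert(sizeof(StatsPadded) >= 128, "one 64-byte line per counter");
```

The cost is only memory (padding); whether it pays off depends on how contended the members actually are, which is what perf c2c measures.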
I'd be more than happy to work with someone to reprofile Ceph on a larger installation for false sharing and to show them all the source code locations that need to be modified. (I'm not a Ceph developer and don't want to take the time to learn the Ceph patch testing and submission process).
Let me know if anyone is interested in working with me on this, and whether you want me to connect you with the C++ compiler developer for his ideas on removing the stringstream slowdown. He's happy to consult.
Joe
…----- Original Message -----
From: "Sage Weil" ***@***.***>
To: "ceph/ceph" ***@***.***>
Cc: "joemario" ***@***.***>, "Mention" ***@***.***>
Sent: Tuesday, March 28, 2017 2:25:26 PM
Subject: Re: [ceph/ceph] [DNM] Performance changes to reduce false sharing -
for discussion. (#9431)
@joemario I would love to resurrect this and get the simple changes (e.g.,
cacheline alignments) into mergeable patches...
  {
    if (!tracking_enabled)
      return;
@joemario Please keep the PR title under 50 characters for future PRs, and prepend the component name to the PR title.
https://github.com/ceph/ceph/blob/master/SubmittingPatches.rst#title-of-pull-requests-and-title-of-commits
@aclamk do you mind looking at this and seeing if we can/should apply it? |
@liewegas Idea is sound; the implementation could use some polishing. Should I make an adaptation of this for master?
Hi Adam and Sage:
The "perf c2c" cacheline contention functionality got checked in upstream
last winter, so it's now available. It's also in RHEL 7.4 and CentOS
7.4. If the setup is still on CentOS 7.3, the 7.4 version of perf can be
copied back to 7.3 and it should work. (No guarantee, but it always has for
me.)
Since I did the work for that PR on a very small setup (single system ceph
setup with one SSD), and since the ceph codebase has changed considerably,
I recommend rerunning the perf c2c tool on the latest ceph codebase in a
much larger environment. The larger environment will more clearly show the
cost of cacheline contention.
I'm not a ceph developer, but am intimately familiar with the perf c2c tool
and interpreting its output. I'd be very happy to work with a ceph
developer on running the tool and analyzing the results.
The skipping of _mark_event in my PR was only to demonstrate the
performance cost of using C++ iostream in the default logging path. One of
our C++ compiler developers looked at the ceph logging implementation
(_mark_event) and had recommendations for speeding it up. I can pull him
in to work with someone if the interest is there.
That _mark_event code path is the biggest bottleneck I found with respect
to contention, and it will only get bigger on larger installations.
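As a hedged sketch of the pattern being described (this is not Ceph's actual logging API; the function and variable names here are invented), the iostream cost comes from formatting the message even when it will never be emitted, and the fix is to gate the formatting behind a cheap level check:

```cpp
#include <atomic>
#include <cstdio>
#include <iostream>
#include <sstream>

// Hypothetical log level; Ceph's real per-subsystem gather levels differ.
std::atomic<int> g_log_level{0};

// Costly pattern: the ostringstream is constructed and the message
// formatted on every call, even when the result is thrown away.
void mark_event_slow(const char* evt, int seq) {
  std::ostringstream ss;               // allocation + locale machinery
  ss << "event=" << evt << " seq=" << seq;
  if (g_log_level.load(std::memory_order_relaxed) > 0)
    std::cout << ss.str() << '\n';
}

// Cheaper pattern: check the level first, then format with snprintf into
// a stack buffer only when the message will actually be emitted.
void mark_event_fast(const char* evt, int seq) {
  if (g_log_level.load(std::memory_order_relaxed) <= 0)
    return;                            // near-zero cost when disabled
  char buf[128];
  std::snprintf(buf, sizeof(buf), "event=%s seq=%d", evt, seq);
  std::cout << buf << '\n';
}
```

With logging disabled, the fast path is one relaxed load and a branch, while the slow path still pays the full iostream construction cost on every call.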
I did try both grouping together and separating the two m_queue_mutex*
members you mention below. I had mixed results, probably due to my small
setup. I agree with you that if they're used together they can be
grouped, but it should be tested first, as there are a few cases where
that's not always true.
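The two layouts being compared can be sketched like this (the member names follow the PR's m_queue_mutex/m_queue_mutex_holder; the wrapping structs are invented for illustration):

```cpp
#include <pthread.h>

// Separated: each field is padded onto its own cache line. This avoids
// false sharing with neighboring fields, but a lock-then-record-holder
// sequence now touches two cache lines.
struct QueueLockSeparated {
  alignas(64) pthread_mutex_t m_queue_mutex;
  alignas(64) pthread_t m_queue_mutex_holder;
};

// Grouped: both fields share one dedicated cache line, on the theory
// that they are always written together (take the lock, then record
// the holder thread).
struct alignas(64) QueueLockGrouped {
  pthread_mutex_t m_queue_mutex;
  pthread_t m_queue_mutex_holder;
};
```

Which layout wins depends on the actual access pattern, which is why measuring first (e.g. with perf c2c) matters before committing to either.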
I'm busy till mid October, but after that if you want to get a reasonably
sized ceph installation (where the OSDs can be pushed hard), I'd be happy
to work with you on rerunning and analyzing perf c2c to find the contention
locations in the latest codebase and to attack the iostream logging issue.
Joe
…On Thu, Sep 14, 2017 at 6:53 AM, Adam Kupczyk ***@***.***> wrote:
@liewegas <https://github.com/liewegas>
Idea is sound. Having a cache entry changed by one CPU and read by another
is known to be slower.
Strategies like the one proposed here are likely to slightly improve
performance.
Downside: it is difficult to measure precisely.
Implementation could use some polishing:
1. Hot variables that are used together should go into one cache line; for example
pthread_mutex_t m_queue_mutex __attribute__((aligned(64)));
pthread_t m_queue_mutex_holder __attribute__((aligned(64)));
should become a struct.
2. I guess alignas should replace __attribute__((aligned)).
3. We can't skip the _mark_event method, can we?
Should I make an adaptation of this for master?
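Adam's second point, in standard C++ terms: alignas (C++11) expresses the same alignment request as the GCC/Clang-specific attribute used in the PR, and is portable across compilers. A small sketch (member name reused from the PR for illustration; the struct is invented):

```cpp
#include <pthread.h>

struct OpQueue {
  // GCC/Clang extension, as written in the PR:
  //   pthread_mutex_t m_queue_mutex __attribute__((aligned(64)));

  // Standard C++11 spelling of the same request:
  alignas(64) pthread_mutex_t m_queue_mutex;
};

// The member's alignment requirement propagates to the enclosing struct.
static_assert(alignof(OpQueue) == 64,
              "struct inherits the 64-byte alignment of its member");
```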
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
This commit demonstrates the benefit of reducing cacheline contention, in two ways:
This commit is not to be checked in as-is.
Although I am not a Ceph developer, I have been able to realize worthwhile performance gains from these changes. Ben England and I would like to share these findings at an upcoming weekly Ceph performance meeting.