Skip to content

crc32c optimized for s390x arch#56224

Merged
yuriw merged 1 commit intoceph:mainfrom
ibm-s390-cloud:master-s390x-crc32c
Jul 30, 2024
Merged

crc32c optimized for s390x arch#56224
yuriw merged 1 commit intoceph:mainfrom
ibm-s390-cloud:master-s390x-crc32c

Conversation

@aliakseimakarau
Copy link
Contributor

@aliakseimakarau aliakseimakarau commented Mar 15, 2024

This commit add the s390x arch HW crc32 use.

Fixes: https://tracker.ceph.com/issues/64952

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

@aliakseimakarau
Copy link
Contributor Author

aliakseimakarau commented Mar 18, 2024

BTW, do you have an s390x teuthology facility at RH?
Is there any way to setup it at your side?
We have here a small integration test cluster, nevertheless would be great to have a kind of an official test facility :)

@idryomov idryomov requested a review from a team March 22, 2024 10:16
@aliakseimakarau
Copy link
Contributor Author

@idryomov , thank you!!!

@idryomov
Copy link
Contributor

BTW, do you have an s390x teuthology facility at RH? Is there any way to setup it at your side? We have here a small integration test cluster, nevertheless would be great to have a kind of an official test facility :)

We certainly don't have hardware like that in the upstream lab, which is what upstream contributions go through. So we would rely on you to run all the necessary tests and benchmarks and post the results here in the PR ;)

https://github.com/linux-on-ibm-z/crc32-s390x/blob/master/README.md notes a 70x speedup in a synthetic benchmark. Most of that would obviously evaporate in an end-to-end integration test with the rest of Ceph in place, but I'd be curious to see pre and post numbers for e.g. bufferlist::crc32c().

@aliakseimakarau
Copy link
Contributor Author

BTW, do you have an s390x teuthology facility at RH? Is there any way to setup it at your side? We have here a small integration test cluster, nevertheless would be great to have a kind of an official test facility :)

We certainly don't have hardware like that in the upstream lab, which is what upstream contributions go through. So we would rely on you to run all the necessary tests and benchmarks and post the results here in the PR ;)

https://github.com/linux-on-ibm-z/crc32-s390x/blob/master/README.md notes a 70x speedup in a synthetic benchmark. Most of that would obviously evaporate in an end-to-end integration test with the rest of Ceph in place, but I'd be curious to see pre and post numbers for e.g. bufferlist::crc32c().

Thank you very much for the clarification!
With the HW crc functions we noticed a huge use of the CPU (fio was employed to load the OSD):
image

We have a small homebrew teuthology cluster on the s390x nodes. Let me reanimate it and test.

@idryomov
Copy link
Contributor

With the HW crc functions we noticed a huge use of the CPU

I'm confused -- did you mean "without" or perhaps "SW" instead of "HW" here?

@aliakseimakarau
Copy link
Contributor Author

aliakseimakarau commented Mar 22, 2024

With the HW crc functions we noticed a huge use of the CPU

I'm confused -- did you mean "without" or perhaps "SW" instead of "HW" here?

Without of course. My typo.

@aliakseimakarau aliakseimakarau requested a review from a team as a code owner March 25, 2024 13:23
@aliakseimakarau
Copy link
Contributor Author

Hi Team,
I manually run teuthology tests @s390x nodes and had no problems.

@aliakseimakarau
Copy link
Contributor Author

fio -ioengine=libaio -direct=1 -name=test -group_reporting=1 -runtime=180 -time_based -iodepth=8 -bs=128K -rw=write -numjobs=8 -size=3G

ceph_crc32c_sctp

Samples: 1M of event 'cycles:P', Event count (approx.): 2211775104000
Overhead  Command          Shared Object        Symbol
  19.58%  tp_osd_tp        ceph-osd             [.] ceph_crc32c_sctp
  18.78%  msgr-worker-1    ceph-osd             [.] ceph_crc32c_sctp
   5.88%  msgr-worker-1    [kernel.kallsyms]    [k] raw_copy_to_user_key
   3.48%  tp_osd_tp        libc.so.6            [.] memcpy
   3.04%  tp_osd_tp        libc.so.6            [.] _int_malloc
   2.20%  tp_osd_tp        [kernel.kallsyms]    [k] cpumsf_pmu_enable
   1.19%  tp_osd_tp        libc.so.6            [.] malloc
   1.13%  tp_osd_tp        libc.so.6            [.] malloc_consolidate
   0.88%  tp_osd_tp        ceph-osd             [.] PeeringState::calc_trim_to_aggressive()
   0.87%  bstore_aio       [kernel.kallsyms]    [k] cpumsf_pmu_enable
   0.75%  tp_osd_tp        libc.so.6            [.] cfree@GLIBC_2.2
   0.73%  bstore_kv_sync   [kernel.kallsyms]    [k] cpumsf_pmu_enable
   0.69%  tp_osd_tp        libc.so.6            [.] _int_free
   0.61%  tp_osd_tp        ceph-osd             [.] PGLog::IndexedLog::trim(ceph::common::CephContext*, eversion_t, std::set<eversion_t, std::less<eversion_t>, std::allo

s390x HW crc32

Samples: 874K of event 'cycles:P', Event count (approx.): 1138476612000
Overhead  Command          Shared Object         Symbol
   8.58%  msgr-worker-1    [kernel.kallsyms]     [k] raw_copy_to_user_key
   5.27%  tp_osd_tp        libc.so.6             [.] memcpy
   4.70%  tp_osd_tp        libc.so.6             [.] _int_malloc
   3.03%  tp_osd_tp        [kernel.kallsyms]     [k] cpumsf_pmu_enable
   1.97%  tp_osd_tp        ceph-osd              [.] crc32_le_vgfm_generic
   1.75%  tp_osd_tp        libc.so.6             [.] malloc_consolidate
   1.72%  tp_osd_tp        libc.so.6             [.] malloc
   1.64%  msgr-worker-1    ceph-osd              [.] crc32_le_vgfm_generic
   1.43%  msgr-worker-1    [kernel.kallsyms]     [k] cpumsf_pmu_enable
   1.31%  tp_osd_tp        ceph-osd              [.] PeeringState::calc_trim_to_aggressive()
   1.21%  bstore_aio       [kernel.kallsyms]     [k] cpumsf_pmu_enable
   1.15%  tp_osd_tp        libc.so.6             [.] cfree@GLIBC_2.2
   1.12%  tp_osd_tp        libc.so.6             [.] unlink_chunk.isra.0
   1.02%  tp_osd_tp        libc.so.6             [.] _int_free
   0.86%  tp_osd_tp        ceph-osd              [.] PGLog::IndexedLog::trim(ceph::common::CephContext*, eversion_t, std::set<eversion_t, std::less<eversion_t>, std::allocator<eversion_t> >*, std::set<std
   0.71%  bstore_kv_sync   [kernel.kallsyms]     [k] cpumsf_pmu_enable
   0.70%  bstore_kv_sync   ceph-osd              [.] unsigned int rocksdb::crc32c::ExtendImpl<&rocksdb::crc32c::Slow_CRC32>(unsigned int, char const*, unsigned long)
   0.66%  bstore_kv_sync   ceph-osd              [.] rocksdb::InlineSkipList<rocksdb::MemTableRep::KeyComparator const&>::RecomputeSpliceLevels(rocksdb::Slice const&, rocksdb::InlineSkipList<rocksdb::MemT
   0.64%  msgr-worker-1    libc.so.6             [.] _int_malloc
   0.61%  bstore_kv_sync   libc.so.6             [.] memcpy

Copy link
Contributor

@idryomov idryomov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash "Added copyright and licenses" into the original commit -- the copyright should be added together with the code in question. Otherwise LGTM!

@aliakseimakarau aliakseimakarau force-pushed the master-s390x-crc32c branch 2 times, most recently from b06f782 to ff02cfc Compare May 22, 2024 08:07
@aliakseimakarau
Copy link
Contributor Author

@idryomov @cbodley ping :-)

@idryomov
Copy link
Contributor

idryomov commented May 22, 2024

@idryomov @cbodley ping :-)

See the unresolved conversation above and please add Fixes: https://tracker.ceph.com/issues/64952 line like you have in the PR description to the commit message.

@aliakseimakarau
Copy link
Contributor Author

@idryomov @cbodley ping :-)

See the unresolved conversation above and please add Fixes: https://tracker.ceph.com/issues/64952 line like you have in the PR description to the commit message.

Done!

Fixes: https://tracker.ceph.com/issues/64952
Signed-off-by: Aliaksei Makarau <aliaksei.makarau@ibm.com>
Copy link
Contributor

@idryomov idryomov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'm adding needs-qa label so that this goes through the full build just in case. It would be good to get an ack from @ceph/core too.

@idryomov
Copy link
Contributor

jenkins test make check

Copy link
Contributor

@athanatos athanatos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't audit the implementation, but the ceph integration lgtm.

@aliakseimakarau
Copy link
Contributor Author

Hi Casey @cbodley ,
could you please have a look? Thank you very much!!!

@cbodley cbodley removed their request for review June 28, 2024 18:12
@cbodley
Copy link
Contributor

cbodley commented Jun 28, 2024

my comments were addressed, thanks 👍 this pr already has approvals, so i don't plan to review further

@cbodley
Copy link
Contributor

cbodley commented Jun 28, 2024

@idryomov @athanatos this is labeled needs-qa but it's not clear who's going to test it. should we add the core label to go through the rados suite?

@idryomov idryomov added the core label Jun 28, 2024
@idryomov
Copy link
Contributor

@cbodley I assumed that the common label was enough for it to be picked up into a rados batch. I went ahead and added the core label, thanks!

@NitzanMordhai
Copy link
Contributor

@yuriw yuriw merged commit 3585ce7 into ceph:main Jul 30, 2024
@batrick
Copy link
Member

batrick commented Sep 16, 2024

PTAL #59809

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants