Skip to content

erasure-code: Build the avx512 versions of functions in ISA-L#59494

Closed
jamiepryde wants to merge 1 commit intomainfrom
isa-enable-avx512
Closed

erasure-code: Build the avx512 versions of functions in ISA-L#59494
jamiepryde wants to merge 1 commit intomainfrom
isa-enable-avx512

Conversation

@jamiepryde
Copy link
Contributor

@jamiepryde jamiepryde commented Aug 28, 2024

Upon inspecting /build/lib/libec_isa.so, I noticed that the AVX512 versions of ISA functions are not being included by the Ceph build process.

This PR updates the CMakeLists file to build AVX512 functions in libec_isa.so if the assembler supports AVX512.

Contribution Guidelines

  • To sign and title your commits, please refer to Submitting Patches to Ceph.

  • If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.

  • When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an x between the brackets: [x]. Spaces and capitalization matter when checking off items this way.

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows
  • jenkins test rook e2e

Signed-off-by: Jamie Pryde <jamiepry@uk.ibm.com>
@jamiepryde jamiepryde requested a review from a team as a code owner August 28, 2024 22:52
@jamiepryde jamiepryde self-assigned this Aug 28, 2024
@jamiepryde
Copy link
Contributor Author

jamiepryde commented Aug 28, 2024

ceph_erasure_code_benchmark shows a small performance boost encoding for bigger K/M EC profiles (and a small performance decrease for 2+1) when AVX512 is used vs AVX2 on an Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz.

Decoding shows a small performance decrease for 1 erasure on smaller K/M profiles, but a small increase when K/M is bigger or the number of erasures is greater than 1.

Vandermonde encode

AVX2, k/m, encode GB/s AVX512, k/m, encode GB/s
[ '2/1', 3.87434861351630673038 ], [ '2/1', 3.66255117023490958596 ],
[ '3/2', 7.29331106520150076503 ], [ '3/2', 7.28385254965140064405 ],
[ '4/2', 10.47403645567991076456 ], [ '4/2', 10.82696534539578534822 ],
[ '4/3', 8.40265185003538524747 ], [ '4/3', 9.21004080121755265785 ],
[ '6/2', 10.32593610806381263299 ], [ '6/2', 10.38495792845844022916 ],
[ '6/3', 8.38572285736059842665 ], [ '6/3', 9.36008750858132822786 ],
[ '6/4', 6.95057330552807453327 ], [ '6/4', 7.98587330955033783438 ],
[ '10/3', 8.44740711906924440984 ], [ '10/3', 9.64825399020910622279 ],
[ '10/4', 6.65877415483642215221 ], [ '10/4', 7.86652730956207231264 ],

Cauchy encode

AVX2, k/m, encode GB/s AVX512, k/m, encode GB/s
[ '2/1', 3.87003997101123339474 ], [ '2/1', 3.73888265675310405029 ],
[ '3/2', 7.27699333730133226029 ], [ '3/2', 7.31369845188311937547 ],
[ '4/2', 10.47744984570488259221 ], [ '4/2', 10.78481718613248518079 ],
[ '4/3', 8.39452607057741445373 ], [ '4/3', 9.25180299136695759269 ],
[ '6/2', 10.33582672060920851050 ], [ '6/2', 10.38356386506483995688 ],
[ '6/3', 8.38004706770756236900 ], [ '6/3', 9.36996645402089997253 ],
[ '6/4', 6.96185216902352281749 ], [ '6/4', 7.98080724396804203855 ],
[ '10/3', 8.50961627485216413838 ], [ '10/3', 9.66204784707930979895 ],
[ '10/4', 6.65644237118452723283 ], [ '10/4', 7.84540362279399802772 ],

Vandermonde decode

AVX2, k/m/erasures, decode GB/s AVX512, k/m/erasures, decode GB/s
[ '2/1/1', 3.83126824909692409846 ], [ '2/1/1', 3.63177497452527761868 ],
[ '3/2/1', 10.34914881976650995571 ], [ '3/2/1', 9.10571133525568254663 ],
[ '3/2/2', 9.14002459178456567188 ], [ '3/2/2', 9.02480159876165282382 ],
[ '4/2/1', 9.88409827292383329685 ], [ '4/2/1', 8.91753406926056092858 ],
[ '4/2/2', 8.21090113391887787328 ], [ '4/2/2', 8.42487919397226179294 ],
[ '4/3/1', 9.29290646429442306662 ], [ '4/3/1', 8.79789008296199198871 ],
[ '4/3/2', 6.57384882966558147323 ], [ '4/3/2', 6.78562463791906932063 ],
[ '4/3/3', 6.24912512248285240066 ], [ '4/3/3', 6.63239751088774375387 ],
[ '6/2/1', 13.66846813431752135780 ], [ '6/2/1', 13.18007055239046412405 ],
[ '6/2/2', 11.01181876484393169500 ], [ '6/2/2', 11.58352966643882636048 ],
[ '6/3/1', 13.32277175043741324211 ], [ '6/3/1', 12.71957031052846660186 ],
[ '6/3/2', 11.02326985811199783273 ], [ '6/3/2', 11.48777112977909659432 ],
[ '6/3/3', 8.39051412774631596018 ], [ '6/3/3', 9.74338572208933110176 ],
[ '6/4/1', 12.82689074526492762760 ], [ '6/4/1', 12.41590952794910629289 ],
[ '6/4/2', 10.33437746371558734979 ], [ '6/4/2', 10.85526270083104417553 ],
[ '6/4/3', 8.31216059785846824302 ], [ '6/4/3', 9.32689931485343785031 ],
[ '6/4/4', 6.69667347834435087911 ], [ '6/4/4', 7.80803380466373234509 ],
[ '10/3/1', 14.28153591634904530788 ], [ '10/3/1', 13.99019030240299985494 ],
[ '10/3/2', 10.96816039556977575113 ], [ '10/3/2', 13.12833249592076452686 ],
[ '10/3/3', 8.44320760834324003026 ], [ '10/3/3', 10.30607560471104732154 ],
[ '10/4/1', 14.62081330086966936843 ], [ '10/4/1', 14.83325991858557628045 ],
[ '10/4/2', 10.00719877851331133564 ], [ '10/4/2', 11.59048964162576923761 ],
[ '10/4/3', 8.84200066322078574686 ], [ '10/4/3', 10.51600788784719651642 ],
[ '10/4/4', 6.76514114357512921960 ], [ '10/4/4', 8.26088427677478159543 ],

Cauchy decode

AVX2, k/m/erasures, decode GB/s AVX512, k/m/erasures, decode GB/s
[ '2/1/1', 3.89734651189357911850 ], [ '2/1/1', 3.74250779878745142524 ],
[ '3/2/1', 9.77916914927174136178 ], [ '3/2/1', 9.81468616245456978069 ],
[ '3/2/2', 9.15394683036169367642 ], [ '3/2/2', 9.07178115786699056070 ],
[ '4/2/1', 9.67001274894480820883 ], [ '4/2/1', 9.60858469390892599083 ],
[ '4/2/2', 8.20850280060981952326 ], [ '4/2/2', 8.40570759650021303425 ],
[ '4/3/1', 9.51646027875158658425 ], [ '4/3/1', 9.58912516493295283684 ],
[ '4/3/2', 6.57029394548989793652 ], [ '4/3/2', 6.80101852053363511719 ],
[ '4/3/3', 6.21805907414791698005 ], [ '4/3/3', 6.64271745599863957146 ],
[ '6/2/1', 13.86170762043495155277 ], [ '6/2/1', 13.92413861936707326636 ],
[ '6/2/2', 11.06686725142222524421 ], [ '6/2/2', 11.49938768060478315650 ],
[ '6/3/1', 12.96050219561275595436 ], [ '6/3/1', 13.01616920603450430262 ],
[ '6/3/2', 11.04283985797317773706 ], [ '6/3/2', 11.50456943090484450976 ],
[ '6/3/3', 8.39036093856054162263 ], [ '6/3/3', 9.74359230190731209053 ],
[ '6/4/1', 12.92168096803858393262 ], [ '6/4/1', 12.97404029467055486545 ],
[ '6/4/2', 10.41950077087634503251 ], [ '6/4/2', 10.79790468819106590004 ],
[ '6/4/3', 8.29619293679389442676 ], [ '6/4/3', 9.27958360949374155186 ],
[ '6/4/4', 6.68714681165399568256 ], [ '6/4/4', 7.77033814522401449733 ],
[ '10/3/1', 13.33500607649556893750 ], [ '10/3/1', 14.86863265675107292053 ],
[ '10/3/2', 10.92104417064367411184 ], [ '10/3/2', 13.07301645735895784004 ],
[ '10/3/3', 8.47727060641375374084 ], [ '10/3/3', 10.25910056085271674114 ],
[ '10/4/1', 13.37786628811942731601 ], [ '10/4/1', 14.95996951517252076204 ],
[ '10/4/2', 10.01671910619453141627 ], [ '10/4/2', 11.50136838680518853411 ],
[ '10/4/3', 8.85672566993690043509 ], [ '10/4/3', 10.53353270080452594720 ],
[ '10/4/4', 6.81567365027851568804 ], [ '10/4/4', 8.33253341012596124105 ],

@jamiepryde
Copy link
Contributor Author

jenkins test make check

1 similar comment
@jamiepryde
Copy link
Contributor Author

jenkins test make check

Copy link
Member

@markhpc markhpc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wins look like they are bigger than the losses to me. Generally think this is the right path forward. Thanks Jamie!

Edit: Are there cases where we want to support AVX2 when AVX512 isn't available? It looks like we are replacing it?
Edit2: Nope, too early and read too fast, I see the else condition now.

@cbodley
Copy link
Contributor

cbodley commented Aug 29, 2024

thanks @jamiepryde. this reminded me of an old commit from #52385 that switches to isal's build system instead of hacking together our own cmake stuff for it. i've just split that commit out to #59513. would you be willing to look into that and confirm whether or not it enables the same avx512 functionality?

in general i'd prefer not to maintain our own complicated build scripts if we don't have to

@jamiepryde
Copy link
Contributor Author

Hey @cbodley,

I think your new PR is actually related to the other isa-l library in Ceph, isa-l_crypto.
This PR is changing how we build /src/isa-l/ into libec_isa.so, which is used for erasure coding.
#59513 is changing how src/crypto/isa-l is built. This is isa-l_crypto, which provides encryption and hashing functions.

To make things even more confusing, /src/nvmeof/spdk/ has separate copies of isa-l and isa-l_crypto in it as well! I'm not sure if these are used?

@cbodley
Copy link
Contributor

cbodley commented Aug 29, 2024

I think your new PR is actually related to the other isa-l library in Ceph, isa-l_crypto.

oops, right you are. this ec plugin could probably use similar treatment eventually, but i'm happy with this pr in the meantime 👍

i have a feeling that our crypto library is also missing avx512 support

To make things even more confusing, /src/nvmeof/spdk/ has separate copies of isa-l and isa-l_crypto in it as well! I'm not sure if these are used?

fun 🙃 the nvmeof submodule was added recently in #54671, where i pointed out that it duplicates our src/spdk submodule. so the ceph tree now has 2 of those, and 3 dpdk submodules (one under each spdk submodule, and another under src/seastar)

@jamiepryde
Copy link
Contributor Author

jenkins test make check

1 similar comment
@jamiepryde
Copy link
Contributor Author

jenkins test make check

@markhpc
Copy link
Member

markhpc commented Sep 5, 2024

I think your new PR is actually related to the other isa-l library in Ceph, isa-l_crypto.

oops, right you are. this ec plugin could probably use similar treatment eventually, but i'm happy with this pr in the meantime 👍

i have a feeling that our crypto library is also missing avx512 support

To make things even more confusing, /src/nvmeof/spdk/ has separate copies of isa-l and isa-l_crypto in it as well! I'm not sure if these are used?

fun 🙃 the nvmeof submodule was added recently in #54671, where i pointed out that it duplicates our src/spdk submodule. so the ceph tree now has 2 of those, and 3 dpdk submodules (one under each spdk submodule, and another under src/seastar)

oof. Yeah, that can't stay that way. Frankly the merge shouldn't have been allowed if they didn't consolidate that. No use crying over spilled milk though. Do you know if anyone is planning to fix it?

@cbodley
Copy link
Contributor

cbodley commented Sep 5, 2024

Do you know if anyone is planning to fix it?

i haven't heard anything since. probably worth creating a tracker issue for it

@jamiepryde
Copy link
Contributor Author

Closing this PR as Casey has improved the build process for isa-l and isa-l-crypto here: #59513

@jamiepryde jamiepryde closed this Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants