
erasure-code/isa: Use isa/raid's xor_gen() instead of the region_xor(… #58594

Merged
yuriw merged 1 commit into ceph:main from jamiepryde:isa-xor-raid on Aug 19, 2024

Conversation

@jamiepryde jamiepryde commented Jul 15, 2024

…) optimisation

When using the ISA plugin for erasure coded pools, we use an optimisation to simplify the xor encode and decode and improve performance. This optimisation applies when m=1 in the pool's erasure code profile (or if m in the profile is greater than 1 but the technique in use is reed_sol_van and the erasure is in a data chunk or the first parity chunk).
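To illustrate why a plain XOR is enough in these cases, here is a minimal sketch, not the plugin's actual code. It assumes the single parity chunk (or, for reed_sol_van, the first parity chunk) is the byte-wise XOR of the k data chunks, which is what the conditions above imply. The names `Chunk`, `xor_chunks`, `encode_parity` and `recover_missing` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Chunk = std::vector<std::uint8_t>;

// XOR all chunks together byte by byte. Every chunk must have the same length.
Chunk xor_chunks(const std::vector<Chunk>& chunks) {
  Chunk out(chunks.empty() ? 0 : chunks.front().size(), 0);
  for (const Chunk& c : chunks) {
    assert(c.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
      out[i] ^= c[i];
    }
  }
  return out;
}

// Encode with m=1: the parity chunk is the XOR of the k data chunks.
Chunk encode_parity(const std::vector<Chunk>& data) {
  return xor_chunks(data);
}

// Decode one erasure: XOR the k surviving chunks (data and parity alike)
// to rebuild the missing one.
Chunk recover_missing(const std::vector<Chunk>& survivors) {
  return xor_chunks(survivors);
}
```

Because XOR is its own inverse, the same routine serves both encode and single-erasure decode, which is why one optimised XOR kernel covers both paths.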

The current optimisation does improve decoding performance, but only takes advantage of SSE2 SIMD instructions on x86-64 CPUs. The ISA xor_gen() function in the ISA raid sub-directory can take advantage of newer SIMD instructions, and so using this results in a slight improvement to decoding performance when using ISA and the correct conditions to use the optimisation are met.

This pull request removes the region_xor code which is no longer needed, and instead uses the ISA raid xor_gen() function.
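For readers unfamiliar with the calling convention, the sketch below is a portable stand-in that mirrors how I understand ISA-L's `xor_gen(int vects, int len, void **array)` to be called: `array` holds `vects` pointers, the first `vects - 1` are sources and the last one is the destination. `xor_gen_portable` is a hypothetical name and this is not ISA-L code; the real function dispatches to SIMD implementations and, as far as I know, expects 32-byte aligned buffers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for illustration only (not part of ISA-L).
// array[0 .. vects-2] are the source buffers, array[vects-1] is the
// destination. Returns 0 on success and 1 on invalid arguments.
int xor_gen_portable(int vects, int len, void** array) {
  assert(array != nullptr);
  if (vects < 2 || len < 0) {
    return 1;
  }
  auto* dest = static_cast<std::uint8_t*>(array[vects - 1]);
  for (int i = 0; i < len; ++i) {
    std::uint8_t acc = 0;
    for (int v = 0; v < vects - 1; ++v) {
      acc ^= static_cast<const std::uint8_t*>(array[v])[i];
    }
    dest[i] = acc;
  }
  return 0;
}
```

With this layout the caller only needs to build one pointer array per stripe: the surviving chunks first, then the chunk to be produced.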

I have tested this change using the ISA unit tests:

[root@533f6de09e3b build]# bin/unittest_erasure_code_isa
[==========] Running 10 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 10 tests from IsaErasureCodeTest
[ RUN      ] IsaErasureCodeTest.encode_decode
[       OK ] IsaErasureCodeTest.encode_decode (0 ms)
[ RUN      ] IsaErasureCodeTest.minimum_to_decode
[       OK ] IsaErasureCodeTest.minimum_to_decode (0 ms)
[ RUN      ] IsaErasureCodeTest.chunk_size
[       OK ] IsaErasureCodeTest.chunk_size (0 ms)
[ RUN      ] IsaErasureCodeTest.encode
[       OK ] IsaErasureCodeTest.encode (1 ms)
[ RUN      ] IsaErasureCodeTest.sanity_check_k
[       OK ] IsaErasureCodeTest.sanity_check_k (0 ms)
[ RUN      ] IsaErasureCodeTest.isa_vandermonde_exhaustive
[       OK ] IsaErasureCodeTest.isa_vandermonde_exhaustive (53 ms)
[ RUN      ] IsaErasureCodeTest.isa_cauchy_exhaustive
[       OK ] IsaErasureCodeTest.isa_cauchy_exhaustive (53 ms)
[ RUN      ] IsaErasureCodeTest.isa_cauchy_cache_trash
[       OK ] IsaErasureCodeTest.isa_cauchy_cache_trash (217 ms)
[ RUN      ] IsaErasureCodeTest.isa_xor_codec
[       OK ] IsaErasureCodeTest.isa_xor_codec (1 ms)
[ RUN      ] IsaErasureCodeTest.create_rule
[       OK ] IsaErasureCodeTest.create_rule (0 ms)
[----------] 10 tests from IsaErasureCodeTest (326 ms total)

[----------] Global test environment tear-down
[==========] 10 tests from 1 test suite ran. (326 ms total)
[  PASSED  ] 10 tests.

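I also ran the Ceph erasure coding benchmark on an Intel Xeon Gold 6336Y CPU (TOTAL_SIZE=$((4 * 1024 * 1024 * 1024)) erasure-code/bench.sh fplot | tee erasure-code/bench.js).
The results show very similar encoding performance, and a minor improvement in decoding when m = 1 or when the number of erasures is 1. The gain is most visible for reed_sol_van, for example k=10, m=4 with one erasure: 4.099728 at baseline versus 3.81006 with this PR.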

baseline, without this PR
Encode reed_sol_van
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size
isa	reed_sol_van	0.810144	4194304	2	1	1048576	256
isa	reed_sol_van	1.239091	4194304	3	2	1048576	160
isa	reed_sol_van	1.070481	4194304	4	2	1048576	128
isa	reed_sol_van	1.491363	4194304	4	3	1048576	128
isa	reed_sol_van	1.560132	4194304	6	2	1048576	80
isa	reed_sol_van	1.819574	4194304	6	3	1048576	80
isa	reed_sol_van	2.21709	4194304	6	4	1048576	80
isa	reed_sol_van	2.356909	4194304	10	3	1048576	48
isa	reed_sol_van	2.679524	4194304	10	4	1048576	48

with this PR
Encode reed_sol_van
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size
isa	reed_sol_van	0.819062	4194304	2	1	1048576	256
isa	reed_sol_van	1.239847	4194304	3	2	1048576	160
isa	reed_sol_van	1.05034	4194304	4	2	1048576	128
isa	reed_sol_van	1.492579	4194304	4	3	1048576	128
isa	reed_sol_van	1.546993	4194304	6	2	1048576	80
isa	reed_sol_van	1.795169	4194304	6	3	1048576	80
isa	reed_sol_van	2.198842	4194304	6	4	1048576	80
isa	reed_sol_van	2.340041	4194304	10	3	1048576	48
isa	reed_sol_van	2.718091	4194304	10	4	1048576	48

baseline, without this PR
Encode cauchy
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size
isa	cauchy	0.823824	4194304	2	1	1048576	256
isa	cauchy	1.263262	4194304	3	2	1048576	160
isa	cauchy	1.062328	4194304	4	2	1048576	128
isa	cauchy	1.517931	4194304	4	3	1048576	128
isa	cauchy	1.5555	4194304	6	2	1048576	80
isa	cauchy	1.804116	4194304	6	3	1048576	80
isa	cauchy	2.199236	4194304	6	4	1048576	80
isa	cauchy	2.338742	4194304	10	3	1048576	48
isa	cauchy	2.780651	4194304	10	4	1048576	48

with this PR
Encode cauchy
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size
isa	cauchy	0.824136	4194304	2	1	1048576	256
isa	cauchy	1.211994	4194304	3	2	1048576	160
isa	cauchy	1.045071	4194304	4	2	1048576	128
isa	cauchy	1.494725	4194304	4	3	1048576	128
isa	cauchy	1.53728	4194304	6	2	1048576	80
isa	cauchy	1.784101	4194304	6	3	1048576	80
isa	cauchy	2.221333	4194304	6	4	1048576	80
isa	cauchy	2.33441	4194304	10	3	1048576	48
isa	cauchy	2.652096	4194304	10	4	1048576	48

baseline, without this PR
Decode reed_sol_van
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size	Erasures
isa	reed_sol_van	0.899744	4194304	2	1	1048576	256	1
isa	reed_sol_van	1.724666	4194304	3	2	1048576	160	1
isa	reed_sol_van	2.314405	4194304	3	2	1048576	160	2
isa	reed_sol_van	1.920687	4194304	4	2	1048576	128	1
isa	reed_sol_van	2.726822	4194304	4	2	1048576	128	2
isa	reed_sol_van	2.110076	4194304	4	3	1048576	128	1
isa	reed_sol_van	2.825197	4194304	4	3	1048576	128	2
isa	reed_sol_van	3.418693	4194304	4	3	1048576	128	3
isa	reed_sol_van	2.331806	4194304	6	2	1048576	80	1
isa	reed_sol_van	3.413767	4194304	6	2	1048576	80	2
isa	reed_sol_van	2.61269	4194304	6	3	1048576	80	1
isa	reed_sol_van	3.637135	4194304	6	3	1048576	80	2
isa	reed_sol_van	4.215466	4194304	6	3	1048576	80	3
isa	reed_sol_van	2.876675	4194304	6	4	1048576	80	1
isa	reed_sol_van	3.881037	4194304	6	4	1048576	80	2
isa	reed_sol_van	4.636734	4194304	6	4	1048576	80	3
isa	reed_sol_van	5.281095	4194304	6	4	1048576	80	4
isa	reed_sol_van	3.657244	4194304	10	3	1048576	48	1
isa	reed_sol_van	4.894165	4194304	10	3	1048576	48	2
isa	reed_sol_van	5.783632	4194304	10	3	1048576	48	3
isa	reed_sol_van	4.099728	4194304	10	4	1048576	48	1
isa	reed_sol_van	5.413802	4194304	10	4	1048576	48	2
isa	reed_sol_van	6.250787	4194304	10	4	1048576	48	3
isa	reed_sol_van	7.005922	4194304	10	4	1048576	48	4

with this PR
Decode reed_sol_van
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size	Erasures
isa	reed_sol_van	0.885898	4194304	2	1	1048576	256	1
isa	reed_sol_van	1.638085	4194304	3	2	1048576	160	1
isa	reed_sol_van	2.302119	4194304	3	2	1048576	160	2
isa	reed_sol_van	1.871081	4194304	4	2	1048576	128	1
isa	reed_sol_van	2.72915	4194304	4	2	1048576	128	2
isa	reed_sol_van	2.052196	4194304	4	3	1048576	128	1
isa	reed_sol_van	2.836173	4194304	4	3	1048576	128	2
isa	reed_sol_van	3.340072	4194304	4	3	1048576	128	3
isa	reed_sol_van	2.301173	4194304	6	2	1048576	80	1
isa	reed_sol_van	3.440601	4194304	6	2	1048576	80	2
isa	reed_sol_van	2.573077	4194304	6	3	1048576	80	1
isa	reed_sol_van	3.647883	4194304	6	3	1048576	80	2
isa	reed_sol_van	4.226653	4194304	6	3	1048576	80	3
isa	reed_sol_van	2.841784	4194304	6	4	1048576	80	1
isa	reed_sol_van	3.871163	4194304	6	4	1048576	80	2
isa	reed_sol_van	4.646203	4194304	6	4	1048576	80	3
isa	reed_sol_van	5.235489	4194304	6	4	1048576	80	4
isa	reed_sol_van	3.407821	4194304	10	3	1048576	48	1
isa	reed_sol_van	4.936023	4194304	10	3	1048576	48	2
isa	reed_sol_van	5.77874	4194304	10	3	1048576	48	3
isa	reed_sol_van	3.81006	4194304	10	4	1048576	48	1
isa	reed_sol_van	5.40965	4194304	10	4	1048576	48	2
isa	reed_sol_van	6.272648	4194304	10	4	1048576	48	3
isa	reed_sol_van	7.001062	4194304	10	4	1048576	48	4

baseline, without this PR
Decode cauchy
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size	Erasures
isa	cauchy	0.905557	4194304	2	1	1048576	256	1
isa	cauchy	1.889728	4194304	3	2	1048576	160	1
isa	cauchy	2.308956	4194304	3	2	1048576	160	2
isa	cauchy	2.352925	4194304	4	2	1048576	128	1
isa	cauchy	2.733845	4194304	4	2	1048576	128	2
isa	cauchy	2.436355	4194304	4	3	1048576	128	1
isa	cauchy	2.825513	4194304	4	3	1048576	128	2
isa	cauchy	3.347051	4194304	4	3	1048576	128	3
isa	cauchy	2.87374	4194304	6	2	1048576	80	1
isa	cauchy	3.428421	4194304	6	2	1048576	80	2
isa	cauchy	3.141515	4194304	6	3	1048576	80	1
isa	cauchy	3.65206	4194304	6	3	1048576	80	2
isa	cauchy	4.21947	4194304	6	3	1048576	80	3
isa	cauchy	3.341173	4194304	6	4	1048576	80	1
isa	cauchy	3.89945	4194304	6	4	1048576	80	2
isa	cauchy	4.627455	4194304	6	4	1048576	80	3
isa	cauchy	5.234281	4194304	6	4	1048576	80	4
isa	cauchy	4.465539	4194304	10	3	1048576	48	1
isa	cauchy	4.911119	4194304	10	3	1048576	48	2
isa	cauchy	5.784103	4194304	10	3	1048576	48	3
isa	cauchy	4.642966	4194304	10	4	1048576	48	1
isa	cauchy	5.408909	4194304	10	4	1048576	48	2
isa	cauchy	6.407602	4194304	10	4	1048576	48	3
isa	cauchy	7.033372	4194304	10	4	1048576	48	4

with this PR
Decode cauchy
Plugin	Technique	Time	Total Size	k	m	Iteration	Packet Size	Erasures
isa	cauchy	0.883152	4194304	2	1	1048576	256	1
isa	cauchy	1.90028	4194304	3	2	1048576	160	1
isa	cauchy	2.316268	4194304	3	2	1048576	160	2
isa	cauchy	2.235922	4194304	4	2	1048576	128	1
isa	cauchy	2.817762	4194304	4	2	1048576	128	2
isa	cauchy	2.445407	4194304	4	3	1048576	128	1
isa	cauchy	2.839687	4194304	4	3	1048576	128	2
isa	cauchy	3.335637	4194304	4	3	1048576	128	3
isa	cauchy	2.866535	4194304	6	2	1048576	80	1
isa	cauchy	3.417326	4194304	6	2	1048576	80	2
isa	cauchy	3.128992	4194304	6	3	1048576	80	1
isa	cauchy	3.672589	4194304	6	3	1048576	80	2
isa	cauchy	4.229856	4194304	6	3	1048576	80	3
isa	cauchy	3.315932	4194304	6	4	1048576	80	1
isa	cauchy	3.881264	4194304	6	4	1048576	80	2
isa	cauchy	4.661972	4194304	6	4	1048576	80	3
isa	cauchy	5.298398	4194304	6	4	1048576	80	4
isa	cauchy	4.425217	4194304	10	3	1048576	48	1
isa	cauchy	4.934854	4194304	10	3	1048576	48	2
isa	cauchy	5.881886	4194304	10	3	1048576	48	3
isa	cauchy	4.67152	4194304	10	4	1048576	48	1
isa	cauchy	5.547466	4194304	10	4	1048576	48	2
isa	cauchy	6.277136	4194304	10	4	1048576	48	3
isa	cauchy	7.039528	4194304	10	4	1048576	48	4
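To put a number on "minor improvement", here is a small sketch that computes the relative change for a few of the reed_sol_van decode rows with Erasures = 1 from the x86-64 tables above. The `Row` struct and function names are just for this example; the times are copied from the tables.

```cpp
#include <cassert>
#include <vector>

// One benchmark row: baseline time and time with this PR applied.
struct Row {
  int k;
  int m;
  double baseline;
  double with_pr;
};

// Percentage by which the time dropped; positive means the PR run was faster.
double percent_faster(const Row& r) {
  assert(r.baseline > 0.0);
  return (r.baseline - r.with_pr) / r.baseline * 100.0;
}

// reed_sol_van decode rows with Erasures = 1, copied from the tables above.
std::vector<Row> single_erasure_rows() {
  return {
      {2, 1, 0.899744, 0.885898},
      {3, 2, 1.724666, 1.638085},
      {10, 3, 3.657244, 3.407821},
      {10, 4, 4.099728, 3.81006},
  };
}
```

For the k=10, m=4 row this works out to roughly 7%, and to about 1.5% for k=2, m=1. Each figure comes from a single table entry, so the smaller differences may well be within run-to-run noise.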


…) optimisation

Signed-off-by: Jamie Pryde <jamiepry@uk.ibm.com>
@jamiepryde
Copy link
Contributor Author

jenkins test make check

@jamiepryde
Copy link
Contributor Author

jenkins test make check arm64

@jamiepryde jamiepryde requested review from athanatos and markhpc July 19, 2024 12:50
@jamiepryde
Copy link
Contributor Author

Some additional data from the benchmark running in a CentOS 9 container on a MacBook Pro with an M1 Pro CPU (ARM):

Encode
Plugin	Technique	Baseline Time	New Time	Total Size	k	m	Iteration	Packet Size
isa	reed_sol_van	0.529021	0.484336	4194304		2	1	1048576		256
isa	reed_sol_van	1.892235	1.914296	4194304		3	2	1048576		160
isa	reed_sol_van	1.818443	1.903929	4194304		4	2	1048576		128
isa	reed_sol_van	2.302543	2.015336	4194304		4	3	1048576		128
isa	reed_sol_van	2.487124	2.274934	4194304		6	2	1048576		80
isa	reed_sol_van	2.432974	2.424385	4194304		6	3	1048576		80
isa	reed_sol_van	3.048638	2.830223	4194304		6	4	1048576		80
isa	reed_sol_van	3.1322100	3.167969	4194304		10	3	1048576		48
isa	reed_sol_van	3.606801	3.634400	4194304		10	4	1048576		48

isa	cauchy		0.535701	0.482847	4194304		2	1	1048576		256
isa	cauchy		1.984469	1.918598	4194304		3	2	1048576		160
isa	cauchy		1.736924	1.775383	4194304		4	2	1048576		128
isa	cauchy		2.063875	2.000363	4194304		4	3	1048576		128
isa	cauchy		2.301490	2.285416	4194304		6	2	1048576		80
isa	cauchy		2.407579	2.373702	4194304		6	3	1048576		80
isa	cauchy		2.791605	2.841477	4194304		6	4	1048576		80
isa	cauchy		3.0511270	2.993474	4194304		10	3	1048576		48
isa	cauchy		3.562984	3.597210	4194304		10	4	1048576		48

Decode

Plugin	Technique	Baseline Time	New Time	Total Size	k	m	Iteration	Packet Size	Erasures
isa	reed_sol_van	0.824144	0.700814	4194304		2	1	1048576		256		1
isa	reed_sol_van	1.459985	1.389423	4194304		3	2	1048576		160		1
isa	reed_sol_van	2.7416510	2.700939	4194304		3	2	1048576		160		2
isa	reed_sol_van	1.6147340	1.539243	4194304		4	2	1048576		128		1
isa	reed_sol_van	2.986448	2.857361	4194304		4	2	1048576		128		2
isa	reed_sol_van	1.876038	1.882001	4194304		4	3	1048576		128		1
isa	reed_sol_van	3.062447	3.084523	4194304		4	3	1048576		128		2
isa	reed_sol_van	3.434438	3.301625	4194304		4	3	1048576		128		3
isa	reed_sol_van	2.022273	1.910465	4194304		6	2	1048576		80		1
isa	reed_sol_van	3.789703	3.606070	4194304		6	2	1048576		80		2
isa	reed_sol_van	2.421964	2.269290	4194304		6	3	1048576		80		1
isa	reed_sol_van	3.877603	3.774262	4194304		6	3	1048576		80		2
isa	reed_sol_van	4.062006	3.983650	4194304		6	3	1048576		80		3
isa	reed_sol_van	2.588132	2.6101130	4194304		6	4	1048576		80		1
isa	reed_sol_van	4.044973	4.044441	4194304		6	4	1048576		80		2
isa	reed_sol_van	4.281397	4.374214	4194304		6	4	1048576		80		3
isa	reed_sol_van	4.944261	4.867536	4194304		6	4	1048576		80		4
isa	reed_sol_van	3.047575	3.052979	4194304		10	3	1048576		48		1
isa	reed_sol_van	5.031783	4.9418110	4194304		10	3	1048576		48		2
isa	reed_sol_van	5.643340	5.636444	4194304		10	3	1048576		48		3
isa	reed_sol_van	3.446319	3.344000	4194304		10	4	1048576		48		1
isa	reed_sol_van	5.289960	5.313369	4194304		10	4	1048576		48		2
isa	reed_sol_van	5.989640	5.966225	4194304		10	4	1048576		48		3
isa	reed_sol_van	6.874030	6.725350	4194304		10	4	1048576		48		4

isa	cauchy		0.774834	0.696212	4194304		2	1	1048576		256		1
isa	cauchy		2.257658	2.260327	4194304		3	2	1048576		160		1
isa	cauchy		2.7141160	2.7527410	4194304		3	2	1048576		160		2
isa	cauchy		2.473062	2.503689	4194304		4	2	1048576		128		1
isa	cauchy		2.912288	2.976777	4194304		4	2	1048576		128		2
isa	cauchy		2.712970	2.660301	4194304		4	3	1048576		128		1
isa	cauchy		3.027421	3.093792	4194304		4	3	1048576		128		2
isa	cauchy		3.365696	3.416050	4194304		4	3	1048576		128		3
isa	cauchy		3.140242	3.075421	4194304		6	2	1048576		80		1
isa	cauchy		3.679346	3.624555	4194304		6	2	1048576		80		2
isa	cauchy		3.516186	3.265223	4194304		6	3	1048576		80		1
isa	cauchy		4.068511	3.796682	4194304		6	3	1048576		80		2
isa	cauchy		4.453731	3.980017	4194304		6	3	1048576		80		3
isa	cauchy		3.543436	3.526323	4194304		6	4	1048576		80		1
isa	cauchy		4.156210	4.235436	4194304		6	4	1048576		80		2
isa	cauchy		4.514992	4.378862	4194304		6	4	1048576		80		3
isa	cauchy		5.129092	5.1993710	4194304		6	4	1048576		80		4
isa	cauchy		4.392270	4.401733	4194304		10	3	1048576		48		1
isa	cauchy		5.080615	4.974371	4194304		10	3	1048576		48		2
isa	cauchy		5.750088	5.667986	4194304		10	3	1048576		48		3
isa	cauchy		4.797167	4.646359	4194304		10	4	1048576		48		1
isa	cauchy		5.424269	5.289619	4194304		10	4	1048576		48		2
isa	cauchy		6.086570	5.776426	4194304		10	4	1048576		48		3
isa	cauchy		6.831321	6.945226	4194304		10	4	1048576		48		4

@markhpc
Copy link
Member

markhpc commented Jul 25, 2024

@jamiepryde looks like some decent gains in places! One question: I believe isa-l was implicated a while back in one of the various hardware security vulnerabilities that came up (Downfall, perhaps?). Do you happen to know if that could have any effect here?


@markhpc markhpc left a comment


LGTM

@dvanders
Copy link
Contributor

@apeters1971 FYI

@NitzanMordhai
Copy link
Contributor
