Add crypto_sign_pk_from_sk to top-level API#714
Conversation
510f9bb to
bd3181c
Compare
mkannwischer
left a comment
There was a problem hiding this comment.
Thanks @jakemas!
Two blockers:
- The function needs to be added to mldsa_native.h
- I think some of the declassifications need to be removed
I also wonder if we should rename the function to crypto_sign_pk_from_sk to align with other functions?
00ff754 to
7fa39ba
Compare
crypto_sign_pk_from_sk to top-level API
mkannwischer
left a comment
There was a problem hiding this comment.
Thanks @jakemas for the changes! This looks good to me now.
7fa39ba to
d5e1c73
Compare
There was a problem hiding this comment.
Mac Mini (M1, 2020) benchmarks (opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
46487 cycles |
46496 cycles |
1.00 |
ML-DSA-44 sign |
132757 cycles |
132772 cycles |
1.00 |
ML-DSA-44 verify |
47842 cycles |
47842 cycles |
1 |
ML-DSA-65 keypair |
81474 cycles |
81453 cycles |
1.00 |
ML-DSA-65 sign |
219244 cycles |
219163 cycles |
1.00 |
ML-DSA-65 verify |
80118 cycles |
80109 cycles |
1.00 |
ML-DSA-87 keypair |
132625 cycles |
132596 cycles |
1.00 |
ML-DSA-87 sign |
281125 cycles |
281212 cycles |
1.00 |
ML-DSA-87 verify |
130395 cycles |
130395 cycles |
1 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
115791 cycles |
115546 cycles |
1.00 |
ML-DSA-44 sign |
376960 cycles |
377228 cycles |
1.00 |
ML-DSA-44 verify |
120301 cycles |
120526 cycles |
1.00 |
ML-DSA-65 keypair |
200397 cycles |
200179 cycles |
1.00 |
ML-DSA-65 sign |
623689 cycles |
622967 cycles |
1.00 |
ML-DSA-65 verify |
197920 cycles |
198351 cycles |
1.00 |
ML-DSA-87 keypair |
327697 cycles |
327201 cycles |
1.00 |
ML-DSA-87 sign |
791495 cycles |
791268 cycles |
1.00 |
ML-DSA-87 verify |
324898 cycles |
324856 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Mac Mini (M1, 2020) benchmarks (no-opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
115249 cycles |
115268 cycles |
1.00 |
ML-DSA-44 sign |
430426 cycles |
430432 cycles |
1.00 |
ML-DSA-44 verify |
122140 cycles |
122206 cycles |
1.00 |
ML-DSA-65 keypair |
197283 cycles |
197166 cycles |
1.00 |
ML-DSA-65 sign |
700190 cycles |
700196 cycles |
1.00 |
ML-DSA-65 verify |
197625 cycles |
197602 cycles |
1.00 |
ML-DSA-87 keypair |
325694 cycles |
325565 cycles |
1.00 |
ML-DSA-87 sign |
884149 cycles |
884047 cycles |
1.00 |
ML-DSA-87 verify |
329030 cycles |
328865 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
213789 cycles |
213817 cycles |
1.00 |
ML-DSA-44 sign |
782200 cycles |
784050 cycles |
1.00 |
ML-DSA-44 verify |
229962 cycles |
229513 cycles |
1.00 |
ML-DSA-65 keypair |
385235 cycles |
384799 cycles |
1.00 |
ML-DSA-65 sign |
1307962 cycles |
1314461 cycles |
1.00 |
ML-DSA-65 verify |
375315 cycles |
375898 cycles |
1.00 |
ML-DSA-87 keypair |
605968 cycles |
607035 cycles |
1.00 |
ML-DSA-87 sign |
1623238 cycles |
1623460 cycles |
1.00 |
ML-DSA-87 verify |
617584 cycles |
617252 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
Intel Xeon 4th gen (c7i)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
34826 cycles |
35939 cycles |
0.97 |
ML-DSA-44 sign |
120719 cycles |
120718 cycles |
1.00 |
ML-DSA-44 verify |
38106 cycles |
38141 cycles |
1.00 |
ML-DSA-65 keypair |
61681 cycles |
62873 cycles |
0.98 |
ML-DSA-65 sign |
200838 cycles |
201460 cycles |
1.00 |
ML-DSA-65 verify |
62662 cycles |
62704 cycles |
1.00 |
ML-DSA-87 keypair |
94020 cycles |
95260 cycles |
0.99 |
ML-DSA-87 sign |
234791 cycles |
235558 cycles |
1.00 |
ML-DSA-87 verify |
94036 cycles |
94737 cycles |
0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
Intel Xeon 4th gen (c7i) (no-opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
95908 cycles |
96057 cycles |
1.00 |
ML-DSA-44 sign |
349152 cycles |
349268 cycles |
1.00 |
ML-DSA-44 verify |
101404 cycles |
101803 cycles |
1.00 |
ML-DSA-65 keypair |
163022 cycles |
163944 cycles |
0.99 |
ML-DSA-65 sign |
565178 cycles |
565147 cycles |
1.00 |
ML-DSA-65 verify |
165073 cycles |
164858 cycles |
1.00 |
ML-DSA-87 keypair |
267098 cycles |
268355 cycles |
1.00 |
ML-DSA-87 sign |
723299 cycles |
723103 cycles |
1.00 |
ML-DSA-87 verify |
270824 cycles |
271795 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
AMD EPYC 3rd gen (c6a)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
69326 cycles |
69288 cycles |
1.00 |
ML-DSA-44 sign |
185135 cycles |
184700 cycles |
1.00 |
ML-DSA-44 verify |
69807 cycles |
69124 cycles |
1.01 |
ML-DSA-65 keypair |
119325 cycles |
119443 cycles |
1.00 |
ML-DSA-65 sign |
294920 cycles |
295201 cycles |
1.00 |
ML-DSA-65 verify |
115135 cycles |
115118 cycles |
1.00 |
ML-DSA-87 keypair |
202804 cycles |
201971 cycles |
1.00 |
ML-DSA-87 sign |
386640 cycles |
385373 cycles |
1.00 |
ML-DSA-87 verify |
193896 cycles |
193641 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
Intel Xeon 3rd gen (c6i)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
56848 cycles |
56917 cycles |
1.00 |
ML-DSA-44 sign |
180172 cycles |
179672 cycles |
1.00 |
ML-DSA-44 verify |
61090 cycles |
60918 cycles |
1.00 |
ML-DSA-65 keypair |
99403 cycles |
99920 cycles |
0.99 |
ML-DSA-65 sign |
297649 cycles |
296088 cycles |
1.01 |
ML-DSA-65 verify |
99947 cycles |
100006 cycles |
1.00 |
ML-DSA-87 keypair |
153370 cycles |
153471 cycles |
1.00 |
ML-DSA-87 sign |
352161 cycles |
351626 cycles |
1.00 |
ML-DSA-87 verify |
152016 cycles |
151962 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
AMD EPYC 4th gen (c7a)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
41377 cycles |
42240 cycles |
0.98 |
ML-DSA-44 sign |
130261 cycles |
130842 cycles |
1.00 |
ML-DSA-44 verify |
44326 cycles |
44445 cycles |
1.00 |
ML-DSA-65 keypair |
72592 cycles |
72530 cycles |
1.00 |
ML-DSA-65 sign |
212974 cycles |
212854 cycles |
1.00 |
ML-DSA-65 verify |
72910 cycles |
72838 cycles |
1.00 |
ML-DSA-87 keypair |
109007 cycles |
111974 cycles |
0.97 |
ML-DSA-87 sign |
249340 cycles |
252780 cycles |
0.99 |
ML-DSA-87 verify |
110323 cycles |
110309 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
Graviton2
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
116104 cycles |
115816 cycles |
1.00 |
ML-DSA-44 sign |
377627 cycles |
377624 cycles |
1.00 |
ML-DSA-44 verify |
120720 cycles |
120558 cycles |
1.00 |
ML-DSA-65 keypair |
201022 cycles |
200332 cycles |
1.00 |
ML-DSA-65 sign |
624955 cycles |
623594 cycles |
1.00 |
ML-DSA-65 verify |
198562 cycles |
198416 cycles |
1.00 |
ML-DSA-87 keypair |
328680 cycles |
327949 cycles |
1.00 |
ML-DSA-87 sign |
792082 cycles |
792354 cycles |
1.00 |
ML-DSA-87 verify |
325473 cycles |
325213 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
oqs-bot
left a comment
There was a problem hiding this comment.
AMD EPYC 3rd gen (c6a) (no-opt)
Details
| Benchmark suite | Current: d5e1c73 | Previous: 674b8bc | Ratio |
|---|---|---|---|
ML-DSA-44 keypair |
135698 cycles |
135799 cycles |
1.00 |
ML-DSA-44 sign |
541044 cycles |
541174 cycles |
1.00 |
ML-DSA-44 verify |
148924 cycles |
148797 cycles |
1.00 |
ML-DSA-65 keypair |
229510 cycles |
228686 cycles |
1.00 |
ML-DSA-65 sign |
893688 cycles |
889946 cycles |
1.00 |
ML-DSA-65 verify |
239113 cycles |
237363 cycles |
1.01 |
ML-DSA-87 keypair |
373804 cycles |
373747 cycles |
1.00 |
ML-DSA-87 sign |
1107920 cycles |
1107872 cycles |
1.00 |
ML-DSA-87 verify |
387419 cycles |
387864 cycles |
1.00 |
This comment was automatically generated by workflow using github-action-benchmark.
hanno-becker
left a comment
There was a problem hiding this comment.
One question regarding declassification/zeroing of pk in case of failure remaining.
This change will make #713 harder, unfortunately, but we can cross that bridge when we get to it.
1cd4dd7 to
7535659
Compare
Ok, let me know what you think -- I've added the zero of |
79cbd83 to
3f1cb8a
Compare
There was a problem hiding this comment.
Thanks @jakemas! I think some zeroizations are still missing, see comments. Also, the implementation can be streamlined and improved to not leak which check has failed; we also need to document that the function leaks whether the sk was valid (not just in its return value, but in its operation).
3637256 to
4a13c70
Compare
Refactor keygen to use a new function that derives t0 t1 tr pk from rho s1 s2 so that this function can also be called by a utility function pk_to_sk that generates the pk given the sk. We also include ct_memcmp for constant time comparison. Signed-off-by: Jake Massimo <jakemas@amazon.com>
4a13c70 to
42479c1
Compare
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that and declassify only the result on the memcmp which feels a bit more naturual and is easier to justify. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that and declassify only the result on the memcmp which feels a bit more naturual and is easier to justify. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that plus a constant-time selections allows us to not use any declassifications in verification, i.e., we do not leak the verification result through timing. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that plus a constant-time selections allows us to not use any declassifications in verification, i.e., we do not leak the verification result through timing. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that plus a constant-time selections allows us to not use any declassifications in verification, i.e., we do not leak the verification result through timing. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that plus a constant-time selections allows us to not use any declassifications in verification, i.e., we do not leak the verification result through timing. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
In the very end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before thi commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that plus a constant-time selections allows us to not use any declassifications in verification, i.e., we do not leak the verification result through timing. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
At the end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before this commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that, plus a constant-time selection, allows us to not use message-dependent branches. For now, we still declassify the result of the verification itself to allow branching on it, e.g. in crypto_sign_open. This may be further improved in subsequent work, making crypto_sign_open constant flow and leaving a potential declassification of the verification result to the call-sites. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu>
At the end of verify one has to compare the input challenge to the re-computed challenge. If they are equal (and some previous checks on h and z passed), the signature is valid. Currently, our constant-time tests do not declassify the message and we, hence, need to declassify in this final step. Before this commit, the declassification would happen on the recomputed challenge just before the memcmp. Now that a constant-time memcmp was added in #714, we might as well use that; that, plus a constant-time selection, allows us to not use message-dependent branches. For now, we still declassify the result of the verification itself to allow branching on it, e.g. in crypto_sign_open. This may be further improved in subsequent work, making crypto_sign_open constant flow and leaving a potential declassification of the verification result to the call-sites. Signed-off-by: Matthias J. Kannwischer <matthias@kannwischer.eu> Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
| * XOR twice with s, separated by a value barrier, to prevent the compile | ||
| * from dropping the s computation in the loop. | ||
| */ | ||
| return (uint8_t)((mld_value_barrier_u32((uint32_t)r) ^ s) ^ s); |
There was a problem hiding this comment.
This should not have been merged. The value barrier is in the wrong place, allowing the compiler to elide s, and somehow the CT mask from mlkem-native was dropped.
Uh oh!
There was an error while loading. Please reload this page.