Skip to content

ML-KEM refactor to support all three variants and some code cleanup#26054

Closed
vdukhovni wants to merge 6 commits intoopenssl:feature/ml-kemfrom
vdukhovni:ml-kem-wip2
Closed

ML-KEM refactor to support all three variants and some code cleanup#26054
vdukhovni wants to merge 6 commits intoopenssl:feature/ml-kemfrom
vdukhovni:ml-kem-wip2

Conversation

@vdukhovni
Copy link

@vdukhovni vdukhovni commented Nov 25, 2024

Multi-variant ML-KEM

This introduces support for ML-KEM-512 and ML-KEM-1024 using the same
underlying implementation parameterised by a few macros for the
associated types and constants.

KAT tests are added for ML-KEM 512 and 1024, to complement the previous
tests for ML-KEM-768.

MLKEM TLS group codepoints are updated to make the additional KEMs
known to the TLS layer (ML-KEM is not enabled by default).

Most of the underlying code is shared, but the external
generate/encap/decap API is variant-specific, and uses
some implicit buffer sizes and dedicated key types.

Variant-specific) code is compiled separately for each of ML-KEM-512,
ML-KEM-768 and ML-KEM-1024 from a common source file, which contains
short wrappers for the shared code.

There is an additional API that takes an "const ossl_ml_kem_vinfo"
pointer, (obtained from ossl_ml_kem_*_get_vinfo()). This checks the
otherwise implicit sizes and uses (void *) key pointers, converting them
to the appropriate type before calling the right variant-specific API.
This is used to simplify the code in the provider, which other than
initialising the right variant parameters is then share across all the
variants.

The original BoringSSL code was refactored to eliminate the opaque
public/private key structure wrappers, since these structures will be
private to the provider implementation, and are not part of the public
API.

Support for ML-KEM-512 required adding a centered binomial distribution
helper function to deal with η_1 == 3 in that algorithm.

There were some unrelated literal 32's and 64's in the code, these were
replaced by the appropriate macros.

Some additional comments were added to highlight how the code relates to
the ML-KEM specification in FIPS 203.

Some tests were added to excercise also the non-derandomised ML-KEM
APIs, but so far the "vinfo" driven functions above the provider level,
the provider-layer (end-to-end) non-derandomised tests are yet to be
written, but should be fairly tractable.

@vdukhovni vdukhovni added the style: waived exempted from style checks label Nov 25, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 25, 2024
@t8m t8m added approval: review pending This pull request needs review by a committer triaged: feature The issue/pr requests/adds a feature tests: deferred Tests will be added in a subsequent PR (label should be removed when the PR with tests is merged) labels Nov 25, 2024
Copy link
Contributor

@paulidale paulidale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

Copy link
Author

@vdukhovni vdukhovni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. Any idea why the CI bot keeps wanting the PR as affecting FIPS. I don't see any changes that go into the FIPS module.

@paulidale paulidale added hold: needs tests The PR needs tests to be added to it hold: needs documentation The PR needs documentation to be added to it labels Nov 25, 2024
@vdukhovni
Copy link
Author

The KAT tests for ML-KEM-512 and ML-KEM-1024 are I think on @baentsch's plate. There should soon be another PR that is based on this one, and integrates the algorithms into the provider and adds KAT tests.

Michael and I are on the hook for docs, I think I'm doing reviews and polish, but I'll check whether I'm also authoring any.

@paulidale paulidale removed the hold: needs tests The PR needs tests to be added to it label Nov 25, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 26, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 26, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 26, 2024
@vdukhovni vdukhovni removed the severity: fips change The pull request changes FIPS provider sources label Nov 26, 2024
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Nov 26, 2024
@paulidale
Copy link
Contributor

The style guide section 5 says upper case for type names. Many such here are not.

* First precomputed array from Appendix A of FIPS 203, or else Python:
* kNTTRoots = [pow(17, bitreverse(i), p) for i in range(128)]
*/
static const uint16_t kNTTRoots[128] = {
Copy link
Contributor

@paulidale paulidale Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style guide section 4 advises against mixed case names and type based naming.

The names here are readable enough so I guess this isn't a mandatory change.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inherited from Google's C/C++ internal style guide, and so not surprising in code contributed from BoringSSL. If this is important to change, that's certainly an option. The names in question are file-local.

* SHA3-256, then |outlen| must be the correct length for that function.
*/
static __owur
int single_keccak(uint8_t *out, size_t outlen, const uint8_t *in, size_t inlen,
Copy link
Contributor

@paulidale paulidale Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The style guide section 8 recommends DOxygen comments for functions. This was specifically about internal functions -- public ones are documented in POD files.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How internal did you have in mind here? These are static internal, so never seen outside the file in question.

* Rank mismatch should not happen, distinct ML-KEM variants have separate
* dispatch methods.
*/
if (v1->rank != v2->rank)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

v1 != v2 would also work..

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am now comparing public key SHA3-256 hashes, which are the same size across all variants. So any rank mismatch will also be detected, barring catastrophic failure of collision resistance in SHA3-256.

The more interesting question here is whether two "bare" keys (with just the "vinfo" and hash handles, no private or public key material as-yet generated or imported) are to be considered "equal". The ML-KEM implementation returns no, which makes sense to me, like SQL NULL values, we simply don't know whether the keys are really intended to be the same, and "no" seems like the safer answer.

No tests seem to expect equality of "bare" keys...

* https://www.openssl.org/source/license.html
*/

/* Copyright (c) 2024, Google Inc. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever we decide to do with attributing for google, it probably needs to be consistent across files.

Copy link
Member

@slontis slontis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the main advantage of templating (macroing) the code is that it avoids having to do a few mallocs. The cost is reduced readability, in what could be argued is already complex enough (considering that there is not really that much base code).

I see little reason to do this for the ML_DSA case which has quite a few more functions.

Storing a matrix and vector as just pointers to a scalar and then having init functions that sets the scalar ptr is much simpler.
When a key is allocated it mallocs enough space for all its vectors and matrix scalars and then just sets the scalar pointer to point into the malloced blob.
The same thing can be done for temporary stack vectors and matricies by either allocating a scalar blob and/or having a stack based scalar blob and then setting the temp scalar pointers to point to the blob.

@vdukhovni
Copy link
Author

Storing a matrix and vector as just pointers to a scalar and then having init functions that sets the scalar ptr is much simpler. When a key is allocated it mallocs enough space for all its vectors and matrix scalars and then just sets the scalar pointer to point into the malloced blob.

Done in #26172

@vdukhovni
Copy link
Author

The style guide section 5 says upper case for type names. Many such here are not.

In #26172, none remain that are not file-local. Just two are exported by ml_kem.h:

  • ML_KEM_KEY
  • ML_KEM_VINFO

So progress has been made along the lines you suggest, but I am reluctant to avoid short names that private to a given module.

@vdukhovni
Copy link
Author

Superseded by #26172

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval: review pending This pull request needs review by a committer hold: needs documentation The PR needs documentation to be added to it severity: fips change The pull request changes FIPS provider sources style: waived exempted from style checks tests: present The PR has suitable tests present triaged: feature The issue/pr requests/adds a feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants