Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #2902 +/- ##
==========================================
- Coverage 78.11% 78.08% -0.03%
==========================================
Files 679 682 +3
Lines 117949 118273 +324
Branches 16599 16606 +7
==========================================
+ Hits 92130 92348 +218
- Misses 24930 25037 +107
+ Partials 889 888 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This reverts commit fd8991a.
2978380 to
07f1b84
Compare
|
Opened awslabs/aws-lc-verification#180 for the SAW proof wrapper function changes. |
crypto/fipsmodule/ml_dsa/README.md
Outdated
|
|
||
| ## Testing | ||
|
|
||
| We KAT ML-DSA with test vectors obtained from https://github.com/post-quantum-cryptography/KAT within `PQDSAParameterTest.KAT`. We select the KATs for the signing mode `hedged`, which derives the signing private random seed (rho) pseudorandomly from the signer's private key, the message to be signed, and a 256-bit string `rnd` which is generated at random. The `pure` variant of these KATs were used, as they provide test vector inputs for "pure" i.e., non-pre-hashed messages. The KAT files have been modified to insert linebreaks between each test vector set. |
There was a problem hiding this comment.
| We KAT ML-DSA with test vectors obtained from https://github.com/post-quantum-cryptography/KAT within `PQDSAParameterTest.KAT`. We select the KATs for the signing mode `hedged`, which derives the signing private random seed (rho) pseudorandomly from the signer's private key, the message to be signed, and a 256-bit string `rnd` which is generated at random. The `pure` variant of these KATs were used, as they provide test vector inputs for "pure" i.e., non-pre-hashed messages. The KAT files have been modified to insert linebreaks between each test vector set. | |
| We test ML-DSA with Known Answer Test (KAT) vectors obtained from https://github.com/post-quantum-cryptography/KAT within `PQDSAParameterTest.KAT`. We select the KATs for the signing mode `hedged`, which derives the signing private random seed (rho) pseudorandomly from the signer's private key, the message to be signed, and a 256-bit string `rnd` which is generated at random. The `pure` variant of these KATs were used, as they provide test vector inputs for "pure" i.e., non-pre-hashed messages. The KAT files have been modified to insert linebreaks between each test vector set. |
| To test ML-DSA pure, non-deterministic mode, we use `tgId = 19, 21, 23` of sigGen and `tgId = 7, 9, 11` of sigVer. | ||
| To test ML-DSA ExternalMu, non-deterministic mode, we use `tgId = 20, 22, 24` of sigGen and `tgId = 8, 10, 12` of sigVer. | ||
|
|
||
| The test suite includes: |
There was a problem hiding this comment.
Does it include testing the deterministic functions? Are these included in the KATs? The ACVP tests are not KATs? They test the non-deterministic version with random data?
There was a problem hiding this comment.
This is referring to the deterministic variant of ML-DSA (as opposed to deterministically testing the APIs). We do not test the deterministic mode as we do not support the ML-DSA deterministic mode. Internally the API is available (we note this here:
aws-lc/crypto/fipsmodule/ml_dsa/mldsa/sign.c
Lines 709 to 710 in d7cd042
There was a problem hiding this comment.
This is referring to the deterministic variant of ML-DSA
Do you mean "non-deterministic"? If so, I'd like to know how it's tested in ACVP, with KATs or random data making sure it verifies the signature? If KATs, how are they set up to bypass randomness generation?
There was a problem hiding this comment.
There are two versions of ml-dsa, deterministic and non-deterministic, we only implement the non-deterministic variant. To test the non-deterministic variant we do it two ways, one using the ACVP test vectors, and one using the KATs. In both cases, we use the internal APIs which allow the tester to provide the randomness for each signature via the provided seed argument. So there is no need to bypass the random generation, we use the provided seeds and supply them directly the the API to get the expected known result.
crypto/fipsmodule/ml_dsa/README.md
Outdated
| - FIPS 204 compliance: Introduced optional PCT (FIPS 204, Section 4.4, Pairwise | ||
| Consistency) and zeroization of stack buffers as required by (FIPS 204, | ||
| Section 3.6.3, Destruction of intermediate values). | ||
| - Introduction of native backend implementations for AVX2. Those are drop-in |
There was a problem hiding this comment.
Where is the introduction implemented?
There was a problem hiding this comment.
Good catch -- not added in this PR now. Removed this line in afa0b3b
| #if !defined(__ASSEMBLER__) | ||
| #include "../../internal.h" | ||
|
|
||
| // Define inline before mldsa-native headers are included |
There was a problem hiding this comment.
I believe some of these definitions up until line 31 are in sys.h in the case of mlkem. Will they be put there when the assembly back end is imported? It doesn't seem to me they were moved in PR#2903.
There was a problem hiding this comment.
You are correct, they are already in sys.h
aws-lc/crypto/fipsmodule/ml_dsa/mldsa/sys.h
Lines 118 to 136 in 79c9a4c
I removed the duplicate definitions in 79c9a4c
| #define MLD_CONFIG_INTERNAL_API_QUALIFIER static __attribute__((unused)) | ||
| #define MLD_CONFIG_EXTERNAL_API_QUALIFIER static __attribute__((unused)) |
There was a problem hiding this comment.
This was added to prevent errors regarding the unused APIs -- unlike ML-KEM there are some additional API options for ML-DSA. Many of the internals are shared, so there isn't much redundant code, but I have opened an issue upstream pq-code-package/mldsa-native#740 to add to the import options to be more specific about exact APIs that are included.
aws-lc/crypto/fipsmodule/ml_dsa/mldsa/sign.c:1053:5: error: unused function 'mldsa87_verify_pre_hash_shake256' [-Werror,-Wunused-function]
| static MLD_INLINE int mld_sys_check_capability(int cap) | ||
| { | ||
| #if defined(MLD_SYS_X86_64) | ||
| if (cap == 1) // MLD_SYS_CAP_AVX2 |
There was a problem hiding this comment.
Related to the previous question, but also in PR#2903, it doesn't seem like the macro was used.
There was a problem hiding this comment.
I've aligned crypto/fipsmodule/ml_dsa/mldsa_native_config.h to look more like crypto/fipsmodule/ml_kem/mlkem_native_config.h in 79c9a4c and removed
|
|
||
| // Map memset function to the one used by AWS-LC | ||
| #define MLD_CONFIG_CUSTOM_MEMSET | ||
| #if !defined(__ASSEMBLER__) && !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) |
There was a problem hiding this comment.
| #if !defined(__ASSEMBLER__) && !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) | |
| #if !defined(__ASSEMBLER__) |
as in mlkem_native_config.h
There was a problem hiding this comment.
I've aligned crypto/fipsmodule/ml_dsa/mldsa_native_config.h to look more like crypto/fipsmodule/ml_kem/mlkem_native_config.h in 79c9a4c and removed.
|
|
||
| // Map memcpy function to the one used by AWS-LC | ||
| #define MLD_CONFIG_CUSTOM_MEMCPY | ||
| #if !defined(__ASSEMBLER__) && !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) |
There was a problem hiding this comment.
| #if !defined(__ASSEMBLER__) && !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) | |
| #if !defined(__ASSEMBLER__) |
as in mlkem_native_config.h
There was a problem hiding this comment.
I've aligned crypto/fipsmodule/ml_dsa/mldsa_native_config.h to look more like crypto/fipsmodule/ml_kem/mlkem_native_config.h in 79c9a4c and removed.
Resolved conflict: kept deletion of crypto/fipsmodule/ml_dsa/ml_dsa_ref/packing.c
- Removed inline macro definitions from config (now only in sys.h) - Fixed capability function to use mld_sys_cap enum instead of int - Added sys.h includes to all functions needing MLD_INLINE - Removed MLD_CONFIG_MULTILEVEL_NO_SHARED guards from memcpy/memset - Kept __attribute__((unused)) for API qualifiers (needed for ML-DSA) Addresses code review comments about aligning with ML-KEM structure.
| #if defined(BORINGSSL_FIPS_BREAK_TESTS) | ||
| #define MLD_CONFIG_KEYGEN_PCT_BREAKAGE_TEST | ||
| #if !defined(__ASSEMBLER__) && !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) | ||
| #if !defined(__ASSEMBLER__) |
There was a problem hiding this comment.
I'm a bit confused about !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED). It's sometimes included and sometimes not in crypto/fipsmodule/ml_kem/mlkem_native_config.h. I understand its purpose in multilevel build, but here, I don't know why it's sometimes included and sometimes not. Can you explain when it needed to be included and when not in mlkem_native_config.h
There was a problem hiding this comment.
The !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) guard is needed for parameter-set-specific functions but not for generic utility functions.
The guard is needed for mlk_break_pct, mlk_zeroize, and mlk_randombytes because these functions could potentially have parameter-set-specific behavior. In multilevel builds, without the guard, these would be defined multiple times (once per parameter set like MLKEM512/768/1024), which would cause "duplicate symbol" linker errors. The guard ensures they're only defined once in the shared portion of the multilevel build.
The guard is not needed for mlk_memcpy and mlk_memset because these are generic utility functions with identical implementations across all parameter sets. They don't depend on parameter set constants (K, L, ETA, etc.), so they can safely be defined in every compilation unit without conflicts. A simpler guard (!defined(__ASSEMBLER__) only) is sufficient for these functions.
In summary, use the guard for functions that could vary by parameter set or would create duplicate definitions in multilevel builds. We don't use it for truly generic utilities that are identical across all parameter sets.
There was a problem hiding this comment.
Thank you for the satisfactory explanation. However, I am not sure how mld_zeroize, mld_break_pct and mld_randombytes are parameterised. This can wait for a subsequent PR but is to be checked in the case of mlkem as well.
There was a problem hiding this comment.
The !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) guard is needed for parameter-set-specific functions but not for generic utility functions.
It's the opposite, see the documentation:
* If this is set, no MLK_CONFIG_PARAMETER_SET-independent code
* will be included in the build.
So, where the implementation uses level-independent code, that's guarded by !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED).
The config is a special case, because that's fixes across the 'stages' of a monobuild: If you build 512, then 768, then 1024 -- as we do -- then the config will be included once at the beginning, and not again thereafter. The assumption is that you have a consistent configuration, but vary only the level.
That being so, it doesn't matter if you add !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) to the config or not; the code will only be used once in a context where !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED) holds anyway.
I agree this is confusing, and am open to ideas for how we can improve it.
For now, I would suggest that AWS-LC's config does consistently not use !defined(MLD_CONFIG_MULTILEVEL_NO_SHARED). We should also change this for the mlkem-native config used by AWS-LC.
crypto/fipsmodule/ml_dsa/mldsa/sys.h
Outdated
| * __attribute__((unused)). | ||
| */ | ||
|
|
||
| /* Do not use inline for C90 builds*/ |
There was a problem hiding this comment.
I think you need the latest version of mldsa-native, because I see a difference in the definitions of MLD_INLINE
There was a problem hiding this comment.
ok, ran the import script again to pick up newest changes upstream: d37774e
This ensures that parameter-set-specific functions (mld_break_pct, mld_zeroize, mld_randombytes) are only defined once in multilevel builds, while generic utilities (mld_memcpy, mld_memset) remain available. Addresses code review comment about consistency with mlkem_native_config.h
Issues:
Related PRs:
Import mldsa-native
This imports mldsa-native (https://github.com/pq-code-package/mldsa-native) into AWS-LC.
This PR focuses on the minimal configuration of mldsa-native: No assembly and no FIPS-202 code are imported.
mldsa-native is a high-performance, high-assurance C90 implementation of ML-DSA developed under the Post-Quantum Cryptography Alliance (PQCA) and the Linux Foundation. It is a fork of the Dilithium reference implementation.
Import Mechanism
The mldsa-native source code is unmodified and imported using the importer script
crypto/fipsmodule/ml_dsa/importer.sh;the details of the import are in META.yml.A custom config is provided for mldsa-native which in particular includes a small 'compatibility layer' between AWS-LC/OpenSSL and mldsa-native -- see below.
Future imports (C-only)
Future updates of the C-only mldsa-native source tree should happen through a re-import of mldsa-native: That is, (a) delete
crypto/fipsmodule/ml_dsa/mldsaand (b) re-run import.sh. This will re-importmldsa-native/main, though you can set theGITHUB_SHAandGITHUB_REPOSITORYenvironment variables to point to any other mldsa-native repository/fork.Future imports (native code)
Once we have verified meaningful parts of the mldsa-native assembly backends, PRs will be filed to integrate those. The details for this integration are TBD and not necessary to finalize for this PR. The options are (a) extending import.sh to import larger parts of the mldsa-native upstream source tree, including native backends, (b) writing custom backends, backed by sources living in the s2n-bignum source tree. Both is possible and compatible with this PR.
Import Scope
mldsa-native has a C-only version as well as native 'backends' in AVX2 and Neon for high performance. This commit only imports the C-only version. Integration of native backends will be done separately.
mldsa-native offers its own FIPS-202 implementation, including fast versions of batched FIPS-202. However, this commit does not import those, but instead provides glue-code around AWS-LC's own FIPS-202 implementation. The path to leveraging the FIPS-202 performance improvements in mldsa-native would be to integrate them directly into
crypto/fipsmodule/sha.Impact on build
None. No build-files are modified. The multilevel build process remains unchanged.
Internal API changes
3 Removed functions:
Compatibility layer
The configuration file
mldsa_native_config.hincludes a compatibility layer between AWS-LC/OpenSSL and mldsa-native, covering:MLD_CONFIG_KEYGEN_PCTis enabled to include a PCT.BORINGSSL_FIPS_BREAK_TESTSis set,MLD_CONFIG_KEYGEN_PCT_BREAKAGE_TESTis set andmld_break_pctdefined viaboringssl_fips_break_test("MLDSA_PWCT"), to include runtime-breakage of the PCT for testing purposes.BORINGSSL_CONSTANT_TIME_VALIDATIONis set, thenMLD_CONFIG_CT_TESTING_ENABLEDis set to enable valgrind testing.MLD_CONFIG_CUSTOM_ZEROIZEis set andmld_zeroizemapped toOPENSSL_cleanseto use OpenSSL's zeroization function.MLD_CONFIG_CUSTOM_RANDOMBYTESis set andmld_randombytesmapped toRAND_bytesto use AWS-LC's randombytes function.Side-channels
mldsa-native's CI uses a patched version of valgrind to check for various compilers and compile flags that there are no secret-dependent memory accesses, branches, or divisions. The relevant assertions have been kept but are unused unless
MLD_CONFIG_CT_TESTING_ENABLEDis set, which is the case if and only ifBORINGSSL_CONSTANT_TIME_VALIDATIONis set.mldsa-native uses value barriers to block potentially harmful compiler reasoning and optimization. Where standard gcc/clang inline assembly is not available, mldsa-native falls back to a slower 'opt blocker' based on a volatile global -- both are described in ct.h.
Formal Verification
All C-code imported in this commit is formally verified using the C Bounded Model Checker (CBMC) to be free of various classes of undefined behaviour, including out-of-bounds memory accesses and arithmetic overflow; the latter is of particular interest for ML-DSA because of the use of lazy modular reduction for improved performance.
The heart of the CBMC proofs are function contract and loop annotations to the C-code. Function contracts are denoted
__contract__(...)clauses and occur at the time of declaration, while loop contracts are denoted__loop__and follow the for statement.The function contract and loop statements are kept in the source, but removed by the preprocessor so long as the CBMC macro is undefined. Keeping them simplifies the import, and care has been taken to make them readable to the non-expert, and thereby serve as precise documentation of assumptions and guarantees upheld by the code.
FIPS Compliance
mldsa-native unconditionally includes stack zeroization. mldsa-native's default secure memset is replaced by
OPENSSL_cleanse.mldsa-native conditionally includes a PCT, guarded by
MLD_CONFIG_KEYGEN_PCT. This is set in the config if and only ifAWSLC_FIPSis set.While not part of the FIPS standard, the
pk_from_skfunction includes validation of both t0 (low-order bits) and tr (hash of public key) using constant-time comparison functions (mld_ct_memcmp), providing strong assurance of key consistency.Testing
We KAT ML-DSA with test vectors obtained from https://github.com/post-quantum-cryptography/KAT within
PQDSAParameterTest.KAT. We select the KATs for the signing modehedged, which derives the signing private random seed (rho) pseudorandomly from the signer's private key, the message to be signed, and a 256-bit stringrndwhich is generated at random. Thepurevariant of these KATs were used, as they provide test vector inputs for "pure" i.e., non-pre-hashed messages.We also run the ACVP test vectors obtained from https://github.com/usnistgov/ACVP-Server within the three functions
PerMLDSATest.ACVPKeyGen,PerMLDSATest.ACVPSigGenandPerMLDSATest.ACVPSigVer. These correspond to the tests found at ML-DSA-keyGen-FIPS204, ML-DSA-sigGen-FIPS204, and ML-DSA-sigVer-FIPS204.To test ML-DSA pure, non-deterministic mode, we use
tgId = 19, 21, 23of sigGen andtgId = 7, 9, 11of sigVer.To test ML-DSA ExternalMu, non-deterministic mode, we use
tgId = 20, 22, 24of sigGen andtgId = 8, 10, 12of sigVer.Test Results:
Formatting
Code in
crypto/fipsmodule/ml_dsa/mldsais directly imported from mldsa-native and comes with its owncrypto/fipsmodule/ml_dsa/mldsa/.clang-format.Prefix build
The prefix build should not be affected by the import, since no definitions of external linkage are imported (everything is tagged either static directly, or
MLD_EXTERNAL_APIorMLD_INTERNAL_API, both of which are set to static in the context of the import, too).Performance
Performance should be comparable to the previous integration as both are based on C-only code with AWS-LC's FIPS-202 implementation. The fast mldsa-native backends are not yet imported.
Multilevel build
At the core, mldsa-native is currently a 'single-level' implementation of ML-DSA: A build of the main source tree provides an implementation of exactly one of ML-DSA-44/65/87, depending on the
MLD_CONFIG_PARAMETER_SETparameter.To build all security levels, level-specific sources are built 3 times, once per security level, and linked with a single build of the level-independent code. The single-compilation-unit approach pursued by AWS-LC makes this process fairly simple since one merely needs to include the single-compilation-unit file provided by mldsa-native three times, and configure it so that the level-independent code is included only once. The final include moreover #undef'ines all macros defined by mldsa-native, reducing the risk of name clashes with other parts of
crypto/fipsmodule/bcm.c.Note that this process is entirely internal to ml_dsa.c, and does not affect the AWS-LC build.
HashML-DSA: mldsa-native includes lots of HashML-DSA functionality that we dont need in aws-lc. Perhaps we should add config upstream to mldsa-native to choose which of pure/externalmu/hash modes are imported to reduce unused code.
Main differences from reference implementation
mldsa-native is a fork of the ML-DSA reference implementation (Dilithium).
The following gives an overview of the major changes:
License
mldsa-native (everything under
crypto/fipsmodule/ml_dsa/mldsa/**) is imported under the Apache 2.0 license and the ISC license. The LICENSE file remains unchanged.Integration-specific code (everything with direct parent
crypto/fipsmodule/ml_dsa/*) is made under the terms of the Apache 2.0 license and the ISC license.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.