Skip to content

Support Falcon PADDED signature format#530

Merged
thomwiggers merged 28 commits intomasterfrom
sw-falcon-padding
Feb 28, 2024
Merged

Support Falcon PADDED signature format#530
thomwiggers merged 28 commits intomasterfrom
sw-falcon-padding

Conversation

@SWilson4
Copy link
Copy Markdown
Collaborator

@SWilson4 SWilson4 commented Dec 12, 2023

TODO: add a test for interoperability between the two formats (for crypto_sign and crypto_verify only).

Will mark as ready for review once CI is green and this test is added.

This PR adds support for the PADDED Falcon signature format.

For a quick refresher on the different formats, here's what the Falcon reference says:

127 * There are three formats for signatures:
128 *
129 * - COMPRESSED: this is the default format, which yields the shortest
130 * signatures on average. However, the size is variable (see below)
131 * though within a limited range.
132 *
133 * - PADDED: this is the compressed format, but with extra padding bytes
134 * to obtain a fixed size known at compile-time. The size depends only
135 * on the degree; the FALCON_SIG_PADDED_SIZE macro computes it. The
136 * signature process enforces that size by restarting the process
137 * until an appropriate size is obtained (such restarts are uncommon
138 * enough that the computational overhead is negligible).
139 *
140 * - CT: this is a fixed-size format, which furthermore allows
141 * constant-time processing with regard to the signature value and
142 * message data. This is meant for uncommon situations in which
143 * the signed data is secret but of low entropy, and the public key
144 * is not actually public. The CT format is larger than the
145 * COMPRESSED and PADDED formats.

It is possible to implement crypto_sign_verify to accept both COMPRESSED and PADDED formats, which is what I've attempted to do. (The reference code does this if the sig_type parameter is set to 0.) However, this doesn't work as well for non-detached signing/verification. The Falcon spec has a separate format for sm objects to be used in NIST KATs, but this is only defined for non-padded signatures. Rather than modify the COMPRESSED version's crypto_sign_open algorithm to accept PADDED signatures within this format (thereby going outside the spec), I decided to use a simple signature || message format for the PADDED version, since signatures were fixed length and we would be going outside the spec anyway.

KAT vectors were generated for the PADDED format by carefully modifying the reference code's KAT generation script. See https://github.com/SWilson4/falcon-padded-KATs for details.

Manually checked properties

  • Generated Github workflow (run .github/workflows/generate_workflows.py) (new schemes)
  • No stringification macros
  • Output-parameter pointers in functions are on the left
  • Negative return values on failure of API functions (within restrictions of FO transform).
  • variable declarations at the beginning (except in for (size_t i=...)
  • Optional:
    • All integer types are of fixed size, using stdint.h types (including uint8_t instead of unsigned char)
    • Integers used for indexing are of size size_t

@thomwiggers
Copy link
Copy Markdown
Member

This should just overwrite the current Falcon implementations, as those are attempting to implement the padded format already

@SWilson4
Copy link
Copy Markdown
Collaborator Author

Sorry for stealth-editing the description, @thomwiggers; I wanted to see the PR CI results yesterday but didn't have time to write things up properly. Didn't expect anyone to look at it so soon.

This should just overwrite the current Falcon implementations, as those are attempting to implement the padded format already

The current code already implements the compressed version modulo the value of the CRYPTO_BYTES macro, so that format was basically free. Also, the consensus in liboqs discussions was that it would be nice to support both compressed and padded formats, especially as NIST KATs are only provided for non-padded signatures. Is there a reason to prefer not having both in PQClean?

@thomwiggers
Copy link
Copy Markdown
Member

thomwiggers commented Dec 14, 2023

I'm mainly not sure exactly how well variable-length signature formats are supported by PQClean's tests... Even though the API includes siglen.

@SWilson4
Copy link
Copy Markdown
Collaborator Author

SWilson4 commented Dec 18, 2023

Pinging @dstebila to see how we want to approach this from the liboqs side: proceed with only fixed-length for now and perhaps include variable-length later? Revamp the PQClean tests if necessary?

@dstebila
Copy link
Copy Markdown
Member

Pinging @dstebila to see how we want to approach this from the liboqs side: proceed with only fixed-length for now and perhaps include variable-length later? Revamp the PQClean tests if necessary?

This PR adds Falcon*-padded, and also makes some changes to Falcon*, the latter of which is the compressed version, right? So is that compressed version variable length in this PR or not? If it is variable length, then can't we already tell from the CI tests whether PQClean tests have a problem with this?

@@ -4,7 +4,7 @@ type: signature
claimed-nist-level: 1
length-public-key: 897
length-secret-key: 1281
length-signature: 666
length-signature: 752
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange to me that the (even if just maximum) signature length of Falcon-compressed is larger than the padded format.

Copy link
Copy Markdown
Collaborator Author

@SWilson4 SWilson4 Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the padded format, signature generation retries until the signature size is below some fixed constant and then padded to this constant length. The constant is fixed such that the probability of a retry is fairly low. With the compressed format, there are no retries; you get what you get on the first iteration. Hence, a "compressed" signature can be longer than a padded signature but on average is shorter.

The "compressed" name is a bit of misnomer imo, since both padded and compressed version use the same format for representing data. Every padded signature is just a compressed signature with some extra 0s tacked on.

@thomwiggers
Copy link
Copy Markdown
Member

If it is variable length, then can't we already tell from the CI tests whether PQClean tests have a problem with this?

The thing is that accessing the bytes in the signature buffer beyond the length of the signature should probably be categorized as undefined behaviour. The thing with undefined behaviour is that there's a very non-zero chance that these bytes are actually the same each time so if we do something like memcmp(sig_a, sig_b, CRYPTO_BYTES), it still works. We should really go into the tests to check this.

@James-E-A
Copy link
Copy Markdown

James-E-A commented Jan 15, 2024

This PR adds Falcon*-padded, and also makes some changes to Falcon*, the latter of which is the compressed version, right?

Yeah, that looks like it is the case. I currently get a different length (always slightly less than CRYPTO_BYTES) each time I call crypto_sign_signature (and passing the wrong length leads to rejection, to boot); I'm using the latest master commit (1b7f6bf).


I have two dumb questions, though:

PADDED: this is the compressed format, but with extra padding bytes to obtain a fixed size known at compile-time.

Does that mean that these signatures are interoperable? Should a compliant verify() interface accept both "padded" and "compressed" signatures?

CT: this is a fixed-size format, which furthermore allows constant-time processing with regard to the signature value and message data. This is meant for uncommon situations in which the signed data is secret but of low entropy, and the public key is not actually public. The CT format is larger than the COMPRESSED and PADDED formats.

I assume this means (a) the CT format is not compatible with the others, and it would need its own entirely separate verify() function? And (b) that due to the "uncommon" comment by Falcon, it's not a high priority target for PQClean to add?

@SWilson4
Copy link
Copy Markdown
Collaborator Author

Yeah, that looks like it is the case. I currently get a different length (always slightly less than CRYPTO_BYTES) each time I call crypto_sign_signature (and passing the wrong length leads to rejection, to boot); I'm using the latest master commit (1b7f6bf).

I have two dumb questions, though:

PADDED: this is the compressed format, but with extra padding bytes to obtain a fixed size known at compile-time.

Does that mean that these signatures are interoperable? Should a compliant verify() interface accept both "padded" and "compressed" signatures?

The Falcon reference implementation verifies signatures of both formats using the same code. The caller can provide the format, or if none is provided the verification code infers it.

CT: this is a fixed-size format, which furthermore allows constant-time processing with regard to the signature value and message data. This is meant for uncommon situations in which the signed data is secret but of low entropy, and the public key is not actually public. The CT format is larger than the COMPRESSED and PADDED formats.

I assume this means (a) the CT format is not compatible with the others, and it would need its own entirely separate verify() function? And (b) that due to the "uncommon" comment by Falcon, it's not a high priority target for PQClean to add?

The CT format can be handled by the same verify function. In particular, CT-format signatures have a different header byte which allows for easy differentiation. See the falcon_verify_finish function in the reference code here: https://falcon-sign.info/impl/falcon.c.html

I don't think there's any active development going on in liboqs or PQClean to support the CT format, but it's probably in the "nice to have" category.

@SWilson4
Copy link
Copy Markdown
Collaborator Author

Sorry that this PR has been collecting dust for a month or so over the holidays. Now that some relevant downstream work in liboqs is ready to go, I'd like to revive it.

If it is variable length, then can't we already tell from the CI tests whether PQClean tests have a problem with this?

The thing is that accessing the bytes in the signature buffer beyond the length of the signature should probably be categorized as undefined behaviour. The thing with undefined behaviour is that there's a very non-zero chance that these bytes are actually the same each time so if we do something like memcmp(sig_a, sig_b, CRYPTO_BYTES), it still works. We should really go into the tests to check this.

How about this proposal for extending the tests to cover variable-length signatures:

  • Fill the signature buffer to capacity with random bytes prior to signing.
  • Check the tail of the buffer—the portion that should be unused, based on the returned signature length—to ensure that the random bytes have not been modified.

If we want to make sure that there are no reads beyond the valid signature length, then we could perhaps mark the signature buffer as initially "uninitialized" and use valgrind to check for behaviour that depends on it, as is done in liboqs to check for secret-dependent behaviour. This would be a bit more work but I imagine it would still be doable.

Would adding one or both of these tests help to alleviate your concerns about supporting variable-length Falcon signatures @thomwiggers?

@thomwiggers
Copy link
Copy Markdown
Member

How about this proposal for extending the tests to cover variable-length signatures:

  • Fill the signature buffer to capacity with random bytes prior to signing.
  • Check the tail of the buffer—the portion that should be unused, based on the returned signature length—to ensure that the random bytes have not been modified.

This sounds like a good test.

If we want to make sure that there are no reads beyond the valid signature length, then we could perhaps mark the signature buffer as initially "uninitialized" and use valgrind to check for behaviour that depends on it, as is done in liboqs to check for secret-dependent behaviour. This would be a bit more work but I imagine it would still be doable.

Would adding one or both of these tests help to alleviate your concerns about supporting variable-length Falcon signatures @thomwiggers?

I also like this a lot, though there is of course a bit of overhead when we run Valgrind --- we currently run the functest test suite with ASAN and Valgrind already. How exactly does this "mark as uninitialized" work?

@SWilson4
Copy link
Copy Markdown
Collaborator Author

SWilson4 commented Jan 24, 2024

How about this proposal for extending the tests to cover variable-length signatures:

  • Fill the signature buffer to capacity with random bytes prior to signing.
  • Check the tail of the buffer—the portion that should be unused, based on the returned signature length—to ensure that the random bytes have not been modified.

This sounds like a good test.

If we want to make sure that there are no reads beyond the valid signature length, then we could perhaps mark the signature buffer as initially "uninitialized" and use valgrind to check for behaviour that depends on it, as is done in liboqs to check for secret-dependent behaviour. This would be a bit more work but I imagine it would still be doable.
Would adding one or both of these tests help to alleviate your concerns about supporting variable-length Falcon signatures @thomwiggers?

I also like this a lot, though there is of course a bit of overhead when we run Valgrind --- we currently run the functest test suite with ASAN and Valgrind already. How exactly does this "mark as uninitialized" work?

Valgrind provides a VALGRIND_MAKE_MEM_UNDEFINED macro. We use it when testing liboqs to mark all output of randombytes as undefined, thereby (hopefully) detecting behaviour that depends on random (i.e., secret) data. For this application I believe it would be much simpler, as it would only be a matter of marking one buffer as undefined, rather than intercepting all calls to randombytes.

Here's the liboqs usage.

@SWilson4 SWilson4 marked this pull request as ready for review February 20, 2024 22:06
@James-E-A
Copy link
Copy Markdown

James-E-A commented Feb 22, 2024

Am I interpreting this PR correctly?:

  1. The Compressed format will remain the "default" implementation with an unqualified name.
  2. The Padded format is being added, under a new interface.
  3. The Compressed sign_verify() is being augmented to support Padded signatures reliably, without accidental UB.
  4. The Padded sign_verify() will also reliably support verifying Compressed signatures.
  5. The CT format is not supported in any capacity, even verification, at this time. Attempts to verify a CT signature will reliably fail without segfaults or out-of-bounds access.

@SWilson4
Copy link
Copy Markdown
Collaborator Author

Am I interpreting this PR correctly?:

1. The Compressed format will remain the "default" implementation with an unqualified name.

2. The Padded format is being added, under a new interface.

3. The Compressed `sign_verify()` is being augmented to support Padded signatures _reliably_, without accidental UB.

4. The Padded `sign_verify()` will also reliably support verifying Compressed signatures.

5. The `CT` format is not supported in _any_ capacity, even verification, at this time. Attempts to verify a CT signature will _reliably_ fail without segfaults or out-of-bounds access.

Yes indeed, that's an accurate interpretation.

@SWilson4
Copy link
Copy Markdown
Collaborator Author

SWilson4 commented Feb 23, 2024

This is now passing all CI. Notes to make code review easier:

  • The falcon-*-padded code is identical (up to namespacing) to the existing Falcon code, with the exception of pqclean.c, which implements padded format signing and verification.
  • A check for interoperability between Falcon formats has been added to functest. This required a fair bit of hacking around with preprocessor directives and Makefiles.
  • The falcon tests hung in CI for armhf / gcc. I had to disable quite a few in order to get them to complete, including functest and nistkat. :/ This is less than ideal, but looking back at the logs, it seems like the tests have been timing out for a year, since the latest version of Falcon was added, and the workflow has never actually successfully completed.
  • The Windows and MacOS version of astyle were incompatible with the Linux version; when one was satisfied the other would fail. I disabled style checking except on Linux for CI tests.

@SWilson4
Copy link
Copy Markdown
Collaborator Author

Since I skipped a number of hanging CI tests, I thought it best to pre-integrate the new code into liboqs to take advantage of its CI (which includes running on an actual ARM machine). https://github.com/open-quantum-safe/liboqs/commits/sw-falcon-padded/

@thomwiggers
Copy link
Copy Markdown
Member

Yeah, I think the skipping of tests is unfortunate but probably required. PQClean could use a restructuring of its CI, and we would ideally figure out how to get proper ARM-based tests going (on actual hardware, with accelerators).

@thomwiggers thomwiggers merged commit 7707d1b into master Feb 28, 2024
@thomwiggers thomwiggers deleted the sw-falcon-padding branch February 28, 2024 08:57
tniessen added a commit to tniessen/node-pqclean that referenced this pull request Feb 28, 2024
data-wardenb6ym added a commit to data-wardenb6ym/node-pqclean that referenced this pull request Sep 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants