Skip to content

Update Kyber poly_tomsg to fix timing leak (w/ -Os)#320

Merged
rpls merged 1 commit intomasterfrom
kybertimingleak
Dec 19, 2023
Merged

Update Kyber poly_tomsg to fix timing leak (w/ -Os)#320
rpls merged 1 commit intomasterfrom
kybertimingleak

Conversation

@mkannwischer
Copy link
Copy Markdown
Contributor

This (partially) addresses #319.

The function poly_tomsg from the reference implementation of Kyber (which was copied into the M4-optimized implementations) would result in a variable-time udiv instruction operating on secret data when compiled with gcc using -Os. I tried a couple of versions from gcc 11 to gcc 13, but did not see any difference.

This commit updates the m4-specific code to use the patch from pq-crystals/kyber@dda29cc. Note that the code in PQClean has not yet been updated and hence the clean implementation within pqm4 is still vulnerable.

This (partially) addresses #319.

The function poly_tomsg from the reference implementation of Kyber
(which was copied into the M4-optimized implementations) would result
in a variable-time udiv instruction operating on secret data when compiled
with gcc using -Os. I tried a couple of versions from gcc 11 to gcc 13,
but did not see any difference.

This commit updates the m4-specific code to use the patch from
pq-crystals/kyber@dda29cc.
Note that the code in PQClean has not yet been updated and hence the
clean implementation within pqm4 is still vulnerable.
@rpls
Copy link
Copy Markdown
Contributor

rpls commented Dec 19, 2023

No difference between the GCC 11-13, with OPT_SIZE=1, the code uses a normal multiplication, without size optimization, the inner loop is unrolled and the multiplication implemented with rsb instructions.

@rpls rpls merged commit d98a162 into master Dec 19, 2023
smuellerDD added a commit to smuellerDD/leancrypto that referenced this pull request Jan 25, 2024
This is a port of issue PQClean/PQClean#534

Description from mupq/pqm4#320:

The function poly_tomsg from the reference implementation of Kyber
(which was copied into the M4-optimized implementations) would result
in a variable-time udiv instruction operating on secret data when
compiled with gcc using -Os. I tried a couple of versions from gcc 11
to gcc 13, but did not see any difference.

Description from mupq/pqm4#319

This bit of code or similar is used in your various implementations of
Kyber to compress a polynomial ring element into a (secret) message:

```
 t = (((a->coeffs[8 * i + j] << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
```

To do so, it performs a division by Q that might not necessarily compile
to a multiplication instruction: looking at the output of some C
compilers using https://godbolt.org/z/sKn3TKKGq and
https://godbolt.org/z/8GqKoTfYh for example, a division instruction is
emitted even when -O3 is specified. Should a division instruction be
emitted, its execution time would likely be variable and leak
information about its secret input.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
@mkannwischer mkannwischer deleted the kybertimingleak branch April 16, 2024 01:24
infra-riderju74 added a commit to infra-riderju74/leancrypto that referenced this pull request Sep 28, 2025
This is a port of issue PQClean/PQClean#534

Description from mupq/pqm4#320:

The function poly_tomsg from the reference implementation of Kyber
(which was copied into the M4-optimized implementations) would result
in a variable-time udiv instruction operating on secret data when
compiled with gcc using -Os. I tried a couple of versions from gcc 11
to gcc 13, but did not see any difference.

Description from mupq/pqm4#319

This bit of code or similar is used in your various implementations of
Kyber to compress a polynomial ring element into a (secret) message:

```
 t = (((a->coeffs[8 * i + j] << 1) + KYBER_Q / 2) / KYBER_Q) & 1;
```

To do so, it performs a division by Q that might not necessarily compile
to a multiplication instruction: looking at the output of some C
compilers using https://godbolt.org/z/sKn3TKKGq and
https://godbolt.org/z/8GqKoTfYh for example, a division instruction is
emitted even when -O3 is specified. Should a division instruction be
emitted, its execution time would likely be variable and leak
information about its secret input.

Signed-off-by: Stephan Mueller <smueller@chronox.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants