Wrap pointers to s2n-bignum functions - delocator fix#2165
Conversation
so that the delocator does not think they're local_target while they have :got: references.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2165 +/- ##
==========================================
- Coverage 78.97% 78.97% -0.01%
==========================================
Files 611 611
Lines 105552 105569 +17
Branches 14950 14949 -1
==========================================
+ Hits 83361 83373 +12
- Misses 21538 21542 +4
- Partials 653 654 +1 ☔ View full report in Codecov by Sentry. |
|
Confirmed CI is now passing with this change applied on #2148 |
|
Created #2166 to test this fix. |
|
Confirmed that FIPS static builds are all passing on #2166 |
|
Merging this will unblock: #2148 which are required for FIPS compliance. |
| // The wrapper functions are needed for FIPS static build. | ||
| // Otherwise, initializing ec_nistp_meth with pointers to s2n-bignum | ||
| // functions directly generates :got: references that are also thought | ||
| // to be local_target by the delocator. |
There was a problem hiding this comment.
What is the difference between defining this wrapper function and how s2n-bignum defines bignum_add_p384? Is my mental model of C functions are equivalent to functions defined in the assembly files wrong?
There was a problem hiding this comment.
It's a valid question. This change stemmed from the observation that those 3 functions that were wrapped, when they were being assigned to the p521_methods struct in crypto/fipsmodule/ec/p521.c#L292-L296, they were creating the following (commented) assembly instructions that were replaced by delocate.go by the ones right underneath in <build-folder>/crypto/fipsmodule/bcm-delocated.S.
.Lp521_methods_init_local_target:
p521_methods_init:
...
// WAS adrp x10, :got:bignum_add_p521
adr x10, .Lbignum_add_p521_local_target
// WAS adrp x9, :got:bignum_sub_p521
adr x9, .Lbignum_sub_p521_local_target
// WAS adrp x6, :got:bignum_neg_p521
adr x6, .Lbignum_neg_p521_local_target
In the case of gcc compilations (not all versions, it depends which ones get lucky), the target address of the adr instructions was further than the admissible 1MB.
The observation that other s2n-bignum functions assigned to the same method such as bignum_sqr_p521_alt was accessed via bignum_sqr_p521_selector which is defined in the header file
bcm-delocated.S:
// WAS adrp x7, bignum_sqr_p521_selector
adr x7, .Lbignum_sqr_p521_selector_local_target
and not too far from it
.align 2
.p2align 4,,11
.type bignum_sqr_p521_selector, %function
.Lbignum_sqr_p521_selector_local_target:
bignum_sqr_p521_selector:
.LFB423:
.cfi_startproc
// WAS adrp x2, :got:OPENSSL_armcap_P
sub sp, sp, 128
stp x0, x30, [sp, #-16]!
bl .LOPENSSL_armcap_P_addr
mov x2, x0
ldp x0, x30, [sp], #16
add sp, sp, 128
// WAS ldr x2, [x2, #:got_lo12:OPENSSL_armcap_P]
ldr w2, [x2]
tst w2, 28672
beq .L453
// WAS b bignum_sqr_p521_alt
b .Lbignum_sqr_p521_alt_local_target
the b (branch) instructions on the last line doesn’t suffer from the same range limit as the adr instruction; b has a range of of +/- 32MB, while adr has a range of +/- 1MB.
That's why I thought that wrapping the s2n-bignum function which may be far with a "nearby" static function may result in the same pattern as with bignum_sqr_p521_selector
Before this PR, i.e. adding the wrappers, the initialiser of p521_methods was compiled to get the address of, e.g., bignum_add_p521 from an unknown location and assign it to a register, but delocator.go found that label within the module and considered it local, but was unluckily too far to be treated like other local labels.
Adding a wrapper function caused the address of a "closer" function to be put in the address and a branch instruction to that target, which the delocator treated still as local but didn't need to change the b instruction because it doesn't cause relocation like adrp does.
|
This is probably expected, but I did try building this branch for x86-64 using gcc-14 and it's still failing as before: |
On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to aws#2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to aws#2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to aws#2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to aws#2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to aws#2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license. -------- On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in aes/mode_wrappers.c, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs (specifically #2176). This commit preventatively fixes the issue by adding function pointer trampolines to crypto/fipsmodule/aes/mode_wrappers.c: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to #2165. Mid/Long-term, it should be considered whether the delocator can introduce the necessary indirections automatically. Signed-off-by: Hanno Becker <beckphan@amazon.co.uk>
…#2920) ### Service Indicator: Add error call trampoline to avoid delocator issue ### Problem On AArch64, the delocator can patch up PC-relative addressing only if the references can be computed with a PC-relative offset in the range (-1MB, 1MB). For the `OPENSSL_PUT_ERROR` macro calls in `crypto/fipsmodule/service_indicator/service_indicator.c`, the `__FILE__` string literal references were exceeding the ARM64 ADR instruction's ±1MB range limit as the FIPS module grew larger. This manifested as build failures in ARM64 FIPS builds: ``` error: fixup value out of range adr x3, .L.str.192 ^ ``` The error originated from string literals being placed too far from their usage sites (e.g., string at line ~365493, first use at line ~2597 in bcm-delocated.S). ### Solution This commit fixes the issue by adding a non-inlined trampoline function for error calls: ```c #if defined(_MSC_VER) __declspec(noinline) #else __attribute__((noinline)) #endif static void put_set_thread_local_error(void) { OPENSSL_PUT_ERROR(CRYPTO, ERR_R_INTERNAL_ERROR); } ``` This ensures the `__FILE__` string literal stays close to its usage site and can be addressed using a PC-relative offset within the allowed range. ### Key Distinction from Function Pointer Trampolines This fix differs from previous function pointer trampolines (e.g., #2165), which wrapped assembly functions and used branch instructions to call them. Here, the pointer is not to a function but to a string literal (`__FILE__` from error macros), so instead we wrap the function that uses the pointer to keep the string reference close to its usage. The `noinline` directive is critical - without it, the compiler will inline the function, placing the `__FILE__` reference back at the original call sites and recreating the distance problem. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license. Co-authored-by: Justin W Smith <103147162+justsmth@users.noreply.github.com>
…2919) ### Delocate AES, GCM, and cipher wrapper functions On AArch64, the delocator can patch up the computation of function pointers only if the pointers can be computed with a PC-relative offset in the range (-1MB, 1MB). For the function pointer computations in `crypto/fipsmodule/aes/mode_wrappers.c`, `crypto/fipsmodule/cipher/e_aes.c`, and `crypto/fipsmodule/modes/gcm.c`, this bounds condition is about to be violated by further code additions to AWS-LC, as witnessed in AES-unrelated PRs. This commit preventatively fixes the issue by adding function pointer trampolines to these files: These are stub functions immediately branching into the desired assembly routines, but close enough to the C code computing their address to ensure that their addresses will be computable using a PC-relative offset. This fix is similar to previous delocator fixes addressing the same AArch64 PC-relative offset limitation, see #2165, #2294 for examples. ### AWS-LC-Verification As there are SAW proofs for AES GCM, these changes affect the proofs ([formal-verification / fv-saw-x86_64-aes-gcm (pull_request)](https://github.com/aws/aws-lc/actions/runs/20380843167/job/58570729683?pr=2919)) and require changes in aws-lc-verification to continue proof support -- this has been added in awslabs/aws-lc-verification#180. ### Testing: Stability of the fix was tested in #2903 which added ~10,000 lines of additional AVX2 backend. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.
Issues:
Addresses CryptoAlg-2899
Description of changes:
The original issue was caused by assigning s2n-bignum functions directly to the members of a
methodsstruct. This assignment usesadrpinstructions at the assembly level. The delocator (in FIPS static build) replacesadrpwithadrwhich has only +/- 1MB range of addressing. In some versions of gcc, the compiled code ends up exceeding that range between the instruction and its target address when it's in another assembly file such as s2n-bignum functions.Wrapping those pointers to s2n-bignum functions does the following
methodsstructs because they're in the same compilation unit.bbranch instruction to the target label. The branch instruction has an address range of +/- 32MB. (The caller of the method member function will be responsible for the link register, i.e. where the control flow will continue when the function returns.)Call-outs:
This suggests that we need to be careful not to assign external assembly functions directly to pointers (so that
adrpis not used with them to store their relative address in a register) and we need to ensure they are called from another function that is nearby by wrapping them.Testing:
On Graviton3, FIPS static build with gcc-11 which used to fail #2148 with
is passing with this change.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.