-
-
Notifications
You must be signed in to change notification settings - Fork 11k
Fix memfail asan issues #28195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix memfail asan issues #28195
Conversation
The new malloc failure test caught an asan error in this code: Direct leak of 40 byte(s) in 1 object(s) allocated from: 2025-08-07T03:22:20.3655117Z #0 0x7fb88d8fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 2025-08-07T03:22:20.3655796Z #1 0x5584f0e4670a in CRYPTO_malloc crypto/mem.c:211 2025-08-07T03:22:20.3656291Z #2 0x5584f0e4679d in CRYPTO_zalloc crypto/mem.c:231 2025-08-07T03:22:20.3657040Z #3 0x5584f11c4c10 in EVP_RAND_CTX_new crypto/evp/evp_rand.c:353 2025-08-07T03:22:20.3657656Z openssl#4 0x5584f0e93b27 in rand_new_drbg crypto/rand/rand_lib.c:666 2025-08-07T03:22:20.3658289Z openssl#5 0x5584f0e949d0 in rand_get0_public crypto/rand/rand_lib.c:843 2025-08-07T03:22:20.3658914Z openssl#6 0x5584f0e9305b in RAND_bytes_ex crypto/rand/rand_lib.c:490 2025-08-07T03:22:20.3659486Z openssl#7 0x5584f0b2405f in SSL_CTX_new_ex ssl/ssl_lib.c:4191 2025-08-07T03:22:20.3660183Z openssl#8 0x5584f0ae313c in create_ssl_ctx_pair test/helpers/ssltestlib.c:958 2025-08-07T03:22:20.3660871Z openssl#9 0x5584f0adeaf6 in do_handshake test/handshake-memfail.c:56 2025-08-07T03:22:20.3661539Z openssl#10 0x5584f0adee50 in test_alloc_failures test/handshake-memfail.c:125 2025-08-07T03:22:20.3662161Z openssl#11 0x5584f0cd9da8 in run_tests test/testutil/driver.c:342 2025-08-07T03:22:20.3662664Z openssl#12 0x5584f0cda9e5 in main test/testutil/main.c:31 2025-08-07T03:22:20.3663450Z openssl#13 0x7fb88d42a1c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f) 2025-08-07T03:22:20.3664630Z openssl#14 0x7fb88d42a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f) 2025-08-07T03:22:20.3666608Z openssl#15 0x5584f0ade864 in _start (/home/runner/work/openssl/openssl/test/handshake-memfail+0x22a864) (BuildId: 19659a44d8bed2c082918d25425f77e3a98df534) It occurs because when rand_get0_public/rand_get0_private sets an EVP_RAND_CTX object in its thread local storage, it neglects to check the return code of the operation, which may fail when the associated sparse array is expanded. fix it by checking the return code and failing the get0_[public|private] operation so the failure is graceful. Fixes openssl/project#1315
during memfail testing: https://github.com/openssl/openssl/actions/runs/16794088536/job/47561223902 We get lots of test failures in ossl_rcu_read_lock. This occurs because we have a few cases in the read lock path that attempt mallocs, which, if they fail, trigger an assert or a silent failure, which isn't really appropriate. We should instead fail gracefully, by informing the caller that the lock failed, like we do for CRYPTO_THREAD_read_lock. Fortunately, these are all internal apis, so we can convert ossl_rcu_read_lock to return an int indicating success/failure, and fail gracefully during the test, rather than hitting an assert abort. Fixes openssl/project#1315
393062b to
aa70a06
Compare
beldmit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| case DO_LOOKUP: | ||
| ossl_ht_read_lock(m_ht); | ||
| if (!ossl_ht_read_lock(m_ht)) | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we want to fail here? i.e. set worker_exits[num] and return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could, but we don't have to. The lookups in this test just find an element in the hash table and confirm that its supposed to be there. Failure to get a lock in this case just represents a transient failure that doesn't impact the outcome of what the test is actually testing. It would be a problem if we could never get the lock for the duration of the test, that might be a problem, as it means we never tested a lookup operation. But a single failure isn't really a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this ever fail? Are there legitimate reasons why it might do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
other than a transient malloc failure outside of the memfail test, no, I can't think of any. But a transient memory allocation failure on a heavily loaded system seems like it might be possible.
|
This pull request is ready to merge |
|
merged to master, thank you! |
The new malloc failure test caught an asan error in this code: Direct leak of 40 byte(s) in 1 object(s) allocated from: 2025-08-07T03:22:20.3655117Z #0 0x7fb88d8fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69 2025-08-07T03:22:20.3655796Z #1 0x5584f0e4670a in CRYPTO_malloc crypto/mem.c:211 2025-08-07T03:22:20.3656291Z #2 0x5584f0e4679d in CRYPTO_zalloc crypto/mem.c:231 2025-08-07T03:22:20.3657040Z #3 0x5584f11c4c10 in EVP_RAND_CTX_new crypto/evp/evp_rand.c:353 2025-08-07T03:22:20.3657656Z #4 0x5584f0e93b27 in rand_new_drbg crypto/rand/rand_lib.c:666 2025-08-07T03:22:20.3658289Z #5 0x5584f0e949d0 in rand_get0_public crypto/rand/rand_lib.c:843 2025-08-07T03:22:20.3658914Z #6 0x5584f0e9305b in RAND_bytes_ex crypto/rand/rand_lib.c:490 2025-08-07T03:22:20.3659486Z #7 0x5584f0b2405f in SSL_CTX_new_ex ssl/ssl_lib.c:4191 2025-08-07T03:22:20.3660183Z #8 0x5584f0ae313c in create_ssl_ctx_pair test/helpers/ssltestlib.c:958 2025-08-07T03:22:20.3660871Z #9 0x5584f0adeaf6 in do_handshake test/handshake-memfail.c:56 2025-08-07T03:22:20.3661539Z #10 0x5584f0adee50 in test_alloc_failures test/handshake-memfail.c:125 2025-08-07T03:22:20.3662161Z #11 0x5584f0cd9da8 in run_tests test/testutil/driver.c:342 2025-08-07T03:22:20.3662664Z #12 0x5584f0cda9e5 in main test/testutil/main.c:31 2025-08-07T03:22:20.3663450Z #13 0x7fb88d42a1c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f) 2025-08-07T03:22:20.3664630Z #14 0x7fb88d42a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 282c2c16e7b6600b0b22ea0c99010d2795752b5f) 2025-08-07T03:22:20.3666608Z #15 0x5584f0ade864 in _start (/home/runner/work/openssl/openssl/test/handshake-memfail+0x22a864) (BuildId: 19659a44d8bed2c082918d25425f77e3a98df534) It occurs because when rand_get0_public/rand_get0_private sets an EVP_RAND_CTX object in its thread local storage, it neglects to check the return code of the operation, which may fail when the associated sparse array is expanded. fix it by checking the return code and failing the get0_[public|private] operation so the failure is graceful. Fixes openssl/project#1315 Reviewed-by: Paul Yang <paulyang.inf@gmail.com> Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com> Reviewed-by: Paul Dale <ppzgs1@gmail.com> (Merged from #28195)
during memfail testing: https://github.com/openssl/openssl/actions/runs/16794088536/job/47561223902 We get lots of test failures in ossl_rcu_read_lock. This occurs because we have a few cases in the read lock path that attempt mallocs, which, if they fail, trigger an assert or a silent failure, which isn't really appropriate. We should instead fail gracefully, by informing the caller that the lock failed, like we do for CRYPTO_THREAD_read_lock. Fortunately, these are all internal apis, so we can convert ossl_rcu_read_lock to return an int indicating success/failure, and fail gracefully during the test, rather than hitting an assert abort. Fixes openssl/project#1315 Reviewed-by: Paul Yang <paulyang.inf@gmail.com> Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com> Reviewed-by: Paul Dale <ppzgs1@gmail.com> (Merged from #28195)
The newly added handshake-memfail test has flagged several asan and assert errors that arise from various memory failure conditions. This PR addresses those so that the test fails gracefully instead of crashing.