Skip to content

Conversation

@levitte
Copy link
Member

@levitte levitte commented Dec 10, 2020

DECODER EVP_PKEY: Don't store all the EVP_KEYMGMTs

OSSL_DECODER_CTX_new_by_EVP_PKEY() would keep copies of all the
EVP_KEYMGMTs it finds.
This turns out to be fragile in certain circumstances, so we switch to
fetch the appropriate EVP_KEYMGMT when it's time to construct an
EVP_PKEY from the decoded data instead. This has the added benefit
that we now actually use the property query string that was given by
the caller for these fetches.

Fixes #13503

MSBLOB & PVK: Make it possible to write EVP_PKEYs with provided internal key

So far, the MSBLOB and PVK writers could only handle EVP_PKEYs with
legacy internal keys.

Specially to be able to compile the loader_attic engine, we use the C
macro OPENSSL_NO_PROVIDER_CODE to avoid building the provider specific
things when we don't need them. The alternative is to suck half of
crypto/evp/ into loader_attic, and that's just not feasible.

Fixes #13503

DECODER: Adjust the library context of keys in our decoders

Because decoders are coupled with keymgmts from the same provider,
ours need to produce provider side keys the same way. Since our
keymgmts create key data with the provider library context, so must
our decoders.

We solve with functions to adjust the library context of decoded keys,
and use them.

CORE: Separate OSSL_PROVIDER activation from OSSL_PROVIDER reference

This introduces a separate activation counter, and the function
ossl_provider_deactivate() for provider deactivation.

Something to be noted is that if the reference count goes down to
zero, we don't care if the activation count is non-zero (i.e. someone
forgot to call ossl_provider_deactivate()). Since there are no more
references to the provider, it doesn't matter.
The important thing is that deactivation doesn't remove the provider
as long as there are references to it, for example because there are
live methods associated with that provider, but still makes the
provider unavailable to create new methods from.

Fixes #13503
Fixes #12157

EVP: Fix memory leak in EVP_PKEY_CTX_dup()

In most error cases, EVP_PKEY_CTX_dup() would only free the EVP_PKEY_CTX
without freeing the duplicated contents.

Fixes #13503

@levitte levitte added branch: master Applies to master branch approval: review pending This pull request needs review by a committer severity: urgent Fixes an urgent issue (exempt from 24h grace period) labels Dec 10, 2020
@levitte levitte added this to the 3.0.0 beta1 milestone Dec 10, 2020
@levitte
Copy link
Member Author

levitte commented Dec 10, 2020

I'm getting issues with the tests on my laptop, like this:

        # INFO:  @ ../master/test/evp_test.c:3166
        # ../../../master/test/recipes/30-test_evp_data/evppkey_dsa.txt:269: Expected SIGNATURE_MISMATCH got DIGESTSIGNFINAL_ERROR
        # 804BC250547F0000:error:0300009C:digital envelope routines:inner_evp_generic_fetch:unsupported algorithm:../master/crypto/evp/evp_fetch.c:345:Global default library context, Algorithm (SHA512 : 150), Properties (<null>)
        # 804BC250547F0000:error:01800078:bignum routines:BN_generate_dsa_nonce:no suitable digest:../master/crypto/bn/bn_rand.c:279:
        # 804BC250547F0000:error:05080003:dsa routines:dsa_sign_setup:BN lib:../master/crypto/dsa/dsa_ossl.c:310:
        # 804BC250547F0000:error:05080003:dsa routines:dsa_do_sign_int:BN lib:../master/crypto/dsa/dsa_ossl.c:177:
        ...
        not ok 1 - ../../../master/test/recipes/30-test_evp_data/evppkey_dsa.txt
    not ok 1 - run_file_tests
../../util/wrap.pl ../../test/evp_test -config ../../../master/test/default-and-legacy.cnf ../../../master/test/recipes/30-test_evp_data/evppkey_dsa.txt => 1
not ok 21 - running evp_test -config ../../../master/test/default-and-legacy.cnf evppkey_dsa.txt
#   Failed test 'running evp_test -config ../../../master/test/default-and-legacy.cnf evppkey_dsa.txt'
#   at ../master/test/recipes/30-test_evp.t line 130.

This makes me think of the recent issues with seed stuff, but I thought @paulidale had fixed it and the fix was merged. I have no idead what's going on...

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

I have no idea what's going on...

I got a better idea now, and it has nothing to do with the seed stuff.

Our decoders must adjust the library context of the created keys to match what would come out of the KEYMGMT if it was generated.

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

Sanitizers still not happy... that looks like a leak elsewhere, though.

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

This is becoming a generic fixup PR... I'll change title and description accordingly

@levitte levitte changed the title DECODER EVP_PKEY: Don't store all the EVP_KEYMGMTs + PVK/MSBLOB handling of provided EVP_PKEYs [URGENT] Fix 13503 Dec 11, 2020
@t8m
Copy link
Member

t8m commented Dec 11, 2020

Sanitizers still not happy... that looks like a leak elsewhere, though.

But the leak or at least its appearance seems to be triggered by this PR.

@levitte levitte changed the title [URGENT] Fix 13503 [URGENT] Fix sanitizer build Dec 11, 2020
@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

But the leak or at least its appearance seems to be triggered by this PR.

Yeah, something's up, I agree.

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

Found 1 leak, I hope that's it.

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

I did a sanitizer run on my laptop, using the exact same setup (apart from no-asm) as the sanitizer build, and it passed. I do see that it didn't quite succeed on Github CI, there's a failure in test/evp_extra_test2.c, which I can't reproduce for the moment.

I'd be grateful if anyone could take a look at that last bit and help me figure it out

UPDATE: Maybe it's a no-asm problem?

@mattcaswell
Copy link
Member

I'd be grateful if anyone could take a look at that last bit and help me figure it out

I can periodically reproduce it with OPENSSL_TEST_RAND_ORDER=0

@mattcaswell
Copy link
Member

With this patch I can reliably reproduce this failure even in a build without sanitizers:

diff --git a/test/evp_extra_test2.c b/test/evp_extra_test2.c
index 9181061247..a78d5059fe 100644
--- a/test/evp_extra_test2.c
+++ b/test/evp_extra_test2.c
@@ -278,9 +278,9 @@ int setup_tests(void)
         return 0;
     }
 
+    ADD_TEST(test_d2i_PrivateKey_ex);
     ADD_TEST(test_alternative_default);
     ADD_ALL_TESTS(test_d2i_AutoPrivateKey_ex, OSSL_NELEM(keydata));
-    ADD_TEST(test_d2i_PrivateKey_ex);
 
     return 1;
 }

It looks to me like an instance of #12157

@levitte
Copy link
Member Author

levitte commented Dec 11, 2020

Thanks for the hint, now I can go in and see what can be done about it.

@paulidale
Copy link
Contributor

paulidale commented Dec 12, 2020

There is no need for trial and error over the test order randomisation.

When running with OPENSSL_TEST_RAND_ORDER=0, a test failure will print "random seed = nnnnnnn".
Rerunning with OPENSSL_TEST_RAND_ORDER=nnnnnnn will execute the tests in the exact same order.
This should remain true even across platforms and operating systems.

I've created #13672 to make the randomisation failure text more obvious and to document how to use the env variable.

@levitte levitte force-pushed the fix-13503 branch 2 times, most recently from ec054ce to bfbe5bf Compare December 16, 2020 14:36
@levitte
Copy link
Member Author

levitte commented Dec 16, 2020

A new commit was added, which fixes #12157, and hopefully make the sanitizers happy.

("It works on my laptop!" -- famous last words)

@levitte
Copy link
Member Author

levitte commented Dec 16, 2020

This is urgent, so it would be nice to get a review soon.

@t8m
Copy link
Member

t8m commented Dec 16, 2020

It seems the old failures in the sanitizer are now fixed. There is a new one - leak in the provided test. However this is not a regression caused by this PR as the new failure is there on master already. I did not yet find out which merged PR caused it.

@levitte
Copy link
Member Author

levitte commented Dec 16, 2020

It seems the old failures in the sanitizer are now fixed. There is a new one - leak in the provided test. However this is not a regression caused by this PR as the new failure is there on master already. I did not yet find out which merged PR caused it.

I found that one and added a commit that I hope fixes the issue.

@levitte
Copy link
Member Author

levitte commented Dec 16, 2020

Now it works on my laptop.

@levitte
Copy link
Member Author

levitte commented Dec 16, 2020

Yay, sanitizer CI passes 😄

@t8m
Copy link
Member

t8m commented Dec 17, 2020

Working on review but it will take me some time as the changes are quite non-trivial.

@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

Working on review but it will take me some time as the changes are quite non-trivial.

I'd recommend reading them one commit at a time, that will at least make each change more concise.

Oh, and thank you!

@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

I'd recommend reading them one commit at a time, that will at least make each change more concise.

With that in mind, should I perform a squash?

@t8m
Copy link
Member

t8m commented Dec 17, 2020

I'd recommend reading them one commit at a time, that will at least make each change more concise.

With that in mind, should I perform a squash?

Yes, please.

OSSL_DECODER_CTX_new_by_EVP_PKEY() would keep copies of all the
EVP_KEYMGMTs it finds.
This turns out to be fragile in certain circumstances, so we switch to
fetch the appropriate EVP_KEYMGMT when it's time to construct an
EVP_PKEY from the decoded data instead.  This has the added benefit
that we now actually use the property query string that was given by
the caller for these fetches.

Fixes openssl#13503
…nal key

So far, the MSBLOB and PVK writers could only handle EVP_PKEYs with
legacy internal keys.

Specially to be able to compile the loader_attic engine, we use the C
macro OPENSSL_NO_PROVIDER_CODE to avoid building the provider specific
things when we don't need them.  The alternative is to suck half of
crypto/evp/ into loader_attic, and that's just not feasible.

Fixes openssl#13503
Because decoders are coupled with keymgmts from the same provider,
ours need to produce provider side keys the same way.  Since our
keymgmts create key data with the provider library context, so must
our decoders.

We solve with functions to adjust the library context of decoded keys,
and use them.
This introduces a separate activation counter, and the function
ossl_provider_deactivate() for provider deactivation.

Something to be noted is that if the reference count goes down to
zero, we don't care if the activation count is non-zero (i.e. someone
forgot to call ossl_provider_deactivate()).  Since there are no more
references to the provider, it doesn't matter.
The important thing is that deactivation doesn't remove the provider
as long as there are references to it, for example because there are
live methods associated with that provider, but still makes the
provider unavailable to create new methods from.

Fixes openssl#13503
Fixes openssl#12157
In most error cases, EVP_PKEY_CTX_dup() would only free the EVP_PKEY_CTX
without freeing the duplicated contents.

Fixes openssl#13503
@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

With that in mind, should I perform a squash?

Yes, please.

Done

Copy link
Member

@t8m t8m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some doc nits, otherwise LGTM.

@t8m
Copy link
Member

t8m commented Dec 17, 2020

Approved. Setting ready-to-merge as this is urgent.

@t8m t8m added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: review pending This pull request needs review by a committer labels Dec 17, 2020
@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

Setting ready-to-merge as this is urgent.

Thanks. Will merge now

openssl-machine pushed a commit that referenced this pull request Dec 17, 2020
OSSL_DECODER_CTX_new_by_EVP_PKEY() would keep copies of all the
EVP_KEYMGMTs it finds.
This turns out to be fragile in certain circumstances, so we switch to
fetch the appropriate EVP_KEYMGMT when it's time to construct an
EVP_PKEY from the decoded data instead.  This has the added benefit
that we now actually use the property query string that was given by
the caller for these fetches.

Fixes #13503

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from #13661)
openssl-machine pushed a commit that referenced this pull request Dec 17, 2020
…nal key

So far, the MSBLOB and PVK writers could only handle EVP_PKEYs with
legacy internal keys.

Specially to be able to compile the loader_attic engine, we use the C
macro OPENSSL_NO_PROVIDER_CODE to avoid building the provider specific
things when we don't need them.  The alternative is to suck half of
crypto/evp/ into loader_attic, and that's just not feasible.

Fixes #13503

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from #13661)
openssl-machine pushed a commit that referenced this pull request Dec 17, 2020
Because decoders are coupled with keymgmts from the same provider,
ours need to produce provider side keys the same way.  Since our
keymgmts create key data with the provider library context, so must
our decoders.

We solve with functions to adjust the library context of decoded keys,
and use them.

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from #13661)
openssl-machine pushed a commit that referenced this pull request Dec 17, 2020
This introduces a separate activation counter, and the function
ossl_provider_deactivate() for provider deactivation.

Something to be noted is that if the reference count goes down to
zero, we don't care if the activation count is non-zero (i.e. someone
forgot to call ossl_provider_deactivate()).  Since there are no more
references to the provider, it doesn't matter.
The important thing is that deactivation doesn't remove the provider
as long as there are references to it, for example because there are
live methods associated with that provider, but still makes the
provider unavailable to create new methods from.

Fixes #13503
Fixes #12157

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from #13661)
openssl-machine pushed a commit that referenced this pull request Dec 17, 2020
In most error cases, EVP_PKEY_CTX_dup() would only free the EVP_PKEY_CTX
without freeing the duplicated contents.

Fixes #13503

Reviewed-by: Tomas Mraz <tmraz@fedoraproject.org>
(Merged from #13661)
@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

Merged

054cde1 DECODER EVP_PKEY: Don't store all the EVP_KEYMGMTs
e77c13f MSBLOB & PVK: Make it possible to write EVP_PKEYs with provided internal key
6963979 DECODER: Adjust the library context of keys in our decoders
390f9ba CORE: Separate OSSL_PROVIDER activation from OSSL_PROVIDER reference
74cd923 EVP: Fix memory leak in EVP_PKEY_CTX_dup()

@levitte levitte closed this Dec 17, 2020
@levitte levitte deleted the fix-13503 branch December 17, 2020 11:04
@levitte
Copy link
Member Author

levitte commented Dec 17, 2020

Thank you, @t8m!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Applies to master branch severity: urgent Fixes an urgent issue (exempt from 24h grace period)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

build fails in test_encoder_decoder with OPENSSL_TEST_RAND_ORDER=0 It's not possible to unload a provider once its been used

4 participants