-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
Description
Ideally, this sort of thing should be done in two stages: first add APIs in 1.1.x for the world to switch to them, then remove the existing APIs in 1.2.0.
OpenSSL 1.1.0 opaquified structs, but did so in a way that does not help much in reducing complexity for future changes. Every object has an unnecessary empty state. For example, to construct an RSA key, one calls RSA_new and then fills in the components piecemeal. To hash something, one calls EVP_MD_CTX_new and then separately calls EVP_DigestInit_ex to initialize it.
This is bad for both users and OpenSSL. Users must initialize every object in two stages, both of which must be checked for failures. The library is more complex and thus more error-prone to use.
For OpenSSL, this adds a lot of complexity. Every object has a public "empty" state, which every function must now check for. Additionally, every property, even if it really only needs to be touched on construction, is mutable. This adds a lot of extra complexity.
For instance, EVP_MD_CTX_new and EVP_DigestInit_ex should have been combined into:
EVP_MD_CTX *EVP_MD_CTX_new_digest(const EVP_MD *type, ENGINE *engine);
This is how most libraries with heap-allocated hash contexts work. The split is a remnant of stack-allocated EVP_MD_CTXs, where there are separate allocation and construction stages. There is no reason to split them when heap-allocating because NULL is already a zero state.
In this API, EVP_DigestUpdate does not need to worry about the callers passing in an empty EVP_MD_CTX because it is impossible to vend such a thing. Additionally, EVP_DigestInit_ex does not need to worry about resizing the hash context when the hash changes, because that's impossible. The caller would just make a new EVP_MD_CTX.
(It even allows a small optimization. One could allocate the EVP_MD_CTX and hash context in a single allocation.)
On the RSA side, this is especially pronounced. Beyond the problems above of (multiple!) partially-constructed states, there is no easy point for an RSA object to "finish" initializing. RSA objects have a pre-computed BN_MONT_CTX. Right now, these are filled in lazily via BN_MONT_CTX_set_locked. That makes the implementation more complex and carries a confusing API contract for the user: RSA objects are mutable, but you should not call those functions after you've used it, because we'll set up caches and get the answer wrong. (One could imagine making the setters drop those caches, but that's even more corner cases for a use case that does not actually exist; it is simpler to just make a new RSA object.)
What OpenSSL 1.1.0 should have done is not added any setters at all. Instead, add functions like RSA_new_public and RSA_new_private that takes all the relevant parameters. After construction, make RSA objects immutable.
This sort of pattern repeats throughout the library. EVP_PKEY functions must account for empty EVP_PKEYs because of the empty EVP_PKEY_new function. EC_KEY must account for the private key being set before the group, which complicates bounds-checking. BN_MONT_CTX has an unnecessary empty state. Custom EC_GROUPs (though those should be dropped in favor of named-only) have an empty state, a curve-without-generator state, and a generator-without-curve-state.
Aside: To fully explore the tradeoff, the downside of bundling things up in the constructor is that it's less easily extensible to more parameters. Adding, say, multi-prime support to RSA_new_private would have required a separate RSA_new_private_multiprime function. For crypto primitives, this is the right tradeoff. They tend to be logically immutable on construction, with rarely, if ever, changing specifications. (RSA is still RSA.)
In contrast, the tower of setters makes more sense SSL_CTX, though I will note it has consequences like #5128 being unsuitable for merging. One could imagine using a builder pattern here, but it's perhaps not worth the trouble.