OCSP stapling support for Netty using netty-tcnative.#6158
OCSP stapling support for Netty using netty-tcnative.#6158
Conversation
85dca5a to
40724c8
Compare
|
Build fails because this change depends on a version of netty-tcnative that isn't released yet: netty/netty-tcnative#215 I was planning to include also an example for how to pull out the OCSP responder URL from the X509 Certificate and retrieve the OCSP staple from the CA but it'll require Bouncycastle. |
There was a problem hiding this comment.
Maybe we should add this to the style guide but can you keep the statics at the bottom, and preserve the previous import order?
There was a problem hiding this comment.
should 0 be TLSEXT_STATUSTYPE_host_name?
There was a problem hiding this comment.
SSL.setStatusType protects against null SSL* too ... can we just rely on the null there instead of doing it in both places?
There was a problem hiding this comment.
should we instead unwrap and verify the NPE?
There was a problem hiding this comment.
can we run this test with OPENSSL_REFCNT too? We need to be careful about releasing in this case of course.
There was a problem hiding this comment.
consider specifying the timeout values here so the unit tests don't have to wait for the quite period etc...
There was a problem hiding this comment.
this is expected to be called in the junit thread right, if so can we just do assertTrue?
40724c8 to
f398f75
Compare
There was a problem hiding this comment.
This test and testServerException produce the NPE described in #6336.
OpenSSL calls the verify() method (via the OCSP callback - see PR in netty-tcnative) and this test throws purposely an IllegalStateException to see what happens.
I'm expecting a DecoderException to bubble up (with the IllegalStateException as cause) but instead I'm getting DecoderException that was caused by a NPE.
5bc87a1 to
f51de8e
Compare
bec5a7f to
8dde6fb
Compare
|
@Scottmitch I've added some example code to the PR. The client side is pretty simple but likely not very useful. The server side is more interesting as it involves pulling out the CA's OCSP responder URL from the X509 Certificate and then constructing a special POST request to obtain the OCSP staple from the said URL. It's a somewhat involved process and there isn't much information about it on the Intertubes. I tried to keep the examples as simple as possible and they're not meant to be an example how do run this code in production. I used Wikipedia's Webserver for the client example because the CDN that is fronting netty.io doesn't support OCSP. And I used netty.io's X509 Certificate (Comodo) for the server example but it's incomplete because I don't have the PrivateKey for it. |
Scottmitch
left a comment
There was a problem hiding this comment.
few comments but looking good!
I didn't look too closely at OcspServerExample. It would be nice if we could make this work with the client example ... not sure how much additional effort this would require though.
There was a problem hiding this comment.
see netty/netty-tcnative#215 ... does this account for checking of the runtime version which may be dynamically linked?
There was a problem hiding this comment.
is it possible to add the commands required to generate this to https://github.com/netty/netty/blob/4.1/handler/src/test/resources/io/netty/handler/ssl/generate-certs.sh ?
There was a problem hiding this comment.
That cert chain is actually not generated. It's the real cert chain for https://netty.io and it was simply pulled with Chrome's export feature.
There was a problem hiding this comment.
add a comment to explain null is allowed and that it will free the existing callback if one exists.
There was a problem hiding this comment.
nit: you mix 2 and 4 space indentation here ... can you make this consistent?
There was a problem hiding this comment.
consider adding a descriptive string: e.g socket was closed before response was received.
There was a problem hiding this comment.
consider adding a descriptive error string for the exception
There was a problem hiding this comment.
also aren't you randomly generating the nonce on each build?
There was a problem hiding this comment.
Yes, a new nonce is created for each build(). The comment is referring to the built object itself because it's tempting to cache that OCSPReq object.
There was a problem hiding this comment.
make a define for this value?
There was a problem hiding this comment.
😮
Consider adding a comment // TODO: use Netty 😉
2fc4adf to
9405451
Compare
There was a problem hiding this comment.
can you link to the RFC or area where this ID is define? these magic IDs have a tendency to spread around without any reference to where they came from.
There was a problem hiding this comment.
Can you clarify why having a self signed certificate is prohibitive and why it wouldn't be possible for the server to simulate querying the CA for the staple ... and just generate it on demand (via bc, openssl, etc...)?
There was a problem hiding this comment.
there is a race condition here where callback may be used after freeOcspCallback is called on the callback. Consider the following:
- adding a constraint (via javadoc APIs) on
OcspCallbackExtthat this use case must be supported (no negative behavior will result from callingcallback.context()or using the resulting object afterfreeOcspCallbackis called). - adding a warning on this method that unexpected results may occur if the underlying object is freed.
- remove this method (doesn't look like its used currently).
There was a problem hiding this comment.
IMO it would be more clear if the 0 value came from tcnative instead of implicitly defined at the Netty level.
Thinking about it more TlsExtStatusType is not very descriptive and I'm not sure if it would require state in the future to use if different options are supported through this interface. I get why BoringSSL decided to go with SSL_enable_ocsp_stapling ... can we just use this API naming convention in tcnative instead of OpenSSL's? @normanmaurer - WDYT?
There was a problem hiding this comment.
Ack. Happy to change it into enableOcsp() as there is no use-case for the toggle functionality.
There was a problem hiding this comment.
consider saving a reference to this exception ... just to ensure the same value is used in the comparison and string printout.
(general comment for this test)
ea41995 to
0f41465
Compare
There was a problem hiding this comment.
I think we should revert this back to a plain JDK implementation (i.e. HttpURLConnection) and provide it just a short and simple example how to interact with the CA's responder as there's not much information about it on the Internet (or rather it's difficult to formulate the query as OCSP is foreign to many ppl.)
Once you know that it's a HTTP POST, with special content-type and accept headers and a special payload it's "trivial" to deduct a Netty client from it.
In general, in order to operate OCSP you need to build some infrastructure that goes beyond the Netty project. You need caches, an OCSP responder client (i.e. something like this class) and a strategy to manage and refresh all these staples that are only valid for < 10 days.
There was a problem hiding this comment.
I am fine with that for the sake of an example, and assuming we can't or its too much hassle to implement a more self contained example #6158 (comment). My previous comment on this was meant to literally add something like // TODO: we use HttpURLConnection for the sake of demonstration, but you should consider using Netty's HTTP client instead.
There was a problem hiding this comment.
Hum this relationship is unnatural. Typically the SSLContext provides the default configuration for creating an SSLEngine and the same relationship olds for SSL_CTX and SSL, and the SSL/SSLEngine never impacts the state of the SSL_CTX/SSLContext. Could we just instead have tcnative throw if the SSL_CTX has not been setup correctly? For example we could add a bool oscp_enabled to tcn_ssl_ctxt_t (in ssl_private.h). SSLContext#enableOcsp would set this state (even for BoringSSL) and then SSL#enableOcsp would check this state has been set. You can get the tcn_ssl_ctxt_t from the SSL* via SSL_get_app_data2.
There was a problem hiding this comment.
Agree. Next PR fixes it. I've reduced it to:
SSLContext#enableOcsp
SSL#setOcspResponse
SSL#getOcspResponse
The SSL#enableOcsp call doesn't need to be exposed to the user.
There was a problem hiding this comment.
nit: make a tertiary statement.
There was a problem hiding this comment.
can this just be removed before the verification?
|
Sorry @Scottmitch @rkapsi but I had no time yet to look at this... Will try to do over the next days |
There was a problem hiding this comment.
@rkapsi is this good enough and also hold true for Libressl and Boringssl ?
There was a problem hiding this comment.
TBH I don't know. I'm operating under the assumption they're forks of OpenSSL and OCSP should be present as of 1.0.0 and we have only tested 1.0.2+.
There was a problem hiding this comment.
maybe try to run the test while using "netty-tcnative-boringssl-static" and see what happens :)
There was a problem hiding this comment.
OK, LibreSSL works.
BoringSSL doesn't work (fails gracefully with my OCSP is not supported exception) but that is because... It's either getting built with OPENSSL_NO_OCSP or OPENSSL_VERSION_NUMBER is something very different than expected. Checking... 🤔
There was a problem hiding this comment.
Confirm, BoringSSL gets built with OPENSSL_NO_OCSP.
There was a problem hiding this comment.
Thanks for confirming @davidben.
@nmittler - We are working with @rkapsi to support both BoringSSL and OpenSSL (and LibreSSL should come for "free"). We are currently using both APIs to make this happen and it is just a matter of working out the kinks to make sure we can make the two different APIs (BoringSSL and OpenSSL) work consistently with one API (tcnative/netty) on top. I think we are very close to pulling this off (thanks to @rkapsi !)
There was a problem hiding this comment.
In regards to the machinery: That is totally cool that BoringSSL doesn't support the heavy machinery because we don't use any of OpenSSL's either. All we want to do is blindly pass along byte strings. It's literally just that and BoringSSL's API does just that. Again, totally cool. Unfortunately it's defaulting to OPENSSL_NO_OCSP and I guess we're confused how to proceed. 🤔
I'm personally cool not supporting BoringSSL but that means I'll never be able to use it either because our lovely non-profit CA will not appreciate it being DDoS'd if our servers do handshakes without staples.
There was a problem hiding this comment.
OPENSSL_NO_OCSP means we do not support OpenSSL's OCSP API. If we did not define it, loads of projects would fail to build because they, naturally, do not use BoringSSL's API. Like with any other project, your options are not to support OCSP stapling with BoringSSL (in which case OPENSSL_NO_OCSP, what you would have been checking anyway, Just Works and you need no additional work) or to add BoringSSL-specific code that does not look at OPENSSL_NO_OCSP. BoringSSL-specific code is obviously needed anyway for BoringSSL-specific API, so I don't see why having to ignore a preprocessor symbol which happens to have a similar name as the feature matters.
There was a problem hiding this comment.
In case this is where the confusion comes from, BoringSSL's APIs has no relation to OPENSSL_NO_OCSP. They are available regardless.
We do not, in general, have individual toggles for all our features like OpenSSL does. Instead, we define a set of OPENSSL_NO_* values based on what makes portability simplest. These symbols do not affect what features BoringSSL exposes.
There was a problem hiding this comment.
Agreed I think we have the tools we need and everything is ok :)
@rkapsi we just need to adjust the macros see netty/netty-tcnative#215 (comment) ... I will add more details here.
There was a problem hiding this comment.
protected as the class is abstract
There was a problem hiding this comment.
Making it protected would prevent ad-hoc instantiation and force the user to declare a class which can be a nuisance for prototyping/quick hackery.
// This won't be possible with a protected ctor
pipeline.addLast(new OcspClientHandler(engine) {
@Override
public boolean verify(...) {
return true;
}
});There was a problem hiding this comment.
@rkapsi - are you sure? I don't see why this would cause an issue, can you explain? OcspClientHandler is an anonymous type that should have access to protected things in the super class.
There was a problem hiding this comment.
Operator error, disregard me.
There was a problem hiding this comment.
consider using a do {} while(...) loop as it should not be called with a null Throwable
pom.xml
Outdated
There was a problem hiding this comment.
2.0.1.Final-SNAPSHOT now :)
cfaf2e2 to
d80bf08
Compare
Scottmitch
left a comment
There was a problem hiding this comment.
I like the new approach :) few more comments too
There was a problem hiding this comment.
should we treat this as an error if called while in the wrong mode?
There was a problem hiding this comment.
@rkapsi - are you sure? I don't see why this would cause an issue, can you explain? OcspClientHandler is an anonymous type that should have access to protected things in the super class.
There was a problem hiding this comment.
consider testing with both providers (OPENSSL_REFCNT) like other tests.
There was a problem hiding this comment.
consider testing with both providers (OPENSSL_REFCNT) like other tests.
There was a problem hiding this comment.
consider blocking on the channel close futures. If the event loop is closed before the channels close we could leak buffers (close may execute a task on the event loop to cleanup internal queues).
Scottmitch
left a comment
There was a problem hiding this comment.
few remaining comments the lgtm ... great job @rkapsi !
pom.xml
Outdated
There was a problem hiding this comment.
revert this after you are done testing
119dda6 to
9d624e7
Compare
Scottmitch
left a comment
There was a problem hiding this comment.
few more comments/questions
There was a problem hiding this comment.
can we create this request when it is needed? Otherwise we risk leaking this reference counted object.
There was a problem hiding this comment.
1 * 1024 * 1024 -> 1024 * 1024?
There was a problem hiding this comment.
nit: consider simplifying these conditional statements
return status == CertificateStatus.GOOD && certSerial.equals(ocspSerial);There was a problem hiding this comment.
nit: make this a javadoc comment so the link shows up in IDEs
There was a problem hiding this comment.
is the intention to dynamically toggle this value? For example is it sufficient to just add a method to the SslContextBuilder and configure it in the constructor of this method.
For example isn't there some assumptions from the SSL's perspective that the SSL_CTX has some callback set? We may be OK if we only ever allow a transition from disabled -> enabled, but it may introduce undefined behavior if we forget this relationship and add a enabled -> disabled transition.
There was a problem hiding this comment.
We can move it into SslContextBuilder and it's only allowed to transition from disabled to enabled.
There was a problem hiding this comment.
consider creating this exception outside the scope and check for assertSame below:
final OcspTestException clientException = new OcspTestException("testClientException");
...
@Override
public boolean verify(byte[] response) throws Exception {
throw clientException;
}
...
assertSame(clientException, causeRef.get());
pom.xml
Outdated
There was a problem hiding this comment.
reminder: we need to update this at some point. (/cc @normanmaurer )
b85844d to
765ad9c
Compare
|
@Scottmitch moved the |
Scottmitch
left a comment
There was a problem hiding this comment.
just a few more comments ... then rebase/squash and I will merge.
There was a problem hiding this comment.
nit: consider simplifying to the following:
return status == CertificateStatus.GOOD && certSerial.equals(ocspSerial);There was a problem hiding this comment.
the engine must be released at some point. SslHandler typically takes ownership, but it must be interred into a pipeline ignorer to do so.
There was a problem hiding this comment.
the engine must be released at some point. SslHandler typically takes ownership, but it must be interred into a pipeline ignorer to do so.
There was a problem hiding this comment.
As @Scottmitch said... add engine.release() to the finally block
There was a problem hiding this comment.
ClientOcspContext cannot be found. Update this comment plz.
normanmaurer
left a comment
There was a problem hiding this comment.
Please address my comments. Also please note we should not merge this until a new netty-tcnative version is released that supports the needed API
There was a problem hiding this comment.
As @Scottmitch said... add engine.release() to the finally block
There was a problem hiding this comment.
As @Scottmitch said... add engine.release() to the finally block
There was a problem hiding this comment.
getBytes(CharsetUtil.US_ASCII)
There was a problem hiding this comment.
You can also just remove the message... I think its clear that something went wrong here ;)
There was a problem hiding this comment.
merge these two branches:
if (((SslHandshakeCompletionEvent) evt).isSuccess() && !verify(ctx, engine)) {
throw OCSP_VERIFICATION_EXCEPTION;
}There was a problem hiding this comment.
add the provider to the message
There was a problem hiding this comment.
Please reduce the scope of the synchronized block:
public void setOcspResponse(byte[] response) {
if (!enableOcsp) {
throw new IllegalStateException("OCSP stapling is not enabled");
}
if (clientMode) {
throw new IllegalStateException("Not a server SSLEngine");
}
synchronized(this) {
SSL.setOcspResponse(ssl, response);
}
}There was a problem hiding this comment.
765ad9c to
432e439
Compare
|
@Scottmitch @normanmaurer - done! |
Scottmitch
left a comment
There was a problem hiding this comment.
great work @rkapsi ! We will merge when tcnative is released.
|
@normanmaurer - Note the necessary changes for tcnative have already been merged. So if we plan to update the version of tcnative for the next version of Netty we could merge this now. @rkapsi - Please rebase to resolve recently introduced conflicts. |
432e439 to
a93e9e4
Compare
|
@Scottmitch - done. |
|
the build failed due to a recent API change in netty-tcnative. I merged the corresponding Netty PR #6581 which should resolve the issue. Can you rebase again? |
netty/netty-tcnative#215 Motivation OCSP stapling (formally known as TLS Certificate Status Request extension) is alternative approach for checking the revocation status of X.509 Certificates. Servers can preemptively fetch the OCSP response from the CA's responder, cache it for some period of time, and pass it along during (a.k.a. staple) the TLS handshake. The client no longer has to reach out on its own to the CA to check the validity of a cetitficate. Some of the key benefits are: 1) Speed. The client doesn't have to crosscheck the certificate. 2) Efficiency. The Internet is no longer DDoS'ing the CA's OCSP responder servers. 3) Safety. Less operational dependence on the CA. Certificate owners can sustain short CA outages. 4) Privacy. The CA can lo longer track the users of a certificate. https://en.wikipedia.org/wiki/OCSP_stapling https://letsencrypt.org/2016/10/24/squarespace-ocsp-impl.html Modifications https://www.openssl.org/docs/man1.0.2/ssl/SSL_set_tlsext_status_type.html Result High-level API to enable OCSP stapling
a93e9e4 to
27450c9
Compare
|
@Scottmitch - done. |
|
Great work! Thanks for your patience @rkapsi! |
OCSP stapling support for Netty using netty-tcnative.
netty/netty-tcnative#215
Motivation
OCSP stapling (formally known as TLS Certificate Status Request extension) is alternative approach for checking the revocation status of X.509 Certificates. Servers can preemptively fetch the OCSP response from the CA's responder, cache it for some period of time, and pass it along during (a.k.a. staple) the TLS handshake. The client no longer has to reach out on its own to the CA to check the validity of a cetitficate. Some of the key benefits are:
https://en.wikipedia.org/wiki/OCSP_stapling
https://letsencrypt.org/2016/10/24/squarespace-ocsp-impl.html
Modifications
https://www.openssl.org/docs/man1.0.2/ssl/SSL_set_tlsext_status_type.html
Result
High-level API to enable OCSP stapling