Auto retry in shutdown #6340

mattcaswell · 2018-05-23T11:28:20Z

If we've sent a close_notify and we're waiting for one back we drop
incoming records until we see the close_notify we're looking for. If
SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the
next record.

Fixes #6262

tiran · 2018-05-23T13:23:39Z

This PR fixes CPython's SSLSocket.unwrap() for TLS 1.3 connection. Before the fix, unwrap() failed, when a client only performed SSL_write() and a TLS session ticket was still on the wire

Thanks @mattcaswell

richsalz · 2018-05-23T15:08:23Z

I wonder if the "we don't support that" comment should be in CHANGES or manpage?

kroeckx · 2018-05-23T16:03:31Z

test/sslapitest.c

When reading the manual, it's wasn't really clear to me that the first SSL_shutdown(serverssl) would return 0. But I guess this is also not normal way to actually do a shutdown, both sides deciding at the same time to do a shutdown. I guess at least the manual can be clarified.

Yes - normally you wouldn't both simultaneously decide to shutdown at the same time. This is just for a test. Clarifications on the SSL_shutdown() docs are probably outside of the scope of this PR.

I guess not having seen it, it just falls under the "first to close the connection"

kroeckx · 2018-05-23T16:23:08Z

Can you also add a test where after the first SSL_shutdown(), the other side does an SSL_read() + SSL_write() before also calling SSL_shutdown()? The first side should then to an SSL_read() to get what was written with that SSL_write(). It would also be useful to check that if SSL_shutdown() was called again before that SSL_read() that it returns <= 0, I want to make sure that no data is lost if the shutdown says it's closed successfully.

TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org>

TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org> (cherry picked from commit 529525f) Co-authored-by: Christian Heimes <christian@python.org>

TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org>

TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org> (cherry picked from commit 529525f)

mattcaswell · 2018-05-25T16:22:52Z

Can you also add a test where after the first SSL_shutdown(), the other side does an SSL_read() + SSL_write() before also calling SSL_shutdown()? The first side should then to an SSL_read() to get what was written with that SSL_write(). It would also be useful to check that if SSL_shutdown() was called again before that SSL_read() that it returns <= 0, I want to make sure that no data is lost if the shutdown says it's closed successfully.

Ok - good idea, I added tests for those scenarios. The only thing I'm not sure about is your last statement. If you call SSL_shutdown() again it will skip over any pending data until it gets the close_notify. I think this is consistent with historic behaviour.

kroeckx · 2018-05-25T16:45:36Z

I think that historic behaviour is just wrong. What is the point of doing this bidirectional shutdown if not to make sure that both sides agree on what data has been sent?

kroeckx · 2018-05-25T17:09:46Z

Hmm, SSL_shutdown() really doesn't have a way to indicate that there is still unprocessed data that should be read first.

kroeckx · 2018-05-25T17:12:57Z

We actually recently changed the manpage to indicate that it can return <0 in that case, so I'm not sure why it's not doing it now, or what error it set.

mattcaswell · 2018-05-25T17:15:04Z

We actually recently changed the manpage to indicate that it can return <0 in that case, so I'm not sure why it's not doing it now, or what error it set.

This PR changes the behaviour so that it honours SSL_MODE_AUTO_RETRY. Without that mode it ditches the data and returns -1. With that mode it ditches the data and tries again.

mattcaswell · 2018-05-25T17:16:39Z

Perhaps we should change the advice so that, when you call SSL_shutdown(), if you get 0 back then you should call SSL_read() until you get SSL_ERROR_ZERO_RETURN, and SSL_get_shutdown() indicates the connection is fully closed.

kroeckx · 2018-05-25T17:23:40Z

I think that's what the manpage says now (in master)

kroeckx · 2018-05-26T08:42:55Z

So I think the behaviour we want is that if there are non-application data records, just process them. If there is still application data we should return -1 and set SSL_ERROR_SYSCALL.

mattcaswell · 2018-05-29T09:08:49Z

So I think the behaviour we want is that if there are non-application data records, just process them. If there is still application data we should return -1 and set SSL_ERROR_SYSCALL.

If we return -1 when there is still application data to read I think that will break a lot of stuff. I imagine its fairly common to just call SSL_shutdown() in a loop until it returns 1. I agree that it probably should have been designed that way in the first place, but I'm not sure we can change it now.

kroeckx · 2018-05-29T15:55:59Z

As far as I know, the history and current behaviour has been to return -1 when data is still available. This PR changes it to silently throw it away instead.

mattcaswell · 2018-05-29T15:58:09Z

As far as I know, the history and current behaviour has been to return -1 when data is still available. This PR changes it to silently throw it away instead.

No, we always threw it away. This PR changes it to auto-retry afterwards.

davidben · 2018-06-15T16:30:49Z

Just to follow-up on my earlier comments: we've since moved BoringSSL to treating unexpected application data as a proper fatal error (no skipping over logic). We continue to skip over NewSessionTicket. It seems to be working out fine.

kroeckx · 2018-06-15T16:35:05Z

I think it's fine, but I did not have time to actually review it.

kroeckx · 2018-06-16T12:47:18Z

ssl/record/rec_layer_s3.c

There is an extra space there

kroeckx · 2018-06-16T13:08:21Z

I really don't understand this code good enough to review it.

kroeckx · 2018-06-16T18:14:20Z

ssl/record/rec_layer_s3.c

If I understand correctly, if auto retry is off you get a fatal error.

I think you want to make sure to return SSL_ERROR_WANT_READ in case auto retry is off and the type was handshake data, like it's done below. That probably also means returning -1 for compatibility reasons.

It also seems trivial to add a call to SSLfatal() in case the type was application data.

If this is a handshake message then it should probably be processed, instead of discarded?
I mean something like that (or maybe more specific which message exactly are expected):

diff --git a/ssl/record/rec_layer_s3.c b/ssl/record/rec_layer_s3.c index 61010f4..0b983df 100644 --- a/ssl/record/rec_layer_s3.c +++ b/ssl/record/rec_layer_s3.c @@ -1583,14 +1583,6 @@ int ssl3_read_bytes(SSL *s, int type, int *recvd_type, unsigned char *buf, goto start; } - if (s->shutdown & SSL_SENT_SHUTDOWN) { /* but we have not received a - * shutdown */ - s->rwstate = SSL_NOTHING; - SSL3_RECORD_set_length(rr, 0); - SSL3_RECORD_set_read(rr); - return 0; - } - if (SSL3_RECORD_get_type(rr) == SSL3_RT_CHANGE_CIPHER_SPEC) { SSLfatal(s, SSL_AD_UNEXPECTED_MESSAGE, SSL_F_SSL3_READ_BYTES, SSL_R_CCS_RECEIVED_EARLY); @@ -1644,6 +1636,14 @@ int ssl3_read_bytes(SSL *s, int type, int *recvd_type, unsigned char *buf, goto start; } + if (s->shutdown & SSL_SENT_SHUTDOWN) { /* but we have not received a + * shutdown */ + s->rwstate = SSL_NOTHING; + SSL3_RECORD_set_length(rr, 0); + SSL3_RECORD_set_read(rr); + return 0; + } + switch (SSL3_RECORD_get_type(rr)) { default: /*

I think you want to make sure to return SSL_ERROR_WANT_READ in case auto retry is off and the type was handshake data, like it's done below. That probably also means returning -1 for compatibility reasons.

Yes, this makes sense. I made that change.

It also seems trivial to add a call to SSLfatal() in case the type was application data.

I think this could be a breaking change if we did that. At the moment you can just keep calling SSL_shutdown(), ignoring any errors, until you eventually get a 1 return. If you call SSLfatal() it marks the SSL object as unusable, and any subsequent calls to SSL_shutdown() will immediately fail, i.e. you will never get a 1 return. We could just call SSLerr - but that might risk lots of stale errors on the queue.

Because we previously we considered that receiving application data during shutdown was a non fatal, recoverable, error. You want to change it into a fatal error. That's a behaviour change not a bug fix.

My recollection is that SSL_shutdown returns a SSL_ERROR_SSL.
And if an application is retrying SSL_shutdown it is unpredictable if
the SSL will recover or crash.

It actually returns SSL_ERROR_SYSCALL, because nobody bothered to actually say that there was an error, so we just return SSL_ERROR_SYSCALL. But both are fatal errors. First returning a fatal error and then returning it was successful after all doesn't make sense. Calling SSLfatal() should turn it into a SSL_ERROR_SSL.

Hmmm....I suppose existing application code has to be able to handle fatal errors to deal with stuff like the underlying TCP connection closing without a close_notify...so this would appear the same as if that occurred.

I made this case a fatal error as requested.

kroeckx · 2018-06-16T18:35:50Z

ssl/record/rec_layer_s3.c

I don't understand why this code is moved.

me neither.

If we are in the half-shutdown state we are discarding handshake messages. Therefore we don't want to fill up the handshake fragment storage with data that we plan to discard. Therefore we move the fragment storage code until after decisions have been made about whether to discard the handshake data.

If 2 people don't understand this, it at least seems to lack a comment explaining this.

I added a comment.

mattcaswell · 2018-06-21T13:10:43Z

New commit pushed addressing comments (where appropriate).

If we've sent a close_notify and we're waiting for one back we drop incoming records until we see the close_notify we're looking for. If SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the next record. Fixes openssl#6262

In the case where we are shutdown for writing and awaiting a close_notify back from a subsequent SSL_shutdown() call we skip over handshake data that is received. This should not be treated as an error - instead it should be signalled with SSL_ERROR_WANT_READ.

Currently if you encounter application data while waiting for a close_notify from the peer, and you have called SSL_shutdown() then you will get a -1 return (fatal error) and SSL_ERROR_SYSCALL from SSL_get_error(). This isn't accurate (it should be SSL_ERROR_SSL) and isn't persistent (you can call SSL_shutdown() again and it might then work). We change this into a proper fatal error that is persistent.

mattcaswell · 2018-06-25T14:08:47Z

Updated pushed to address the outstanding comments. I also rebased to resolve a conflict with master.

bernd-edlinger

LGTM

kroeckx · 2018-06-25T19:01:51Z

test/sslapitest.c

+         * client needs to read the close_notify sent by the server.
+         */
+        if (tst == 5 && !TEST_int_eq(SSL_shutdown(clientssl), -1))
+            goto end;


I think the above 2 lines should get removed, tst == 5 is always false and is covered in the else below.

kroeckx · 2018-06-25T19:07:27Z

ssl/record/rec_layer_s3.c

+            SSLfatal(s, SSL_AD_NO_ALERT, SSL_F_SSL3_READ_BYTES,
+                     SSL_R_APPLICATION_DATA_AFTER_CLOSE_NOTIFY);
+        }
+        return -1;


I know -1 and 0 are really documented as the same, but old documentation was confusing. I suggest that we return -1 in the if and 0 on the else.

The return value from this function does not necessarily translate to the return value from the top level function. -1 and 0 are documented as the same for SSL_read(), but if you've called SSL_read() then you won't get here anyway. (Almost) everywhere else in this function we return -1 after SSLfatal(), so it seems to make more sense to me to follow that. It makes little difference in practice since ssl3_shutdown() ignores the return value anyway.

kroeckx · 2018-06-25T19:10:57Z

ssl/record/rec_layer_s3.c

+            rbio = SSL_get_rbio(s);
+            BIO_clear_retry_flags(rbio);
+            BIO_set_retry_read(rbio);
+        } else {


Is this always SSL3_RT_APPLICATION_DATA in the else?

Well, it isn't SSL3_RT_HANDSHAKE (because we handled that in the main branch of the if), and it isn't SSL3_RT_ALERT (because that is handled in the if immediately preceding this one). If it's anything other than SSL3_RT_APPLICATION_DATA then it's illegal in this context.

… shutdown

mattcaswell · 2018-06-26T13:15:18Z

Fix up commit pushed.

mattcaswell · 2018-06-27T09:08:11Z

Pushed! Thanks.

If we've sent a close_notify and we're waiting for one back we drop incoming records until we see the close_notify we're looking for. If SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the next record. Fixes #6262 Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)

In the case where we are shutdown for writing and awaiting a close_notify back from a subsequent SSL_shutdown() call we skip over handshake data that is received. This should not be treated as an error - instead it should be signalled with SSL_ERROR_WANT_READ. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)

Currently if you encounter application data while waiting for a close_notify from the peer, and you have called SSL_shutdown() then you will get a -1 return (fatal error) and SSL_ERROR_SYSCALL from SSL_get_error(). This isn't accurate (it should be SSL_ERROR_SSL) and isn't persistent (you can call SSL_shutdown() again and it might then work). We change this into a proper fatal error that is persistent. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)

mattcaswell mentioned this pull request May 23, 2018

New session ticket breaks bidirectional shutdown of TLS 1.3 connection #6262

Closed

tiran mentioned this pull request May 23, 2018

bpo-30437: Fix SSL_shutdown for TLS 1.3 python/cpython#6977

Closed

njsmith mentioned this pull request May 23, 2018

Work around openssl's weird SSL_shutdown semantics by calling SSL_read during shutdown python-trio/trio#536

Open

richsalz approved these changes May 23, 2018

View reviewed changes

kroeckx reviewed May 23, 2018

View reviewed changes

tiran mentioned this pull request May 23, 2018

bpo-33618: Enable TLS 1.3 in tests python/cpython#7079

Merged

miss-islington mentioned this pull request May 23, 2018

[3.7] bpo-33618: Enable TLS 1.3 in tests (GH-7079) python/cpython#7082

Merged

mattcaswell force-pushed the auto-retry-in-shutdown branch 2 times, most recently from 7bd7619 to fbc16d1 Compare May 25, 2018 16:13

kroeckx reviewed Jun 16, 2018

View reviewed changes

mattcaswell added 4 commits June 25, 2018 15:03

Auto retry if we ditch records during shutdown

115bffa

If we've sent a close_notify and we're waiting for one back we drop incoming records until we see the close_notify we're looking for. If SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the next record. Fixes openssl#6262

Add a bi-directional shutdown test

9f7c119

mattcaswell force-pushed the auto-retry-in-shutdown branch from fb0ee9c to ab82e29 Compare June 25, 2018 14:08

bernd-edlinger approved these changes Jun 25, 2018

View reviewed changes

kroeckx reviewed Jun 25, 2018

View reviewed changes

fixup! Return a fatal error if application data is encountered during…

51b8f07

… shutdown

kroeckx approved these changes Jun 27, 2018

View reviewed changes

kroeckx added the approval: done This pull request has the required number of approvals label Jun 27, 2018

mattcaswell closed this Jun 27, 2018

levitte pushed a commit that referenced this pull request Jun 27, 2018

Add a bi-directional shutdown test

c748834

Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)

kroeckx mentioned this pull request Jul 7, 2018

foolscap problem with openssl 1.1.1-dev #6228

Closed

sebastianas mentioned this pull request Aug 6, 2018

test suite hangs against openssl 1.1.1-pre8 puma/puma#1609

Closed

Uh oh!

Auto retry in shutdown #6340

Auto retry in shutdown #6340

Uh oh!

Conversation

mattcaswell commented May 23, 2018

Uh oh!

tiran commented May 23, 2018

Uh oh!

richsalz commented May 23, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kroeckx commented May 23, 2018

Uh oh!

mattcaswell commented May 25, 2018

Uh oh!

kroeckx commented May 25, 2018

Uh oh!

kroeckx commented May 25, 2018

Uh oh!

kroeckx commented May 25, 2018

Uh oh!

mattcaswell commented May 25, 2018

Uh oh!

mattcaswell commented May 25, 2018

Uh oh!

kroeckx commented May 25, 2018

Uh oh!

kroeckx commented May 26, 2018

Uh oh!

mattcaswell commented May 29, 2018

Uh oh!

kroeckx commented May 29, 2018

Uh oh!

mattcaswell commented May 29, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

davidben commented Jun 15, 2018

Uh oh!

kroeckx commented Jun 15, 2018 via email

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kroeckx commented Jun 16, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattcaswell commented May 29, 2018 •

edited

Loading