-
-
Notifications
You must be signed in to change notification settings - Fork 11k
Auto retry in shutdown #6340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Auto retry in shutdown #6340
Conversation
|
This PR fixes CPython's Thanks @mattcaswell |
|
I wonder if the "we don't support that" comment should be in CHANGES or manpage? |
test/sslapitest.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When reading the manual, it's wasn't really clear to me that the first SSL_shutdown(serverssl) would return 0. But I guess this is also not normal way to actually do a shutdown, both sides deciding at the same time to do a shutdown. I guess at least the manual can be clarified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - normally you wouldn't both simultaneously decide to shutdown at the same time. This is just for a test. Clarifications on the SSL_shutdown() docs are probably outside of the scope of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess not having seen it, it just falls under the "first to close the connection"
|
Can you also add a test where after the first SSL_shutdown(), the other side does an SSL_read() + SSL_write() before also calling SSL_shutdown()? The first side should then to an SSL_read() to get what was written with that SSL_write(). It would also be useful to check that if SSL_shutdown() was called again before that SSL_read() that it returns <= 0, I want to make sure that no data is lost if the shutdown says it's closed successfully. |
TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org>
TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org> (cherry picked from commit 529525f) Co-authored-by: Christian Heimes <christian@python.org>
TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org>
TLS 1.3 behaves slightly different than TLS 1.2. Session tickets and TLS client cert auth are now handled after the initialy handshake. Tests now either send/recv data to trigger session and client certs. Or tests ignore ConnectionResetError / BrokenPipeError on the server side to handle clients that force-close the socket fd. To test TLS 1.3, OpenSSL 1.1.1-pre7-dev (git master + OpenSSL PR openssl/openssl#6340) is required. Signed-off-by: Christian Heimes <christian@python.org> (cherry picked from commit 529525f)
7bd7619 to
fbc16d1
Compare
Ok - good idea, I added tests for those scenarios. The only thing I'm not sure about is your last statement. If you call SSL_shutdown() again it will skip over any pending data until it gets the close_notify. I think this is consistent with historic behaviour. |
|
I think that historic behaviour is just wrong. What is the point of doing this bidirectional shutdown if not to make sure that both sides agree on what data has been sent? |
|
Hmm, SSL_shutdown() really doesn't have a way to indicate that there is still unprocessed data that should be read first. |
|
We actually recently changed the manpage to indicate that it can return <0 in that case, so I'm not sure why it's not doing it now, or what error it set. |
This PR changes the behaviour so that it honours SSL_MODE_AUTO_RETRY. Without that mode it ditches the data and returns -1. With that mode it ditches the data and tries again. |
|
Perhaps we should change the advice so that, when you call SSL_shutdown(), if you get 0 back then you should call SSL_read() until you get SSL_ERROR_ZERO_RETURN, and SSL_get_shutdown() indicates the connection is fully closed. |
|
I think that's what the manpage says now (in master) |
|
So I think the behaviour we want is that if there are non-application data records, just process them. If there is still application data we should return -1 and set SSL_ERROR_SYSCALL. |
If we return -1 when there is still application data to read I think that will break a lot of stuff. I imagine its fairly common to just call |
|
As far as I know, the history and current behaviour has been to return -1 when data is still available. This PR changes it to silently throw it away instead. |
No, we always threw it away. This PR changes it to auto-retry afterwards. |
|
Just to follow-up on my earlier comments: we've since moved BoringSSL to treating unexpected application data as a proper fatal error (no skipping over logic). We continue to skip over NewSessionTicket. It seems to be working out fine. |
|
I think it's fine, but I did not have time to actually review it.
|
ssl/record/rec_layer_s3.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an extra space there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
I really don't understand this code good enough to review it. |
ssl/record/rec_layer_s3.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, if auto retry is off you get a fatal error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to make sure to return SSL_ERROR_WANT_READ in case auto retry is off and the type was handshake data, like it's done below. That probably also means returning -1 for compatibility reasons.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It also seems trivial to add a call to SSLfatal() in case the type was application data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a handshake message then it should probably be processed, instead of discarded?
I mean something like that (or maybe more specific which message exactly are expected):
diff --git a/ssl/record/rec_layer_s3.c b/ssl/record/rec_layer_s3.c
index 61010f4..0b983df 100644
--- a/ssl/record/rec_layer_s3.c
+++ b/ssl/record/rec_layer_s3.c
@@ -1583,14 +1583,6 @@ int ssl3_read_bytes(SSL *s, int type, int *recvd_type, unsigned char *buf,
goto start;
}
- if (s->shutdown & SSL_SENT_SHUTDOWN) { /* but we have not received a
- * shutdown */
- s->rwstate = SSL_NOTHING;
- SSL3_RECORD_set_length(rr, 0);
- SSL3_RECORD_set_read(rr);
- return 0;
- }
-
if (SSL3_RECORD_get_type(rr) == SSL3_RT_CHANGE_CIPHER_SPEC) {
SSLfatal(s, SSL_AD_UNEXPECTED_MESSAGE, SSL_F_SSL3_READ_BYTES,
SSL_R_CCS_RECEIVED_EARLY);
@@ -1644,6 +1636,14 @@ int ssl3_read_bytes(SSL *s, int type, int *recvd_type, unsigned char *buf,
goto start;
}
+ if (s->shutdown & SSL_SENT_SHUTDOWN) { /* but we have not received a
+ * shutdown */
+ s->rwstate = SSL_NOTHING;
+ SSL3_RECORD_set_length(rr, 0);
+ SSL3_RECORD_set_read(rr);
+ return 0;
+ }
+
switch (SSL3_RECORD_get_type(rr)) {
default:
/*
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want to make sure to return SSL_ERROR_WANT_READ in case auto retry is off and the type was handshake data, like it's done below. That probably also means returning -1 for compatibility reasons.
Yes, this makes sense. I made that change.
It also seems trivial to add a call to SSLfatal() in case the type was application data.
I think this could be a breaking change if we did that. At the moment you can just keep calling SSL_shutdown(), ignoring any errors, until you eventually get a 1 return. If you call SSLfatal() it marks the SSL object as unusable, and any subsequent calls to SSL_shutdown() will immediately fail, i.e. you will never get a 1 return. We could just call SSLerr - but that might risk lots of stale errors on the queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we previously we considered that receiving application data during shutdown was a non fatal, recoverable, error. You want to change it into a fatal error. That's a behaviour change not a bug fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My recollection is that SSL_shutdown returns a SSL_ERROR_SSL.
And if an application is retrying SSL_shutdown it is unpredictable if
the SSL will recover or crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually returns SSL_ERROR_SYSCALL, because nobody bothered to actually say that there was an error, so we just return SSL_ERROR_SYSCALL. But both are fatal errors. First returning a fatal error and then returning it was successful after all doesn't make sense. Calling SSLfatal() should turn it into a SSL_ERROR_SSL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm....I suppose existing application code has to be able to handle fatal errors to deal with stuff like the underlying TCP connection closing without a close_notify...so this would appear the same as if that occurred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this case a fatal error as requested.
ssl/record/rec_layer_s3.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this code is moved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
me neither.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are in the half-shutdown state we are discarding handshake messages. Therefore we don't want to fill up the handshake fragment storage with data that we plan to discard. Therefore we move the fragment storage code until after decisions have been made about whether to discard the handshake data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If 2 people don't understand this, it at least seems to lack a comment explaining this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment.
|
New commit pushed addressing comments (where appropriate). |
If we've sent a close_notify and we're waiting for one back we drop incoming records until we see the close_notify we're looking for. If SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the next record. Fixes openssl#6262
In the case where we are shutdown for writing and awaiting a close_notify back from a subsequent SSL_shutdown() call we skip over handshake data that is received. This should not be treated as an error - instead it should be signalled with SSL_ERROR_WANT_READ.
Currently if you encounter application data while waiting for a close_notify from the peer, and you have called SSL_shutdown() then you will get a -1 return (fatal error) and SSL_ERROR_SYSCALL from SSL_get_error(). This isn't accurate (it should be SSL_ERROR_SSL) and isn't persistent (you can call SSL_shutdown() again and it might then work). We change this into a proper fatal error that is persistent.
fb0ee9c to
ab82e29
Compare
|
Updated pushed to address the outstanding comments. I also rebased to resolve a conflict with master. |
bernd-edlinger
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
test/sslapitest.c
Outdated
| * client needs to read the close_notify sent by the server. | ||
| */ | ||
| if (tst == 5 && !TEST_int_eq(SSL_shutdown(clientssl), -1)) | ||
| goto end; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the above 2 lines should get removed, tst == 5 is always false and is covered in the else below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| SSLfatal(s, SSL_AD_NO_ALERT, SSL_F_SSL3_READ_BYTES, | ||
| SSL_R_APPLICATION_DATA_AFTER_CLOSE_NOTIFY); | ||
| } | ||
| return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know -1 and 0 are really documented as the same, but old documentation was confusing. I suggest that we return -1 in the if and 0 on the else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value from this function does not necessarily translate to the return value from the top level function. -1 and 0 are documented as the same for SSL_read(), but if you've called SSL_read() then you won't get here anyway. (Almost) everywhere else in this function we return -1 after SSLfatal(), so it seems to make more sense to me to follow that. It makes little difference in practice since ssl3_shutdown() ignores the return value anyway.
| rbio = SSL_get_rbio(s); | ||
| BIO_clear_retry_flags(rbio); | ||
| BIO_set_retry_read(rbio); | ||
| } else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this always SSL3_RT_APPLICATION_DATA in the else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it isn't SSL3_RT_HANDSHAKE (because we handled that in the main branch of the if), and it isn't SSL3_RT_ALERT (because that is handled in the if immediately preceding this one). If it's anything other than SSL3_RT_APPLICATION_DATA then it's illegal in this context.
|
Fix up commit pushed. |
|
Pushed! Thanks. |
If we've sent a close_notify and we're waiting for one back we drop incoming records until we see the close_notify we're looking for. If SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the next record. Fixes #6262 Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)
Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)
In the case where we are shutdown for writing and awaiting a close_notify back from a subsequent SSL_shutdown() call we skip over handshake data that is received. This should not be treated as an error - instead it should be signalled with SSL_ERROR_WANT_READ. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)
Currently if you encounter application data while waiting for a close_notify from the peer, and you have called SSL_shutdown() then you will get a -1 return (fatal error) and SSL_ERROR_SYSCALL from SSL_get_error(). This isn't accurate (it should be SSL_ERROR_SSL) and isn't persistent (you can call SSL_shutdown() again and it might then work). We change this into a proper fatal error that is persistent. Reviewed-by: Bernd Edlinger <bernd.edlinger@hotmail.de> Reviewed-by: Kurt Roeckx <kurt@roeckx.be> (Merged from #6340)
If we've sent a close_notify and we're waiting for one back we drop
incoming records until we see the close_notify we're looking for. If
SSL_MODE_AUTO_RETRY is on, then we should immediately try and read the
next record.
Fixes #6262