Skip to content

tls: fix error-reporting in doSynchronousVerifyCertChain#23319

Merged
ggreenway merged 2 commits intoenvoyproxy:mainfrom
davidben:sync-verify-errors
Oct 3, 2022
Merged

tls: fix error-reporting in doSynchronousVerifyCertChain#23319
ggreenway merged 2 commits intoenvoyproxy:mainfrom
davidben:sync-verify-errors

Conversation

@davidben
Copy link
Copy Markdown
Contributor

@davidben davidben commented Sep 29, 2022

This PR is a preparatory fix for PR #23320.

The intent was to upload a pair of dependent PRs. I've made them into separate PRs because it looked like you all prefer one commit per PR (since PR descriptions ask for a commit message), and it seemed worth keeping the changes ultimately separate. Let me know if this isn't the preferred workflow.


Commit Message:

This fixes a bug in the pre-tls_async_cert_validation codepath. The tls_async_cert_validation codepath is unaffected.

Per both OpenSSL and BoringSSL documentation, if you use SSL_CTX_set_cert_verify_callback, the library expects to find the error code in X509_STORE_CTX. This error determines the TLS alert. X509_verify_cert does this for you, but Envoy's callback was not doing this when doSynchronousVerifyCertChain runs the extra checks in verifyCertAndUpdateStatus. This change fixes this and sets the generic X509_V_ERR_APPLICATION_VERIFICATION.

Before this commit, the error would usually be X509_V_OK, but sometimes it would be something else. Envoy implements the
allow_expired_certificate using the X509_STORE_CTX verify callback, via ignoreCertificateExpirationCallback. This callback, when it suppresses an error, will sometimes leave stray errors in X509_STORE_CTX. These are usually benign, but if Envoy then does extra checks, which fail, and then report some other failure, it will inadvertantly return the stray error to BoringSSL!

That is why FailedClientCertAllowExpiredBadHashVerification expected a SSLV3_ALERT_CERTIFICATE_EXPIRED alert. The alert makes no sense because the connection sets allow_expired_certificate! Rather, the test actually caught an Envoy bug, but #6018 mistakenly codified it as expected behavior, rather than correctly diagnosing it as a bug. This commit fixes that bug.

Unfortunately, OpenSSL/BoringSSL map X509_V_ERR_APPLICATION_VERIFICATION to handshake_failure rather than certificate_unknown, so this does change the alert returned by the pre-tls_async_cert_validation codepath in other cases. (certificate_unknown came from X509_V_OK hitting the default in a switch-case.) Arguably certificate_unknown is a better alert than handshake_failure, so I'm slightly inclined to fix BoringSSL to map that differently. But since Envoy needs to support an old BoringSSL, I've just updated the expectations for now. Either way, tls_async_cert_validation gives full control over the alert and will make this moot.

Signed-off-by: David Benjamin davidben@google.com

Additional Description:
Risk Level: Low
Testing: bazelist test //test/extensions/transport_sockets/tls
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A

This fixes a bug in the pre-tls_async_cert_validation codepath. The
tls_async_cert_validation codepath is unaffected.

Per both OpenSSL and BoringSSL documentation, if you use
SSL_CTX_set_cert_verify_callback, the library expects to find the error
code in X509_STORE_CTX. This error determines the TLS alert.
X509_verify_cert does this for you, but Envoy's callback was not doing
this when doSynchronousVerifyCertChain runs the extra checks in
verifyCertAndUpdateStatus. This change fixes this and sets the generic
X509_V_ERR_APPLICATION_VERIFICATION.

Before this commit, the error would usually be X509_V_OK, but sometimes
it would be something else. Envoy implements the
allow_expired_certificate using the X509_STORE_CTX verify callback, via
ignoreCertificateExpirationCallback. This callback, when it suppresses
an error, will sometimes leave stray errors in X509_STORE_CTX. These are
usually benign, *but* if Envoy then does extra checks, which fail, and
then report some *other* failure, it will inadvertantly return the stray
error to BoringSSL!

That is why FailedClientCertAllowExpiredBadHashVerification expected a
SSLV3_ALERT_CERTIFICATE_EXPIRED alert. The alert makes no sense because
the connection sets allow_expired_certificate! Rather, the test actually
caught an Envoy bug, but envoyproxy#6018
mistakenly codified it as expected behavior, rather than correctly
diagnosing it as a bug. This commit fixes that bug.

Unfortunately, OpenSSL/BoringSSL map X509_V_ERR_APPLICATION_VERIFICATION
to handshake_failure rather than certificate_unknown, so this does
change the alert returned by the pre-tls_async_cert_validation codepath
in other cases. (certificate_unknown came from X509_V_OK hitting the
default in a switch-case.) Arguably certificate_unknown is a better
alert than handshake_failure, so I'm slightly inclined to fix BoringSSL
to map that differently. But since Envoy needs to support an old
BoringSSL, I've just updated the expectations for now. Either way,
tls_async_cert_validation gives full control over the alert and will
make this moot.

Signed-off-by: David Benjamin <davidben@google.com>
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! Can you please add a release note here?

Signed-off-by: David Benjamin <davidben@google.com>
@davidben
Copy link
Copy Markdown
Contributor Author

davidben commented Oct 3, 2022

Can you please add a release note here?

Done. Let me know if did that wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants