quiche: support downstream using SDS#15821
Conversation
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
|
I'll wait until you have CI sorted. |
| QuicTransportSocketFactoryStats stats_; | ||
|
|
||
| private: | ||
| std::unique_ptr<Ssl::ContextConfig> context_config_; |
There was a problem hiding this comment.
I don't have the full picture regarding the client context. However, if you can put context_config_ to derived QuicServerTransportSocketFactory and QuicClientTransportSocketFactory, the dynamic cast can be eliminated.
There was a problem hiding this comment.
Actually I moved it from subclasses to the base class so that they can have the SDS update callback setup in the base class :)
Doing so is just to have the same behavior on both endpoints. Indeed it brings up the need of dynamic_cast in subclasses. I don't have strong preference for either way.
There was a problem hiding this comment.
After thinking about it more, I changed the ownership to subclasses as the client subclass may need to do extra work in the future. Thus avoiding the dynamic_cast. Thanks for pointing out the issue here!
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
Signed-off-by: Dan Zhang <danzh@google.com>
alyssawilk
left a comment
There was a problem hiding this comment.
main merge should fix verify_examples, sorry for the breakage
| getSdsTestsParams(bool test_quic) { | ||
| std::vector<std::tuple<Network::Address::IpVersion, Grpc::ClientType, bool>> ret; | ||
| for (auto ip_version : TestEnvironment::getIpVersionsForTest()) { | ||
| for (auto grpc_type : {Grpc::ClientType::EnvoyGrpc, Grpc::ClientType::GoogleGrpc}) { |
There was a problem hiding this comment.
Let's add a utility for this, like getIpVersionsFor test, which getsGrpcVersionsForTest based on what's compiled in.
There was a problem hiding this comment.
done, also updated grpc integration tests to use that.
|
|
||
| ret.push_back({ip_version, grpc_type, false}); | ||
| if (test_quic) { | ||
| #ifdef ENVOY_ENABLE_QUIC |
There was a problem hiding this comment.
I also think we should have one for client / upstream protocols but I can take that on, since I didn't add the correct utility over in https://github.com/envoyproxy/envoy/pull/15853/files
| transport_socket_config.mutable_downstream_tls_context()->mutable_common_tls_context(); | ||
| configInlinedCerts(common_tls_context); | ||
| transport_socket->set_name("envoy.transport_sockets.quic"); | ||
| transport_socket->mutable_typed_config()->PackFrom(transport_socket_config); |
There was a problem hiding this comment.
I feel like we're getting the quic vs tls stuff in a lot of places and I'd like to reduce that. Can you think of something cleaner? Passing in the context to a lambda and having it fill in the right one? Setting the h2 tls context by default and having a convertIfHttp3 function which moved it over and set the correct transport socket name?
There was a problem hiding this comment.
I added a utility function in ConfigHelper to distinguish ssl v.s quic.
There was a problem hiding this comment.
Should we use the new test helper here?
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
Signed-off-by: Dan Zhang <danzh@google.com>
alyssawilk
left a comment
There was a problem hiding this comment.
LGTM modulo that one typo. Passing to Snow for second pass
test/config/utility.cc
Outdated
| } | ||
| } | ||
|
|
||
| void ConfigHelper::configDownstreamTransportSocketWitTls( |
Signed-off-by: Dan Zhang <danzh@google.com>
snowp
left a comment
There was a problem hiding this comment.
Overall looks reasonable to me, just a few suggestions/questions
| const Ssl::ClientContextConfig& clientContextConfig() const { return *config_; } | ||
| const Ssl::ClientContextConfig& clientContextConfig() const { return config_; } |
There was a problem hiding this comment.
I think an alternative way to do this would be to have this getter do
ASSERT(dynamic_cast<...> != nullptr);
return static_cast<...>(contextConfig());
That way we avoid storing a separate ref on the derived classes while still avoiding a dynamic cast in prod builds?
| QuicTransportSocketFactoryStats stats_; | ||
|
|
||
| private: | ||
| std::unique_ptr<Ssl::ContextConfig> context_config_; |
There was a problem hiding this comment.
It shouldn't be const. setSecretUpdateCallback() is non-const method. But all these stuff has been moved to subclasses.
There was a problem hiding this comment.
I was thinking more about const unique_ptr, not unique_ptr<const ...>, since the pointer doesn't change
test/config/utility.cc
Outdated
| transport_socket->set_name("envoy.transport_sockets.quic"); | ||
| envoy::extensions::transport_sockets::quic::v3::QuicDownstreamTransport | ||
| quic_transport_socket_config; | ||
| fill_tls_stuff(*quic_transport_socket_config.mutable_downstream_tls_context() |
There was a problem hiding this comment.
maybe make this name a bit more explicit? configureTlsContext?
| public: | ||
| SdsDynamicIntegrationBaseTest() | ||
| : HttpIntegrationTest(Http::CodecClient::Type::HTTP1, ipVersion()), | ||
| : HttpIntegrationTest(Http::CodecClient::Type::HTTP1, std::get<0>(GetParam())), |
There was a problem hiding this comment.
seems easier to read when this was ipVersion()?
There was a problem hiding this comment.
ipVersion() is a base class member. We haven't construct the base class here yet. Switched to use TestParams.ip_version for readability.
| : public Grpc::BaseGrpcClientIntegrationParamTest, | ||
| public HttpIntegrationTest, | ||
| public testing::TestWithParam< | ||
| std::tuple<Network::Address::IpVersion, Grpc::ClientType, bool>> { |
There was a problem hiding this comment.
Maybe document what the gRPC test parameter is used for? Also what the bool refer to
There was a problem hiding this comment.
I change the tuple to a struct. Hope it helps readability.
| transport_socket_config.mutable_downstream_tls_context()->mutable_common_tls_context(); | ||
| configInlinedCerts(common_tls_context); | ||
| transport_socket->set_name("envoy.transport_sockets.quic"); | ||
| transport_socket->mutable_typed_config()->PackFrom(transport_socket_config); |
There was a problem hiding this comment.
Should we use the new test helper here?
Signed-off-by: Dan Zhang <danzh@google.com>
| stats_.context_config_update_by_sds_.inc(); | ||
| : stats_(generateStats(store, perspective)) { | ||
| context_config.setSecretUpdateCallback([this]() { | ||
| // The callback also triggers updating config_ with the new secret. |
| // TODO(danzh) Client transport socket factory may also need to update quic crypto. | ||
| stats_.context_config_update_by_sds_.inc(); | ||
| : stats_(generateStats(store, perspective)) { | ||
| context_config.setSecretUpdateCallback([this]() { |
There was a problem hiding this comment.
will this ever be immediately triggered? we'd call a virtual function in the ctor in that case
There was a problem hiding this comment.
Yes, if triggerred in the callstack, it will call base class version of the virtual function which makes sense as the derived class hasn't use context_config to setup anything yet. Is this not desired in Envoy?
There was a problem hiding this comment.
https://google.github.io/styleguide/cppguide.html#Doing_Work_in_Constructors
It tends to lead to surprising behavior and is often not resilient to refactors, so I would avoid it if possible
There was a problem hiding this comment.
Got it! I added initialize() to add the callback after construction.
| std::get<2>(p.param) ? "UsesQuic" : "UsesTcp"); | ||
| struct TestParams { | ||
| Network::Address::IpVersion ip_version; | ||
| Grpc::ClientType grpc_type; |
There was a problem hiding this comment.
sds_grpc_type? to make it clear that this relates to the SDS client, not the downstream client of the integration test
Signed-off-by: Dan Zhang <danzh@google.com>
|
/retest |
|
Retrying Azure Pipelines: |
|
Ping @snowp |
Signed-off-by: Dan Zhang <danzh@google.com>
snowp
left a comment
There was a problem hiding this comment.
Thanks this is looking pretty good! A few more comments
| if (!config_->isReady()) { | ||
| ENVOY_LOG(warn, "SDS hasn't finished updating Ssl context config yet."); | ||
| stats_.downstream_context_secrets_not_ready_.inc(); | ||
| return {}; | ||
| } | ||
| return config_->tlsCertificates(); |
There was a problem hiding this comment.
For my understanding when would this happen? Do we not block the usage of this on the SDS config being ready?
There was a problem hiding this comment.
I think we do block. But I saw similar logic in SSL socket to create a dummy socket so might as well added this.
| @@ -1,5 +1,6 @@ | |||
| #include <memory> | |||
| #include <string> | |||
| #include <type_traits> | |||
There was a problem hiding this comment.
I don't see anything from type_traits being introduced in this PR, can this be removed?
| common_tls_context.add_alpn_protocols(downstream_protocol_ == Http::CodecClient::Type::HTTP1 | ||
| ? Http::Utility::AlpnNames::get().Http11 | ||
| : Http::Utility::AlpnNames::get().Http3); |
There was a problem hiding this comment.
Should this use test_quic instead? I think this logic just matches the logic above that uses test_quic? Not clear without that why we do this remapping
There was a problem hiding this comment.
Yes, test_quic_ and downstream_protocol_ are consistent. Switched to use test_quic_ as other places in this test do.
| ? "quic_server_transport_socket_factory.context_config_update_by_sds" | ||
| : "server_ssl_socket_factory.ssl_context_update_by_sds"), |
There was a problem hiding this comment.
This makes me wonder whether the stats name should match? ie make the stat introduced in this PR ssl_context_update_by_sds? Or does that not fit well with the quic naming of things?
There was a problem hiding this comment.
ssl_context is a SslSocket thing. Quic transport socket factory uses its config directly instead of generating an ssl_context object. So I think use ssl_context in the stats name is confusing.
| client_ssl_ctx_ = createClientSslTransportSocketFactory({}, context_manager_, *api_); | ||
| } | ||
|
|
||
| void configInlinedCerts( |
|
|
||
| void configInlinedCerts( | ||
| envoy::extensions::transport_sockets::tls::v3::CommonTlsContext* common_tls_context) { | ||
| common_tls_context->add_alpn_protocols(Http::Utility::AlpnNames::get().Http11); |
There was a problem hiding this comment.
Maybe pull this out or rename function to make it clear that we're also setting the ALPN?
Signed-off-by: Dan Zhang <danzh@google.com>
snowp
left a comment
There was a problem hiding this comment.
LGTM! I see @antoniovicente assigned himself so I'll let him take a look as well
I mostly added myself as a reviewer due to Alyssa being OOO this week. I did a quick pass and think that the changes in this PR look good. |
Signed-off-by: Dan Zhang <danzh@google.com> Signed-off-by: Gokul Nair <gnair@twitter.com>
Commit Message: Update quic server transport socket factory upon SDS updates. Add stats for SDS updates. Add checking of nullptr
Additional Description: fix UDP listener to register xDS with server init_manager.
Risk Level: low
Testing: enable downstream HTTP3 in sds_dynamic_integration_test.cc
Fixes #15034
Part of #12930 #2557