SDS onConfigUpdate ignores resource version #7676

@mosesn

Description:
I'm setting up SDS over REST-JSON, and we're having trouble getting Envoy to acknowledge that a cert has changed: we send back a response that should make it re-read the certs, but nothing happens. We're sending a Secret resource that we build like this:

    // Both DataSources point at files on disk; only the file contents change.
    Secret.newBuilder()
        .setName(resourceName)
        .setTlsCertificate(TlsCertificate.newBuilder()
            .setCertificateChain(DataSource.newBuilder().setFilename(certPath))
            .setPrivateKey(DataSource.newBuilder().setFilename(privateKeyPath)))
        .build()

And since certPath and privateKeyPath don't change (only the file contents have changed), the Secret message is byte-for-byte identical to the one we sent before.
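My reading of the code (not authoritative) is that the SDS subscription deduplicates updates by hashing the received Secret message itself and ignores version_info entirely, which would explain why an identical filename-based Secret never triggers a reload. A minimal Python model of that behavior, with hypothetical names:

```python
import hashlib
import json

def secret_hash(secret: dict) -> str:
    # Stand-in for Envoy hashing the deserialized Secret proto; note the
    # DiscoveryResponse's version_info is not part of the hash.
    return hashlib.sha256(json.dumps(secret, sort_keys=True).encode()).hexdigest()

class SdsSubscription:
    """Toy model: apply an update only when the Secret bytes change."""
    def __init__(self):
        self.last_hash = None
        self.updates = 0

    def on_config_update(self, secret: dict, version_info: str) -> bool:
        h = secret_hash(secret)
        if h == self.last_hash:
            return False  # identical Secret: "Secret is updated." never logs
        self.last_hash = h
        self.updates += 1
        return True

# A filename-based Secret: re-sending it after rotating the files on disk
# produces an identical message, so the second update is dropped.
secret = {"name": "my_cert", "tls_certificate": {
    "certificate_chain": {"filename": "/etc/certs/cert.pem"},
    "private_key": {"filename": "/etc/certs/key.pem"}}}

sub = SdsSubscription()
sub.on_config_update(secret, version_info="1")  # True: first update applies
sub.on_config_update(secret, version_info="2")  # False: new version, same bytes
```

If this model is right, bumping version_info alone can never force a re-read of the files.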

I noticed that I never see the log message ENVOY_LOG(debug, "Secret is updated."); from https://github.com/envoyproxy/envoy/blob/v1.10.0/source/extensions/transport_sockets/tls/ssl_socket.cc#L495 when the SDS server replies, even though I'm running at debug log level.

I would expect the /certs endpoint to reflect that the cert has changed, and I would expect it to use the new cert when trying to talk to the upstream (I haven't checked whether it also behaves this way for the downstream). Neither of those seems to be happening.

We're using Envoy v1.10.0.

Repro steps:
Spin up an envoy service with a config file pointing to an SDS service speaking REST-JSON. Change the cert and key that it's using for an upstream, and reply with a Secret that tells it to inspect the same path it was using before.
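One workaround that sidesteps the repro above (assuming the dedup really is content-based) is to have the SDS server inline the rotated PEM bytes instead of a filename, so the Secret message itself changes whenever the cert changes. A hedged Python sketch of building the REST-JSON DiscoveryResponse body; the field names follow the v2 auth.Secret / core.DataSource JSON mapping, and the resource name and PEM values are placeholders:

```python
import base64
import json

def build_sds_response(resource_name: str, cert_pem: bytes, key_pem: bytes,
                       version: str) -> str:
    """Build a REST-JSON DiscoveryResponse whose Secret carries the PEM
    contents inline (inline_bytes is base64-encoded in the JSON mapping),
    so the resource bytes change whenever the cert on disk changes."""
    secret = {
        "@type": "type.googleapis.com/envoy.api.v2.auth.Secret",
        "name": resource_name,
        "tls_certificate": {
            "certificate_chain": {
                "inline_bytes": base64.b64encode(cert_pem).decode()},
            "private_key": {
                "inline_bytes": base64.b64encode(key_pem).decode()},
        },
    }
    return json.dumps({"version_info": version, "resources": [secret]})

# Rotated contents always yield a distinct resource, unlike a fixed filename.
old = build_sds_response("my_cert", b"-----OLD CERT-----", b"-----OLD KEY-----", "1")
new = build_sds_response("my_cert", b"-----NEW CERT-----", b"-----NEW KEY-----", "2")
assert old != new
```

This doesn't answer why the filename-based flow ignores version_info, but it at least makes the update observable.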

Admin and Stats Output:

$ curl -s localhost:9900/stats | grep sds
cluster.$CLUSTER_NAME.client_ssl_socket_factory.ssl_context_update_by_sds: 1
listener.0.0.0.0_$PORT.server_ssl_socket_factory.ssl_context_update_by_sds: 1

Config:

          "tls_certificate_sds_secret_configs": [{
            "name": "$RESOURCE_NAME",
            "sds_config": {
              "api_config_source": {
                "api_type": "REST",
                "cluster_names": ["$CLUSTER_NAME"],
                "refresh_delay": "0.001s",
                "request_timeout": "86400s"
              }
            }
          }]

Logs:

[2019-07-22 15:12:28.161][162526][debug][router] [external/envoy/source/common/router/router.cc:717] [C0][S17316378957337769216] upstream headers complete: end_stream=false
[2019-07-22 15:12:28.161][162526][debug][http] [external/envoy/source/common/http/async_client_impl.cc:94] async http request response headers (end_stream=false):
':status', '200'
'content-length', '454'
'x-envoy-upstream-service-time', '352260'

[2019-07-22 15:12:28.161][162526][debug][client] [external/envoy/source/common/http/codec_client.cc:95] [C1] response complete
[2019-07-22 15:12:28.161][162526][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:202] [C1] response complete
[2019-07-22 15:12:28.161][162526][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:240] [C1] moving to ready
[2019-07-22 15:12:28.163][162526][debug][config] [bazel-out/k8-fastbuild/bin/external/envoy/source/common/config/_virtual_includes/http_subscription_lib/common/config/http_subscription_impl.h:78] Sending REST request for /v2/discovery:secrets
[2019-07-22 15:12:28.163][162526][debug][router] [external/envoy/source/common/router/router.cc:320] [C0][S16990936995685964819] cluster 'zerofx_helper' match for URL '/v2/discovery:secrets'
[2019-07-22 15:12:28.163][162526][debug][router] [external/envoy/source/common/router/router.cc:381] [C0][S16990936995685964819] router decoding headers:
':method', 'POST'
':path', '/v2/discovery:secrets'
':authority', 'zerofx_helper'
':scheme', 'http'
'content-type', 'application/json'
'content-length', '198'
'x-envoy-internal', 'true'
'x-forwarded-for', '10.70.198.116'
'x-envoy-expected-rq-timeout-ms', '86400000'

[2019-07-22 15:12:28.163][162526][debug][pool] [external/envoy/source/common/http/http1/conn_pool.cc:97] [C1] using existing connection
[2019-07-22 15:12:28.163][162526][debug][router] [external/envoy/source/common/router/router.cc:1165] [C0][S16990936995685964819] pool ready
[2019-07-22 15:12:29.989][162526][debug][client] [external/envoy/source/common/http/codec_client.cc:95] [C7] response complete
[2019-07-22 15:12:29.989][162526][debug][hc] [external/envoy/source/common/upstream/health_checker_impl.cc:226] [C7] hc response=200 health_flags=healthy
[2019-07-22 15:12:29.989][162526][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:577] [C7] stream closed: 0
[2019-07-22 15:12:29.992][162526][debug][client] [external/envoy/source/common/http/codec_client.cc:95] [C8] response complete
[2019-07-22 15:12:29.992][162526][debug][hc] [external/envoy/source/common/upstream/health_checker_impl.cc:226] [C8] hc response=200 health_flags=healthy
[2019-07-22 15:12:29.992][162526][debug][http2] [external/envoy/source/common/http/http2/codec_impl.cc:577] [C8] stream closed: 0
[2019-07-22 15:12:29.992][162526][debug][client] [external/envoy/source/common/http/codec_client.cc:95] [C10] response complete
[2019-07-22 15:12:29.992][162526][debug][hc] [external/envoy/source/common/upstream/health_checker_impl.cc:226] [C10] hc response=200 health_flags=healthy

Edit: I tried building it in a non-long-polled fashion and it still exhibits the error, so I've removed all mentions of long polling from my description. Where the config above sets refresh_delay to 1ms, I also tried a refresh delay of 5s with short polling, and the behavior is the same.
