xDS: inconsistent stats when sending route before listener #11871

@samflattery

Description
I'm developing an xDS fuzzer that tests how Envoy handles sequences of dynamic xDS updates. Envoy ignores any RouteConfiguration update that arrives before a listener has been added, so adding a listener after sending a route config should leave that listener in a warming state. Instead, the stats and the config dump disagree: the config dump reports the listener as warming, while the stats report it as active.
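
To make the consistency requirement concrete, here is a minimal sketch of the check that fails. `DumpedListener`, `ListenerGauges`, and `statsMatchConfigDump` are hypothetical names used only for illustration; they are not Envoy types. The two gauge fields mirror `listener_manager.total_listeners_warming` and `listener_manager.total_listeners_active`.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified view of one listener entry in a config dump.
struct DumpedListener {
  std::string name;
  bool warming; // true if the dump lists it as warming, false if active
};

// Hypothetical snapshot of the two listener_manager gauges.
struct ListenerGauges {
  std::uint64_t total_listeners_warming;
  std::uint64_t total_listeners_active;
};

// Returns true only if the per-state counts derived from the config dump
// equal the values reported by the gauges.
bool statsMatchConfigDump(const std::vector<DumpedListener>& dump,
                          const ListenerGauges& gauges) {
  std::uint64_t warming = 0;
  std::uint64_t active = 0;
  for (const auto& listener : dump) {
    if (listener.warming) {
      ++warming;
    } else {
      ++active;
    }
  }
  return warming == gauges.total_listeners_warming &&
         active == gauges.total_listeners_active;
}
```

With the values observed in this issue (one warming listener in the dump, gauges reporting 0 warming and 1 active), this check returns false.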

Reproducing
The sequence of steps is:

  1. Respond to CDS and EDS request, build cluster_0
  2. Send RouteConfiguration DiscoveryResponse, build route_config_0
  3. Send Listener DiscoveryResponse, add listener_0 referencing route_config_0 above
  4. Check the stats and the config dump: listener_0 is warming in the config dump but active in the stats
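
The behavior I expect from these steps can be sketched as a toy model. This is not Envoy's listener manager: `ToyListenerManager` and its methods are hypothetical and only encode the assumption that a listener stays warming until the route config it references is delivered after the listener has subscribed to it.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

enum class ListenerState { Warming, Active };

// Toy model only; names and structure are assumptions for illustration.
class ToyListenerManager {
public:
  // A route config that arrives before any listener watches it is dropped,
  // mirroring the "Ignoring unwatched type URL" warning in the logs.
  void onRouteConfig(const std::string& route_name) {
    if (watched_routes_.count(route_name) == 0) {
      return;
    }
    received_routes_.insert(route_name);
    promoteReadyListeners();
  }

  // A new listener starts warming and begins watching its route config.
  void addListener(const std::string& name, const std::string& route_name) {
    watched_routes_.insert(route_name);
    listeners_[name] = Entry{route_name, ListenerState::Warming};
    promoteReadyListeners();
  }

  // Number of listeners currently in the given state.
  std::size_t count(ListenerState state) const {
    std::size_t n = 0;
    for (const auto& kv : listeners_) {
      if (kv.second.state == state) {
        ++n;
      }
    }
    return n;
  }

private:
  struct Entry {
    std::string route;
    ListenerState state;
  };

  // Move a listener to active once its route config has been received.
  void promoteReadyListeners() {
    for (auto& kv : listeners_) {
      if (received_routes_.count(kv.second.route) > 0) {
        kv.second.state = ListenerState::Active;
      }
    }
  }

  std::map<std::string, Entry> listeners_;
  std::set<std::string> watched_routes_;
  std::set<std::string> received_routes_;
};
```

In this model, calling `onRouteConfig("route_config_0")` and then `addListener("listener_0", "route_config_0")` leaves one warming listener and zero active ones, which is what the test asserts against the real gauges.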

I reproduced this in a testcase here: https://github.com/samflattery/envoy/blob/ddbae7f4ebdf2dfff20a997784adf7961077494b/test/integration/ads_integration_test.cc#L1132

Note
This is a similar issue to #7431

Logs

[2020-07-02 10:16:01.871][1307][info][main] [source/server/server.cc:640] starting main dispatch loop
[2020-07-02 10:16:01.901][1307][info][upstream] [source/common/upstream/cds_api_impl.cc:78] cds: add 1 cluster(s), remove 2 cluster(s)
[2020-07-02 10:16:01.944][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.Cluster and version 1
[2020-07-02 10:16:01.946][1307][info][upstream] [source/common/upstream/cds_api_impl.cc:94] cds: add/update cluster 'cluster_0'
[2020-07-02 10:16:01.946][1307][info][upstream] [source/common/upstream/cluster_manager_impl.cc:152] cm init: initializing secondary clusters
[2020-07-02 10:16:01.949][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.ClusterLoadAssignment and version 1
[2020-07-02 10:16:01.950][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.Cluster and version 1
[2020-07-02 10:16:01.951][1307][info][upstream] [source/common/upstream/cluster_manager_impl.cc:180] cm init: all clusters initialized
[2020-07-02 10:16:01.951][1307][info][main] [source/server/server.cc:619] all clusters initialized. initializing init manager
[2020-07-02 10:16:01.952][1307][warning][config] [source/common/config/grpc_mux_impl.cc:175] Ignoring unwatched type URL type.googleapis.com/envoy.api.v2.RouteConfiguration
[2020-07-02 10:16:01.997][1307][info][upstream] [source/server/lds_api.cc:78] lds: add/update listener 'listener_0'
[2020-07-02 10:16:01.999][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.Listener and version 1
[2020-07-02 10:16:02.000][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.ClusterLoadAssignment and version 1
[2020-07-02 10:16:02.000][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type  and version 1
[2020-07-02 10:16:02.000][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.RouteConfiguration and version 1
[2020-07-02 10:16:02.001][15][info][misc] [test/integration/ads_integration.cc:136] Received gRPC message with type type.googleapis.com/envoy.api.v2.Listener and version 1
test/integration/ads_integration_test.cc:1163: Failure
Expected equality of these values:
  test_server_->gauge("listener_manager.total_listeners_warming")->value()
    Which is: 0
  1
Stack trace:
  0x225af57: Envoy::AdsIntegrationTest_RouteSentBeforeListeners_Test::TestBody()
  0x50670a4: testing::internal::HandleSehExceptionsInMethodIfSupported<>()
  0x505797b: testing::internal::HandleExceptionsInMethodIfSupported<>()
  0x50453d3: testing::Test::Run()
  0x5045d97: testing::TestInfo::Run()
... Google Test internal frames ...

test/integration/ads_integration_test.cc:1164: Failure
Expected equality of these values:
  test_server_->gauge("listener_manager.total_listeners_active")->value()
    Which is: 1
  0

Edit: this also fails when only the listener is sent, without sending the route first. It seems that when a new listener is added that should be warming, the active gauge is incremented instead of the warming gauge.
