stats: add integration test to flag missing tag extraction rules#21397
stats: add integration test to flag missing tag extraction rules#21397ggreenway merged 16 commits intoenvoyproxy:mainfrom
Conversation
This is a rough heuristic, but may catch many cases of missing rules. Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
@jmarantz after seeing many new stats added without correct tag extraction rules, I wrote this. Pros:
Cons:
What do you think? Is there a better way to do this, that is likely to be hit for new code with a developer who doesn't know about the tag extraction rules? |
jmarantz
left a comment
There was a problem hiding this comment.
code seems fine ; I wasn't aware of this stat_prefix convention (or had forgotten) and so I think I'm not understanding the premise of what you are checking.
| std::vector<std::string> stat_prefixes; | ||
| Json::ObjectCallback find_stat_prefix = [&](const std::string& name, | ||
| const Json::Object& root) -> bool { | ||
| if (name == "stat_prefix") { |
There was a problem hiding this comment.
where does this magic "stat_prefix" come from? Could this be a string-constant?
There was a problem hiding this comment.
stat_prefix is the conventional name in the config protos.
$ grep -R stat_prefix api/envoy/*
api/envoy/api/v2/core/grpc_service.proto: string stat_prefix = 4 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/listener/v3/listener.proto: // `listener.<stat_prefix>.`.
api/envoy/config/listener/v3/listener.proto: string stat_prefix = 28;
api/envoy/config/core/v3/grpc_service.proto: string stat_prefix = 4 [(validate.rules).string = {min_len: 1}];
api/envoy/config/filter/udp/udp_proxy/v2alpha/udp_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/rbac/v2/rbac.proto: string stat_prefix = 3 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/mongo_proxy/v2/mongo_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/zookeeper_proxy/v1alpha1/zookeeper_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/redis_proxy/v2/redis_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/http_connection_manager/v2/http_connection_manager.proto: string stat_prefix = 2 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/dubbo_proxy/v2alpha1/dubbo_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/ext_authz/v2/ext_authz.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/local_rate_limit/v2alpha/local_rate_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/kafka_broker/v2alpha1/kafka_broker.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/rate_limit/v2/rate_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/client_ssl_auth/v2/client_ssl_auth.proto: string stat_prefix = 2 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/mysql_proxy/v1alpha1/mysql_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/thrift_proxy/v2alpha1/thrift_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/extensions/matching/input_matchers/ip/v3/ip.proto: string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/udp/dns_filter/v3/dns_filter.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/udp/udp_proxy/v3/udp_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/rbac/v3/rbac.proto: string shadow_rules_stat_prefix = 5;
api/envoy/extensions/filters/network/rbac/v3/rbac.proto: string stat_prefix = 3 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/local_ratelimit/v3/local_rate_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/connection_limit/v3/connection_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/tcp_proxy/v3/tcp_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/mongo_proxy/v3/mongo_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/zookeeper_proxy/v3/zookeeper_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/redis_proxy/v3/redis_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto: string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/dubbo_proxy/v3/dubbo_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/ext_authz/v3/ext_authz.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/ratelimit/v3/rate_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/client_ssl_auth/v3/client_ssl_auth.proto: string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/meta_protocol_proxy/v3/meta_protocol_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/thrift_proxy/v3/thrift_proxy.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/rbac/v3/rbac.proto: string shadow_rules_stat_prefix = 3;
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/compressor/v3/compressor.proto: // `<stat_prefix>.compressor.<compressor_library.name>.<compressor_library_stat_prefix>.response.*`
api/envoy/extensions/filters/http/compressor/v3/compressor.proto: // `<stat_prefix>.compressor.<compressor_library.name>.<compressor_library_stat_prefix>.*`.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto: // stat_prefix: waf # This emits ext_authz.waf.ok, ext_authz.waf.denied, etc.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto: // stat_prefix: blocker # This emits ext_authz.blocker.ok, ext_authz.blocker.denied, etc.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto: string stat_prefix = 13;
api/envoy/extensions/filters/http/bandwidth_limit/v3/bandwidth_limit.proto: string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto: string stat_prefix = 8;
There was a problem hiding this comment.
Maybe add a comment here that this is based on precedent but isn't guaranteed to be exhaustive? We might need to expand this in the future?
| } | ||
| } | ||
| }; | ||
| check_metric_array(test_server_->counters()); |
There was a problem hiding this comment.
nit; could use test_server_->forEachCounter here and void making the intermediate array. Not sure if that would be material.
| for (const auto& metric : vec) { | ||
| const std::string tag_extracted_name = metric->tagExtractedName(); | ||
| for (const std::string& stat_prefix : stat_prefixes) { | ||
| EXPECT_EQ(tag_extracted_name.find(stat_prefix), std::string::npos) |
There was a problem hiding this comment.
I don't understand how just doing a total substring search for the prefixes in the tag-extracted name tells you whether the tag-extraction regexes are complete.
There was a problem hiding this comment.
The idea is that if the stat_prefix is in the tag-extracted name, then it wasn't extracted, which I believe is always an error (although there are many such cases in the codebase right now).
Or at least that's how I use stat_prefix. Maybe others use it completely differently, and this isn't correct. I'd like to hear opinions on this.
There was a problem hiding this comment.
Also, this assumes that the configured stat_prefix in the integration tests is a value that is unique enough that it isn't a substring of any other part of the stat name.
There was a problem hiding this comment.
Another approach to this test, that may be more robust, is to instead look at only the extracted tags, and verify that the value configured for stat_prefix is in one of the extracted tags, and ignore the tagExtractedName. That may more clearly express what I'm trying to accomplish.
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
/retest |
|
Retrying Azure Pipelines: |
|
Greg should I review now or wait till CI is clean? |
|
/wait |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
jmarantz
left a comment
There was a problem hiding this comment.
lgtm basically modulo minor nits.
Does it make sense to just get one other maintainer's opinion on this plan generally? I'll leave it up to you on all counts.
| AccessLogIntegrationTest() : HttpIntegrationTest(Http::CodecType::HTTP1, ipVersion()) {} | ||
| AccessLogIntegrationTest() : HttpIntegrationTest(Http::CodecType::HTTP1, ipVersion()) { | ||
| // TODO(ggreenway): add tag extraction rules. | ||
| // Missing stat tag-extraction rule for stat 'grpc.accesslog.streams_closed_1' and stat_prefix |
There was a problem hiding this comment.
is that the right stat-name or is this copy/pasted from another file?
There was a problem hiding this comment.
I believe this is correct. A bunch of tests with outgoing grpc sideband connections have stats in the grpc. namespace.
| auto check_metric = [&](auto& metric) { | ||
| // Validate that the `stat_prefix` string doesn't appear in the tag-extracted name, indicating | ||
| // that it wasn't extracted. | ||
| const std::string tag_extracted_name = metric.tagExtractedName(); |
There was a problem hiding this comment.
you didn't want to iterate over the tags instead? Maybe explain why in a code comment?
Yes, I think so. @mattklein123 what do you think of this approach? |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks this is awesome. I left some small comments. Can you also open a tracking beginner issue for fixing the missing extractions? Seems like a great one for new contributors.
/wait
| return makeSslClientConnection({}); | ||
| }; | ||
|
|
||
| // Disable for this test because it messes up the connection IDs. |
There was a problem hiding this comment.
Can you expand? Unclear how. Same elsewhere.
| std::vector<std::string> stat_prefixes; | ||
| Json::ObjectCallback find_stat_prefix = [&](const std::string& name, | ||
| const Json::Object& root) -> bool { | ||
| if (name == "stat_prefix") { |
There was a problem hiding this comment.
Maybe add a comment here that this is based on precedent but isn't guaranteed to be exhaustive? We might need to expand this in the future?
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
Filed #21595 to track the remediation |
…oyproxy#21397) This is a rough heuristic, but may catch many cases of missing rules. Signed-off-by: Greg Greenway <ggreenway@apple.com> Signed-off-by: Tianyu Xia <tyxia@google.com>
…oyproxy#21397) This is a rough heuristic, but may catch many cases of missing rules. Signed-off-by: Greg Greenway <ggreenway@apple.com> Signed-off-by: Amila Senadheera <amila.15@cse.mrt.ac.lk>
|
hey @ggreenway @jmarantz can we have this off by default or maybe only on for one CI? It makes it significantly more complicated to debug integration tests when there's "spurious" e2e requests hitting the config dump URL |
|
I'm not clear on why this would cause debug complexity but running it in only one CI seems OK to me. |
|
as the person who debugs most of our integration test flakes, it means for every test there's a ton of spurious data plane work going on which makes it significantly harder to debug what's actually broken. I've noticed this a number of times helping people who got confused on test debug and was inspired to root case recently. |
|
this makes sense. Thanks Alyssa. |
Signed-off-by: Greg Greenway ggreenway@apple.com
Commit Message: This is a rough heuristic, but may catch many cases of missing rules.
Additional Description:
Risk Level: low: test only
Testing: this is a new test
Docs Changes: not needed
Release Notes: not needed
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]