Skip to content

stats: add integration test to flag missing tag extraction rules#21397

Merged
ggreenway merged 16 commits intoenvoyproxy:mainfrom
ggreenway:test-missing-tag-extraction
Jun 6, 2022
Merged

stats: add integration test to flag missing tag extraction rules#21397
ggreenway merged 16 commits intoenvoyproxy:mainfrom
ggreenway:test-missing-tag-extraction

Conversation

@ggreenway
Copy link
Copy Markdown
Member

@ggreenway ggreenway commented May 20, 2022

Signed-off-by: Greg Greenway ggreenway@apple.com

Commit Message: This is a rough heuristic, but may catch many cases of missing rules.
Additional Description:
Risk Level: low: test only
Testing: this is a new test
Docs Changes: not needed
Release Notes: not needed
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

This is a rough heuristic, but may catch many cases of missing rules.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@repokitteh-read-only
Copy link
Copy Markdown

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #21397 was opened by ggreenway.

see: more, trace.

@ggreenway
Copy link
Copy Markdown
Member Author

@jmarantz after seeing many new stats added without correct tag extraction rules, I wrote this.

Pros:

  • Gives some awareness to developers if they've missed this, for the most obvious cases

Cons:

  • Adds some code to every single integration test. Will this cause a measurable degradation in build/test time?
  • Misses anything that doesn't use stat_prefix in the config
  • Misses anything that isn't configured in an integration test

What do you think? Is there a better way to do this, that is likely to be hit for new code with a developer who doesn't know about the tag extraction rules?

@ggreenway ggreenway changed the title stats: add integration test to flag missing tag extraction rules RFC: stats: add integration test to flag missing tag extraction rules May 20, 2022
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code seems fine ; I wasn't aware of this stat_prefix convention (or had forgotten) and so I think I'm not understanding the premise of what you are checking.

std::vector<std::string> stat_prefixes;
Json::ObjectCallback find_stat_prefix = [&](const std::string& name,
const Json::Object& root) -> bool {
if (name == "stat_prefix") {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does this magic "stat_prefix" come from? Could this be a string-constant?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stat_prefix is the conventional name in the config protos.

$ grep -R stat_prefix api/envoy/*
api/envoy/api/v2/core/grpc_service.proto:    string stat_prefix = 4 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/listener/v3/listener.proto:  // `listener.<stat_prefix>.`.
api/envoy/config/listener/v3/listener.proto:  string stat_prefix = 28;
api/envoy/config/core/v3/grpc_service.proto:    string stat_prefix = 4 [(validate.rules).string = {min_len: 1}];
api/envoy/config/filter/udp/udp_proxy/v2alpha/udp_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/rbac/v2/rbac.proto:  string stat_prefix = 3 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/tcp_proxy/v2/tcp_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/mongo_proxy/v2/mongo_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/zookeeper_proxy/v1alpha1/zookeeper_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/redis_proxy/v2/redis_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/http_connection_manager/v2/http_connection_manager.proto:  string stat_prefix = 2 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/dubbo_proxy/v2alpha1/dubbo_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/ext_authz/v2/ext_authz.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/local_rate_limit/v2alpha/local_rate_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/kafka_broker/v2alpha1/kafka_broker.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/rate_limit/v2/rate_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/client_ssl_auth/v2/client_ssl_auth.proto:  string stat_prefix = 2 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/mysql_proxy/v1alpha1/mysql_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/config/filter/network/thrift_proxy/v2alpha1/thrift_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_bytes: 1}];
api/envoy/extensions/matching/input_matchers/ip/v3/ip.proto:  string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/udp/dns_filter/v3/dns_filter.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/udp/udp_proxy/v3/udp_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/rbac/v3/rbac.proto:  string shadow_rules_stat_prefix = 5;
api/envoy/extensions/filters/network/rbac/v3/rbac.proto:  string stat_prefix = 3 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/local_ratelimit/v3/local_rate_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/connection_limit/v3/connection_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/tcp_proxy/v3/tcp_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/mongo_proxy/v3/mongo_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/zookeeper_proxy/v3/zookeeper_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/redis_proxy/v3/redis_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto:  string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/dubbo_proxy/v3/dubbo_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/ext_authz/v3/ext_authz.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/ratelimit/v3/rate_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/client_ssl_auth/v3/client_ssl_auth.proto:  string stat_prefix = 2 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/meta_protocol_proxy/v3/meta_protocol_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/network/thrift_proxy/v3/thrift_proxy.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/rbac/v3/rbac.proto:  string shadow_rules_stat_prefix = 3;
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/compressor/v3/compressor.proto:  //    `<stat_prefix>.compressor.<compressor_library.name>.<compressor_library_stat_prefix>.response.*`
api/envoy/extensions/filters/http/compressor/v3/compressor.proto:  //    `<stat_prefix>.compressor.<compressor_library.name>.<compressor_library_stat_prefix>.*`.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto:  //         stat_prefix: waf # This emits ext_authz.waf.ok, ext_authz.waf.denied, etc.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto:  //         stat_prefix: blocker # This emits ext_authz.blocker.ok, ext_authz.blocker.denied, etc.
api/envoy/extensions/filters/http/ext_authz/v3/ext_authz.proto:  string stat_prefix = 13;
api/envoy/extensions/filters/http/bandwidth_limit/v3/bandwidth_limit.proto:  string stat_prefix = 1 [(validate.rules).string = {min_len: 1}];
api/envoy/extensions/filters/http/ext_proc/v3/ext_proc.proto:  string stat_prefix = 8;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment here that this is based on precedent but isn't guaranteed to be exhaustive? We might need to expand this in the future?

}
}
};
check_metric_array(test_server_->counters());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; could use test_server_->forEachCounter here and void making the intermediate array. Not sure if that would be material.

for (const auto& metric : vec) {
const std::string tag_extracted_name = metric->tagExtractedName();
for (const std::string& stat_prefix : stat_prefixes) {
EXPECT_EQ(tag_extracted_name.find(stat_prefix), std::string::npos)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how just doing a total substring search for the prefixes in the tag-extracted name tells you whether the tag-extraction regexes are complete.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that if the stat_prefix is in the tag-extracted name, then it wasn't extracted, which I believe is always an error (although there are many such cases in the codebase right now).

Or at least that's how I use stat_prefix. Maybe others use it completely differently, and this isn't correct. I'd like to hear opinions on this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this assumes that the configured stat_prefix in the integration tests is a value that is unique enough that it isn't a substring of any other part of the stat name.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another approach to this test, that may be more robust, is to instead look at only the extracted tags, and verify that the value configured for stat_prefix is in one of the extracted tags, and ignore the tagExtractedName. That may more clearly express what I'm trying to accomplish.

ggreenway added 3 commits May 24, 2022 11:55
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
@ggreenway ggreenway changed the title RFC: stats: add integration test to flag missing tag extraction rules stats: add integration test to flag missing tag extraction rules May 26, 2022
@ggreenway ggreenway marked this pull request as ready for review May 26, 2022 17:36
ggreenway added 2 commits May 26, 2022 15:29
Signed-off-by: Greg Greenway <ggreenway@apple.com>
@ggreenway ggreenway requested a review from lizan as a code owner May 27, 2022 00:14
@jmarantz
Copy link
Copy Markdown
Contributor

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #21397 (comment) was created by @jmarantz.

see: more, trace.

@jmarantz
Copy link
Copy Markdown
Contributor

Greg should I review now or wait till CI is clean?

@jmarantz
Copy link
Copy Markdown
Contributor

/wait

ggreenway added 5 commits May 31, 2022 11:48
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
jmarantz
jmarantz previously approved these changes Jun 1, 2022
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm basically modulo minor nits.

Does it make sense to just get one other maintainer's opinion on this plan generally? I'll leave it up to you on all counts.

AccessLogIntegrationTest() : HttpIntegrationTest(Http::CodecType::HTTP1, ipVersion()) {}
AccessLogIntegrationTest() : HttpIntegrationTest(Http::CodecType::HTTP1, ipVersion()) {
// TODO(ggreenway): add tag extraction rules.
// Missing stat tag-extraction rule for stat 'grpc.accesslog.streams_closed_1' and stat_prefix
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is that the right stat-name or is this copy/pasted from another file?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is correct. A bunch of tests with outgoing grpc sideband connections have stats in the grpc. namespace.

auto check_metric = [&](auto& metric) {
// Validate that the `stat_prefix` string doesn't appear in the tag-extracted name, indicating
// that it wasn't extracted.
const std::string tag_extracted_name = metric.tagExtractedName();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you didn't want to iterate over the tags instead? Maybe explain why in a code comment?

Signed-off-by: Greg Greenway <ggreenway@apple.com>
@ggreenway
Copy link
Copy Markdown
Member Author

Does it make sense to just get one other maintainer's opinion on this plan generally? I'll leave it up to you on all counts.

Yes, I think so. @mattklein123 what do you think of this approach?

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is awesome. I left some small comments. Can you also open a tracking beginner issue for fixing the missing extractions? Seems like a great one for new contributors.

/wait

return makeSslClientConnection({});
};

// Disable for this test because it messes up the connection IDs.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand? Unclear how. Same elsewhere.

std::vector<std::string> stat_prefixes;
Json::ObjectCallback find_stat_prefix = [&](const std::string& name,
const Json::Object& root) -> bool {
if (name == "stat_prefix") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment here that this is based on precedent but isn't guaranteed to be exhaustive? We might need to expand this in the future?

ggreenway added 4 commits June 6, 2022 09:58
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
@ggreenway
Copy link
Copy Markdown
Member Author

Filed #21595 to track the remediation

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@ggreenway ggreenway merged commit ccf1f9c into envoyproxy:main Jun 6, 2022
tyxia pushed a commit to tyxia/envoy that referenced this pull request Jun 14, 2022
…oyproxy#21397)

This is a rough heuristic, but may catch many cases of missing rules.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Tianyu Xia <tyxia@google.com>
Amila-Rukshan pushed a commit to Amila-Rukshan/envoy that referenced this pull request Jun 28, 2022
…oyproxy#21397)

This is a rough heuristic, but may catch many cases of missing rules.

Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Amila Senadheera <amila.15@cse.mrt.ac.lk>
@alyssawilk
Copy link
Copy Markdown
Contributor

alyssawilk commented Nov 3, 2022

hey @ggreenway @jmarantz can we have this off by default or maybe only on for one CI? It makes it significantly more complicated to debug integration tests when there's "spurious" e2e requests hitting the config dump URL

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Nov 4, 2022

I'm not clear on why this would cause debug complexity but running it in only one CI seems OK to me.

@alyssawilk
Copy link
Copy Markdown
Contributor

as the person who debugs most of our integration test flakes, it means for every test there's a ton of spurious data plane work going on which makes it significantly harder to debug what's actually broken. I've noticed this a number of times helping people who got confused on test debug and was inspired to root case recently.

@jmarantz
Copy link
Copy Markdown
Contributor

jmarantz commented Nov 7, 2022

this makes sense. Thanks Alyssa.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants