Skip to content

stats: add tag extraction rules for ADS disconnections#36673

Merged
ggreenway merged 9 commits intoenvoyproxy:mainfrom
grnmeira:ads-tag-extraction-21595-ii
Nov 5, 2024
Merged

stats: add tag extraction rules for ADS disconnections#36673
ggreenway merged 9 commits intoenvoyproxy:mainfrom
grnmeira:ads-tag-extraction-21595-ii

Conversation

@grnmeira
Copy link
Copy Markdown
Member

Commit Message: adds tag extraction rules for envoy.ads_cluster.streams_closed.
Additional Description: as seen on #21595
Risk Level: low
Testing: removes skip_tag_extraction_rule_check_ from ADS integration tests.
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Fixes commit #PR or SHA]
[Optional Deprecated:]
[Optional API Considerations:]

Signed-off-by: Gustavo <grnmeira@gmail.com>
Signed-off-by: Gustavo <grnmeira@gmail.com>
Signed-off-by: Gustavo <grnmeira@gmail.com>
Signed-off-by: Gustavo <grnmeira@gmail.com>
@tyxia
Copy link
Copy Markdown
Member

tyxia commented Oct 18, 2024

Maybe @adisuissa can help to review this PR, as xDS expert.

Added support for P-384 and P-521 curves for TLS server certificates.
- area: stats ads
change: |
Fixed metric for ADS disconnection counters, extracting the tag from ``grpc.ads_cluster.*`` counter stats.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this impacting prometheus in any way? (If so, please reflect this in the comment)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adisuissa it should. I was testing it now, though, and I don't see any GRPC disconnection stats. My dyanamic_resources config looks like this:

dynamic_resources:
  ads_config:
    api_type: GRPC
    grpc_services:
    - envoy_grpc:
        cluster_name: xds_cluster
  cds_config:
    ads: {}
  lds_config:
    ads: {}

Anything I could be missing to enforce those stats (if they're implemented)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was having a look into

void onRemoteClose(Grpc::Status::GrpcStatus status, const std::string& message) override {

It doesn't look like there's an implementation for that stat. Happy to implement it if you think it makes sense.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the counters you are referring to are the grpc ones, see google_grpc, and envoy_grpc (which are the cluster stats).
I'm assuming that the tag extraction should be more generic, and not only for ADS, no?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Though I can't get any stats from google_async_client_impl.cc with my configuration. Actually, I don't see any stats for GRPC at all when running the Envoy server.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adisuissa With the proper configuration now it all made sense to me. I'm quite new to Envoy so I totally missed your tip. I changed the prefix name to "google_grpc_client" so it's generic enough. Let me know what you think. And also, I've added a more detailed description of how Prometheus stats are impacted.

addRe2(PROXY_PROTOCOL_VERSION,
R"(^proxy_proto\.((?:<TAG_VALUE>\.)?versions\.v(<PROXY_PROTOCOL_VERSION>)\.)\w+$)");

// grpc.(<stat_prefix>).**
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's the stat_prefix, but the cluster name, no?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just clarifying this, so that is indeed the stat prefix. This stat is only available through the Google GRPC client, which has mandatory string field for that prefix.

@adisuissa
Copy link
Copy Markdown
Contributor

/wait

Signed-off-by: Gustavo <grnmeira@gmail.com>
…ag-extraction-21595-ii

Signed-off-by: Gustavo <grnmeira@gmail.com>
Signed-off-by: Gustavo <grnmeira@gmail.com>
grnmeira and others added 2 commits October 30, 2024 14:44
Signed-off-by: Gustavo <grnmeira@gmail.com>
Signed-off-by: Gustavo Meira <grnmeira@users.noreply.github.com>
@grnmeira
Copy link
Copy Markdown
Member Author

@adisuissa please let me know if you're happy with my changes. All CI checks passing now.

@KBaichoo
Copy link
Copy Markdown
Contributor

KBaichoo commented Nov 5, 2024

friendly ping @adisuissa . PTAL

Copy link
Copy Markdown
Contributor

@adisuissa adisuissa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!
Assigning Greg for senior-maintainer review (and as an expert on tags).
/assign @ggreenway

@ggreenway ggreenway merged commit 82a4959 into envoyproxy:main Nov 5, 2024
Retr0-XD added a commit to Retr0-XD/envoy that referenced this pull request Mar 21, 2026
Fixes envoyproxy#21595.

Add a regex rule in well_known_names.cc to extract the gRPC client
stat_prefix from stats where the gRPC client is used via a nested
cluster reference (e.g. listener_manager.lds.grpc.<prefix>.streams_closed_N,
sds.client_cert.grpc.<prefix>.streams_closed_N).

The existing grpc.$.** tokenized rule already covers top-level
grpc.<prefix>.* stats (added in envoyproxy#36673). This new rule handles the
embedded case using a regex that anchors on .grpc. and .streams_closed_N.

Remove all 11 skip_tag_extraction_rule_check_ = true TODO blocks from
9 integration test files that were blocked on these missing rules.

AI Disclosure: Used GitHub Copilot for coding assistance.
Retr0-XD added a commit to Retr0-XD/envoy that referenced this pull request Mar 22, 2026
Fixes envoyproxy#21595.

Add a regex rule in well_known_names.cc to extract the gRPC client
stat_prefix from stats where the gRPC client is used via a nested
cluster reference (e.g. listener_manager.lds.grpc.<prefix>.streams_closed_N,
sds.client_cert.grpc.<prefix>.streams_closed_N).

The existing grpc.$.** tokenized rule already covers top-level
grpc.<prefix>.* stats (added in envoyproxy#36673). This new rule handles the
embedded case using a regex that anchors on .grpc. and .streams_closed_N.

Remove all 11 skip_tag_extraction_rule_check_ = true TODO blocks from
9 integration test files that were blocked on these missing rules.

AI Disclosure: Used GitHub Copilot for coding assistance.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants