Add configurable stats tags#1852
Conversation
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
htuch
left a comment
There was a problem hiding this comment.
Looks solid, some comments and nits.
| * represented in the name, the tags vector will remain unmodified and the return value will be a | ||
| * copy of name. The portion removed from the name may be different than the value put into | ||
| * the tag vector for readability purposes. Note: The extraction process is expected to be run | ||
| * iteratively. For a list of TagExtractors, the original name is expected to be passed into |
include/envoy/stats/stats.h
Outdated
| * copy of name. The portion removed from the name may be different than the value put into | ||
| * the tag vector for readability purposes. Note: The extraction process is expected to be run | ||
| * iteratively. For a list of TagExtractors, the original name is expected to be passed into | ||
| * extractTag for the fist, and then each iteration after the modified name returned from the |
| */ | ||
| virtual void addSink(Sink& sink) PURE; | ||
|
|
||
| /** |
There was a problem hiding this comment.
A single extractor or a vector of extractors?
source/common/config/utility.h
Outdated
| /** | ||
| * Creates the set of stats tag extractors requested by the config and transfers ownership to the | ||
| * caller. | ||
| * @param bootstrap bootstrap proto |
There was a problem hiding this comment.
Nit: prefer sentences to end with .
source/common/config/utility.h
Outdated
| * Creates the set of stats tag extractors requested by the config and transfers ownership to the | ||
| * caller. | ||
| * @param bootstrap bootstrap proto | ||
| * @return Owning vector of TagExtractors |
There was a problem hiding this comment.
Nit: prefer @return std::vector<Stats::TagExtractorPtr> tag extractor vector. or something to that effect, Envoy style guide has return type after @return.
There was a problem hiding this comment.
Oh, I wasn't aware of this. Thanks for the heads up. Two questions:
- This is pretty inconsistent in Envoy now. Should we clean this up?
- We don't actually render the doxygen, do we? If we do, IIRC, the return type is provided there automatically. If we don't, we're just typing out the type that could be read a few lines below. Is there an advantage to that? And why don't we do this for @ params? I'm not necessarily suggesting we do it a different way, just curious about the reasoning as it's not immediately obvious to me.
There was a problem hiding this comment.
TBH I think this was probably a habit picked up on some project I worked on long ago which I no longer remember. It may very well make no sense. I'm not sure how easy it would be to clean up. In a perfect world we would actually run doxygen if even for the only reason to check that comments are compliant and fail if they are not. (there is an issue on this)
There was a problem hiding this comment.
Gotcha. I won't worry about it for now. I was just curious. Side note: publishing the generated doxygen would be pretty nice.
| const std::string& regex) { | ||
|
|
||
| if (name.empty()) { | ||
| throw EnvoyException("tag_name cannot be empty"); |
source/common/stats/stats_impl.cc
Outdated
| tag.name_ = name_; | ||
| tag.value_ = value_subexpr.str(); | ||
|
|
||
| // This call invalidates match and all derived objects because they contain references to |
There was a problem hiding this comment.
Not sure which call in the below expression this comment is referring to.
| std::string TagExtractorImpl::extractTag(const std::string& tag_extracted_name, | ||
| std::vector<Tag>& tags) const { | ||
| std::smatch match; | ||
| // The regex must match and contain one or more subexpressions (all after the first are ignored). |
There was a problem hiding this comment.
Maybe explain the remove_subexpr vs. value_expr distinction as well (i.e. what you capture vs. what you want to extract value wise).
| * Creates a tag extractor from the regex provided or looks up a default regex. | ||
| * @param name name for tag extractor. Used to look up a default tag extractor if regex is empty. | ||
| * @param regex optional regex expression. Can be specified as an empty string to trigger a | ||
| * default regex lookup. |
| server_stats_.reset( | ||
| new ServerStats{ALL_SERVER_STATS(POOL_GAUGE_PREFIX(stats_store_, "server."))}); | ||
|
|
||
| failHealthcheck(false); |
There was a problem hiding this comment.
This method calls server_stats_->live_.set(!fail);, so server_stats_ needs to be initialized before it's called.
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
| // *_rq(_<response_code_class>xx) | ||
| name_regex_pairs.push_back({RESPONSE_CODE_CLASS, "_rq(_(\\dxx))$"}); | ||
|
|
||
| // http.<stat_prefix>.dynamodb.table.<table_name>.capacity.<operation_name>.__partition_id=<last_seven_characters_from_partition_id> |
There was a problem hiding this comment.
Should this be [<stat_prefix>.]? Otherwise, how come the lookahead?
| // http.[<stat_prefix>.]dynamodb.operation.(<operation_name>.)<base_stat> or | ||
| // http.[<stat_prefix>.]dynamodb.table.[<table_name>.]capacity.(<operation_name>.)[<partition_id>] | ||
| name_regex_pairs.push_back({DYNAMO_OPERATION, "^http(?=\\.).*?\\.dynamodb.(?:operation|table(?=" | ||
| "\\.).*?\\.capacity)(\\.(.*?))(?:\\.|$)"}); |
There was a problem hiding this comment.
Might be clearer as two regexes..
There was a problem hiding this comment.
As it works now, we use names as unique identifiers (and we throw if the user attempts to use duplicate names), so if a tag shows up in two distinct places a single regex must capture both. This is how we ensure a default regex isn't used twice (one custom and one with the with the use_all_defaults flag for example). One alternative would be to check uniqueness of the regex strings themselves during creation instead of names. That would allow us to have duplicate names in the list of tag extractors. It wouldn't be a big change; do you think that would be preferred? My only hesitation is that it seems a little less intuitive for the user than requiring uniqueness of names.
| name_regex_pairs.push_back({RATELIMIT_PREFIX, "^ratelimit\\.((.*?)\\.)\\w+?$"}); | ||
|
|
||
| // cluster.(<cluster_name>.)* | ||
| name_regex_pairs.push_back({CLUSTER_NAME, "^cluster\\.((.*?)\\.)"}); |
There was a problem hiding this comment.
QQ: will we pick up a some reserved words in stats names as cluster names here?
There was a problem hiding this comment.
I'm not sure if I'm understanding your question correctly, but I don't think so. This regex should only be run once, so when it is run, the only string that can be following "cluster." is the actual name of the cluster. As long as no other regex removes the cluster name before this one, it should work as intended. Let me know if I'm missing something.
| regex_tester.testRegex("http.fault_connection_manager.fault.fault_cluster.aborts_injected", | ||
| "http.fault.aborts_injected", | ||
| {fault_connection_manager, fault_downstream_cluster}); | ||
| } |
There was a problem hiding this comment.
An alternative testing strategy might be to run the tag extraction after an integration test, dump the outputs, manually verify create a "golden" state and then compare. I don't think you need to do this, there are already a bunch of reasonable tests, but this kind of testing would sanity check what happens when we run on real-ish data.
There was a problem hiding this comment.
Good point. I tried this out and found a problem with my expectations (based on the docs) - "listener." is followed by <ip>_<port> rather than just port, so it wasn't being captured. Not sure the best way to do this though... read in a text file? Do we have other tests that work this way? I want to make it as low-maintenance as possible, meaning I would rather people not have to re-generate the output every time they add a stat. That sort of setup would make it easy for people to accidentally ignore an issue they created because it becomes the normal process to modify this test each time anyone touches stats.
One possible solution would be to have the "golden state" be a set of stats (and their processed forms) that are expected to exist. Stats added in the future will be ignored. This would eliminate what I would assume would be the most common stats modification - addition.
There was a problem hiding this comment.
We don't really have tests that do this today. I think maybe just adding to one of the integration tests a validation of the full extraction would be useful and give the same benefit as well.
Signed-off-by: Matt Rice <mattrice@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
Cool stuff. LGTM. Few random comments.
source/common/config/utility.cc
Outdated
|
|
||
| std::vector<Stats::TagExtractorPtr> | ||
| Utility::createTagExtractors(const envoy::api::v2::Bootstrap& bootstrap) { | ||
|
|
| }; | ||
|
|
||
| // Add defaults. | ||
| if (!bootstrap.stats_config().has_use_all_default_tags() || |
There was a problem hiding this comment.
@htuch random unrelated thought: With the proto validator code, could we potentially auto add defaults for things like this if not specified? Might make things easier to read in general.
source/common/stats/stats_impl.h
Outdated
| std::vector<Tag>& tags) const override; | ||
|
|
||
| private: | ||
| std::string name_; |
There was a problem hiding this comment.
nit: these can be const I think.
|
|
||
| private: | ||
| const std::string name_; | ||
| const std::string tag_extracted_name_; |
There was a problem hiding this comment.
nit: can we make all of these const?
There was a problem hiding this comment.
? These members are already const. Was this meant to point at something else?
There was a problem hiding this comment.
Oops sorry nevermind mistake.
source/server/server.cc
Outdated
|
|
||
| failHealthcheck(false); | ||
|
|
||
| uint64_t version_int; |
There was a problem hiding this comment.
nit: instead of passing version_int, can you just do this where we now set the stat?
Signed-off-by: Matt Rice <mattrice@google.com>
mattklein123
left a comment
There was a problem hiding this comment.
LGTM pending any final comments from @htuch
…es with regexes. Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
|
@htuch @mattklein123 fixed a few faulty assumptions revealed by dumping stats from an integration test and added a "golden" state integration test to ensure these actual envoy stats (as opposed to hardcoded strings pulled from the docs in the unit tests) don't break in the future. PTAL |
| // Note: the test below is meant to check that default tags are being extracted correctly with | ||
| // real-ish input stats. If new stats are added, this test will not break because names that do | ||
| // not exist in the map are not checked. However, if stats are modified the below maps should be | ||
| // updated (or regenerated by printing in map literal format). |
There was a problem hiding this comment.
Can you add a bit more color on how you generated this list? I'm not sure exactly what "regenerated by printing in map literal format" means. Mainly concerned that we might have to regenerate and it will be confusing.
There was a problem hiding this comment.
Sure, I can provide more details. I could also just toss the generation code in there and leave it commented out if you think that would decrease the friction?
There was a problem hiding this comment.
+1, generally not a fan of commented out code but that seems fine here.
Signed-off-by: Matt Rice <mattrice@google.com>
htuch
left a comment
There was a problem hiding this comment.
LGTM from my side, I'll leave it to @mattklein123 for final approval/merge.
| // std::cout << std::endl << "{\"" << (*it)->name() << "\", \"" << (*it)->tagExtractedName() << | ||
| // "\"}"; | ||
| // } | ||
| // std::cout << "};" << std::endl; |
There was a problem hiding this comment.
This is fine with me if OK with @mattklein123. Might also be a valid thing to do to factor this code into a function that is used to both generate the golden file and the current state for comparison, but I think you've got most of the win here by just having some real state being validated against your regexes.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
This reverts commit f85f49c.
Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
Signed-off-by: Mike Schore <mike.schore@gmail.com> Signed-off-by: JP Simard <jp@jpsim.com>
**Description** Restores cache_creation_input_tokens field in OpenAI-format streaming responses for GCP Anthropic models. This field was accidentally omitted during a recent refactoring, causing clients to be unable to track cache write operations. Problem: The cache_creation_input_tokens field was missing from streaming responses when using GCP Anthropic with prompt caching. Internal Integration test failure: test_gcp_anthropic_cache_hit_stream[gcp.claude-3-7-sonnet] AssertionError: First stream (MISS) should have cache_creation > 0, but got 0 **Root Cause** Commit 4347d17 (Feb 7, 2026) refactored streaming logic from openai_gcpanthropic_stream.go into anthropic_helper.go. During this refactoring, the SetCacheCreationInputTokens() calls were accidentally omitted from two event handlers: 1. message_start handler - missing initial cache creation token storage 2. message_delta handler - missing incremental cache creation token accumulation The code correctly extracted the value from Anthropic's API but never stored it in the response. **Changes** internal/translator/anthropic_helper.go: - Added SetCacheCreationInputTokens() call in message_start handler - Added AddCacheCreationInputTokens() call in message_delta handler internal/translator/openai_gcpanthropic_test.go: - Updated existing test cases to verify cache_creation_input_tokens appears in responses - Added field to "basic text response" and "response with tool use" test cases --------- Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
This is the second stage of #1751.
The concept of a
TagExtractoris added.TagExtractors are used to remove part of the name of a stat on creation and set it as a named tag rather than the name. Regexes are used to extract the tags iteratively. There are a set of default tags included, but users can also pass in lists of their own name-regex combinations in the config (this portion is documented here).In response to comments on #1751, the default regexes (if included wholesale using the config
bool) are processed in a defined ordering from least specific (earlier in the string) to most specific (later in the string).Fixes #1536.