Skip to content

Add configurable stats tags#1852

Merged
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
mrice32:stats_tags_separated
Oct 26, 2017
Merged

Add configurable stats tags#1852
mattklein123 merged 11 commits intoenvoyproxy:masterfrom
mrice32:stats_tags_separated

Conversation

@mrice32
Copy link
Copy Markdown
Member

@mrice32 mrice32 commented Oct 13, 2017

This is the second stage of #1751.

The concept of a TagExtractor is added. TagExtractors are used to remove part of the name of a stat on creation and set it as a named tag rather than the name. Regexes are used to extract the tags iteratively. There are a set of default tags included, but users can also pass in lists of their own name-regex combinations in the config (this portion is documented here).

In response to comments on #1751, the default regexes (if included wholesale using the config bool) are processed in a defined ordering from least specific (earlier in the string) to most specific (later in the string).

Fixes #1536.

Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid, some comments and nits.

* represented in the name, the tags vector will remain unmodified and the return value will be a
* copy of name. The portion removed from the name may be different than the value put into
* the tag vector for readability purposes. Note: The extraction process is expected to be run
* iteratively. For a list of TagExtractors, the original name is expected to be passed into
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/fist/first/

* copy of name. The portion removed from the name may be different than the value put into
* the tag vector for readability purposes. Note: The extraction process is expected to be run
* iteratively. For a list of TagExtractors, the original name is expected to be passed into
* extractTag for the fist, and then each iteration after the modified name returned from the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence just trails off..

*/
virtual void addSink(Sink& sink) PURE;

/**
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A single extractor or a vector of extractors?

/**
* Creates the set of stats tag extractors requested by the config and transfers ownership to the
* caller.
* @param bootstrap bootstrap proto
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: prefer sentences to end with .

* Creates the set of stats tag extractors requested by the config and transfers ownership to the
* caller.
* @param bootstrap bootstrap proto
* @return Owning vector of TagExtractors
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: prefer @return std::vector<Stats::TagExtractorPtr> tag extractor vector. or something to that effect, Envoy style guide has return type after @return.

Copy link
Copy Markdown
Member Author

@mrice32 mrice32 Oct 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I wasn't aware of this. Thanks for the heads up. Two questions:

  1. This is pretty inconsistent in Envoy now. Should we clean this up?
  2. We don't actually render the doxygen, do we? If we do, IIRC, the return type is provided there automatically. If we don't, we're just typing out the type that could be read a few lines below. Is there an advantage to that? And why don't we do this for @ params? I'm not necessarily suggesting we do it a different way, just curious about the reasoning as it's not immediately obvious to me.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I think this was probably a habit picked up on some project I worked on long ago which I no longer remember. It may very well make no sense. I'm not sure how easy it would be to clean up. In a perfect world we would actually run doxygen if even for the only reason to check that comments are compliant and fail if they are not. (there is an issue on this)

Copy link
Copy Markdown
Member Author

@mrice32 mrice32 Oct 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. I won't worry about it for now. I was just curious. Side note: publishing the generated doxygen would be pretty nice.

const std::string& regex) {

if (name.empty()) {
throw EnvoyException("tag_name cannot be empty");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have coverage for this?

tag.name_ = name_;
tag.value_ = value_subexpr.str();

// This call invalidates match and all derived objects because they contain references to
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure which call in the below expression this comment is referring to.

std::string TagExtractorImpl::extractTag(const std::string& tag_extracted_name,
std::vector<Tag>& tags) const {
std::smatch match;
// The regex must match and contain one or more subexpressions (all after the first are ignored).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe explain the remove_subexpr vs. value_expr distinction as well (i.e. what you capture vs. what you want to extract value wise).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Added this.

* Creates a tag extractor from the regex provided or looks up a default regex.
* @param name name for tag extractor. Used to look up a default tag extractor if regex is empty.
* @param regex optional regex expression. Can be specified as an empty string to trigger a
* default regex lookup.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@return?

server_stats_.reset(
new ServerStats{ALL_SERVER_STATS(POOL_GAUGE_PREFIX(stats_store_, "server."))});

failHealthcheck(false);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this moved?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method calls server_stats_->live_.set(!fail);, so server_stats_ needs to be initialized before it's called.

Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after comments.

// *_rq(_<response_code_class>xx)
name_regex_pairs.push_back({RESPONSE_CODE_CLASS, "_rq(_(\\dxx))$"});

// http.<stat_prefix>.dynamodb.table.<table_name>.capacity.<operation_name>.__partition_id=<last_seven_characters_from_partition_id>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be [<stat_prefix>.]? Otherwise, how come the lookahead?

// http.[<stat_prefix>.]dynamodb.operation.(<operation_name>.)<base_stat> or
// http.[<stat_prefix>.]dynamodb.table.[<table_name>.]capacity.(<operation_name>.)[<partition_id>]
name_regex_pairs.push_back({DYNAMO_OPERATION, "^http(?=\\.).*?\\.dynamodb.(?:operation|table(?="
"\\.).*?\\.capacity)(\\.(.*?))(?:\\.|$)"});
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be clearer as two regexes..

Copy link
Copy Markdown
Member Author

@mrice32 mrice32 Oct 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it works now, we use names as unique identifiers (and we throw if the user attempts to use duplicate names), so if a tag shows up in two distinct places a single regex must capture both. This is how we ensure a default regex isn't used twice (one custom and one with the with the use_all_defaults flag for example). One alternative would be to check uniqueness of the regex strings themselves during creation instead of names. That would allow us to have duplicate names in the list of tag extractors. It wouldn't be a big change; do you think that would be preferred? My only hesitation is that it seems a little less intuitive for the user than requiring uniqueness of names.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, it's good as is.

name_regex_pairs.push_back({RATELIMIT_PREFIX, "^ratelimit\\.((.*?)\\.)\\w+?$"});

// cluster.(<cluster_name>.)*
name_regex_pairs.push_back({CLUSTER_NAME, "^cluster\\.((.*?)\\.)"});
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QQ: will we pick up a some reserved words in stats names as cluster names here?

Copy link
Copy Markdown
Member Author

@mrice32 mrice32 Oct 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I'm understanding your question correctly, but I don't think so. This regex should only be run once, so when it is run, the only string that can be following "cluster." is the actual name of the cluster. As long as no other regex removes the cluster name before this one, it should work as intended. Let me know if I'm missing something.

regex_tester.testRegex("http.fault_connection_manager.fault.fault_cluster.aborts_injected",
"http.fault.aborts_injected",
{fault_connection_manager, fault_downstream_cluster});
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative testing strategy might be to run the tag extraction after an integration test, dump the outputs, manually verify create a "golden" state and then compare. I don't think you need to do this, there are already a bunch of reasonable tests, but this kind of testing would sanity check what happens when we run on real-ish data.

Copy link
Copy Markdown
Member Author

@mrice32 mrice32 Oct 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I tried this out and found a problem with my expectations (based on the docs) - "listener." is followed by <ip>_<port> rather than just port, so it wasn't being captured. Not sure the best way to do this though... read in a text file? Do we have other tests that work this way? I want to make it as low-maintenance as possible, meaning I would rather people not have to re-generate the output every time they add a stat. That sort of setup would make it easy for people to accidentally ignore an issue they created because it becomes the normal process to modify this test each time anyone touches stats.

One possible solution would be to have the "golden state" be a set of stats (and their processed forms) that are expected to exist. Stats added in the future will be ignored. This would eliminate what I would assume would be the most common stats modification - addition.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't really have tests that do this today. I think maybe just adding to one of the integration tests a validation of the full extraction would be useful and give the same benefit as well.

Signed-off-by: Matt Rice <mattrice@google.com>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool stuff. LGTM. Few random comments.


std::vector<Stats::TagExtractorPtr>
Utility::createTagExtractors(const envoy::api::v2::Bootstrap& bootstrap) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: del newline

};

// Add defaults.
if (!bootstrap.stats_config().has_use_all_default_tags() ||
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@htuch random unrelated thought: With the proto validator code, could we potentially auto add defaults for things like this if not specified? Might make things easier to read in general.

std::vector<Tag>& tags) const override;

private:
std::string name_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these can be const I think.


private:
const std::string name_;
const std::string tag_extracted_name_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we make all of these const?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

? These members are already const. Was this meant to point at something else?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops sorry nevermind mistake.


failHealthcheck(false);

uint64_t version_int;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: instead of passing version_int, can you just do this where we now set the stat?

Signed-off-by: Matt Rice <mattrice@google.com>
mattklein123
mattklein123 previously approved these changes Oct 23, 2017
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending any final comments from @htuch

…es with regexes.

Signed-off-by: Matt Rice <mattrice@google.com>
Signed-off-by: Matt Rice <mattrice@google.com>
@mrice32
Copy link
Copy Markdown
Member Author

mrice32 commented Oct 25, 2017

@htuch @mattklein123 fixed a few faulty assumptions revealed by dumping stats from an integration test and added a "golden" state integration test to ensure these actual envoy stats (as opposed to hardcoded strings pulled from the docs in the unit tests) don't break in the future. PTAL

// Note: the test below is meant to check that default tags are being extracted correctly with
// real-ish input stats. If new stats are added, this test will not break because names that do
// not exist in the map are not checked. However, if stats are modified the below maps should be
// updated (or regenerated by printing in map literal format).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a bit more color on how you generated this list? I'm not sure exactly what "regenerated by printing in map literal format" means. Mainly concerned that we might have to regenerate and it will be confusing.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can provide more details. I could also just toss the generation code in there and leave it commented out if you think that would decrease the friction?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, generally not a fan of commented out code but that seems fine here.

Signed-off-by: Matt Rice <mattrice@google.com>
Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM from my side, I'll leave it to @mattklein123 for final approval/merge.

// std::cout << std::endl << "{\"" << (*it)->name() << "\", \"" << (*it)->tagExtractedName() <<
// "\"}";
// }
// std::cout << "};" << std::endl;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine with me if OK with @mattklein123. Might also be a valid thing to do to factor this code into a function that is used to both generate the golden file and the current state for comparison, but I think you've got most of the win here by just having some real state being validated against your regexes.

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Nice work!

@mattklein123 mattklein123 merged commit 005aee4 into envoyproxy:master Oct 26, 2017
@mrice32 mrice32 deleted the stats_tags_separated branch October 26, 2017 15:07
rshriram pushed a commit to rshriram/envoy that referenced this pull request Oct 30, 2018
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
rshriram pushed a commit to rshriram/envoy that referenced this pull request Oct 30, 2018
jpsim pushed a commit that referenced this pull request Nov 28, 2022
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: JP Simard <jp@jpsim.com>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Signed-off-by: Mike Schore <mike.schore@gmail.com>
Signed-off-by: JP Simard <jp@jpsim.com>
mathetake pushed a commit that referenced this pull request Mar 3, 2026
**Description**

Restores cache_creation_input_tokens field in OpenAI-format streaming
responses for GCP Anthropic models. This field was accidentally omitted
during a recent refactoring, causing clients to be unable to track cache
write operations.

Problem: The cache_creation_input_tokens field was missing from
streaming responses when using GCP Anthropic with prompt caching.

Internal Integration test failure:
  test_gcp_anthropic_cache_hit_stream[gcp.claude-3-7-sonnet]
AssertionError: First stream (MISS) should have cache_creation > 0, but
got 0

**Root Cause**

Commit 4347d17 (Feb 7, 2026) refactored streaming logic from
openai_gcpanthropic_stream.go into anthropic_helper.go. During this
refactoring, the SetCacheCreationInputTokens() calls were accidentally
omitted from two event handlers:

1. message_start handler - missing initial cache creation token storage
2. message_delta handler - missing incremental cache creation token
accumulation

The code correctly extracted the value from Anthropic's API but never
stored it in the response.

**Changes**

  internal/translator/anthropic_helper.go:
  - Added SetCacheCreationInputTokens() call in message_start handler
  - Added AddCacheCreationInputTokens() call in message_delta handler

  internal/translator/openai_gcpanthropic_test.go:
- Updated existing test cases to verify cache_creation_input_tokens
appears in responses
- Added field to "basic text response" and "response with tool use" test
cases

---------

Signed-off-by: Alexa Griffith <agriffith50@bloomberg.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants