Skip to content

metric service: add support for sending tags as labels#16125

Merged
htuch merged 4 commits intoenvoyproxy:mainfrom
snowp:metrics-service-tag-support
Apr 26, 2021
Merged

metric service: add support for sending tags as labels#16125
htuch merged 4 commits intoenvoyproxy:mainfrom
snowp:metrics-service-tag-support

Conversation

@snowp
Copy link
Copy Markdown
Contributor

@snowp snowp commented Apr 22, 2021

Adds a new configuration flag that makes the metrics service use
Labels to express tags instead of sending the full stats name
that might include embedded tag keys/values.

When configured, tags will be sent as labels while the reported
name will be the tag extracted name instead of the full name.

Signed-off-by: Snow Pettersen snowp@lyft.com

Risk Level: Low, new configuration flag
Testing: UTs
Docs Changes: Inline proto docs
Release Notes: Added

Adds a new configuration flag that makes the metrics service use
Labels to express tags instead of sending the full stats name
that might include embedded tag keys/values.

When configured, tags will be sent as labels while the reported
name will be the tag extracted name instead of the full name.

Signed-off-by: Snow Pettersen <snowp@lyft.com>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #16125 was opened by snowp.

see: more, trace.

Signed-off-by: Snow Pettersen <snowp@lyft.com>
@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Apr 23, 2021

/retest

@repokitteh-read-only
Copy link
Copy Markdown

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #16125 (comment) was created by @snowp.

see: more, trace.

Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great generally with the perf concern noted.

// sink will take updates from the :ref:`MetricsResponse <envoy_api_msg_service.metrics.v3.StreamMetricsResponse>`.
google.protobuf.BoolValue report_counters_as_deltas = 2;

// If true, metrics will have its tags emitted as labels on the metrics objects sent to the MetricsService,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/its/their/ ?


if (emit_labels_) {
metrics_family.set_name(metric.tagExtractedName());
for (const auto& tag : metric.tags()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm. I'm wondering if this is going to be a perf issue as constructing the tag array is going to take symbol table locks and do some string-building, with those strings then being copied again below.

It might be slightly better to use metric::iterateTagStatNames but we'd still need to convert to strings here. Maybe this transformation should be cached somehow in the MetricsFlusher object?

Maybe leave a note here to take a look at the perf implications of this later? The reason I'm concerned about this is that it looks like this is called on every flush, each with its own value, but it's re-creating a bunch of metadata for each stat that doesn't change. It'd be nice to factor that repeated computation out.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah let me leave a note here to look into the perf of this.

I think there are two options that could be explored: 1) cache just the symbol -> string conversion just to avoid the locks on the symbol table or 2) cache the entire prom Metric per stat. I'm not sure how much 2) gains us over 1) since we'll need to copy the proto completely anyways, so it might just result in a lot of wasted space keeping all these protos stored in memory? At least with 1) we get the benefit of space savings if a symbol is reused between stats. In any case I will leave this as a TODO for now

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. One minor concern to note also is that if we add a cache, it could retain storage for stats that belong to deleted scopes, thus leaking. We can fix all that but it's definitely a separate PR.

EXPECT_EQ(1, (*metrics)[2].metric(0).label().size());

EXPECT_EQ("tag-histogram-name", (*metrics)[3].name());
EXPECT_EQ(1, (*metrics)[3].metric(0).label().size());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check the labels are the ones we expect also?

Snow Pettersen added 2 commits April 23, 2021 20:38
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Copy link
Copy Markdown
Contributor

@jmarantz jmarantz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@snowp
Copy link
Copy Markdown
Contributor Author

snowp commented Apr 26, 2021

@envoyproxy/api-shepherds giving this a look? Should be a simple API change

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

@htuch htuch merged commit 286ff81 into envoyproxy:main Apr 26, 2021
gokulnair pushed a commit to gokulnair/envoy that referenced this pull request May 6, 2021
)

Adds a new configuration flag that makes the metrics service use
Labels to express tags instead of sending the full stats name
that might include embedded tag keys/values.

When configured, tags will be sent as labels while the reported
name will be the tag extracted name instead of the full name.

Risk Level: Low, new configuration flag
Testing: UTs
Docs Changes: Inline proto docs
Release Notes: Added

Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Gokul Nair <gnair@twitter.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants