feat(metrics): Count transactions toward root project [TET-627]#1734
feat(metrics): Count transactions toward root project [TET-627]#1734
Conversation
| #[serde(flatten)] | ||
| value: BucketValue, | ||
| timestamp: UnixTimestamp, | ||
| #[serde(skip_serializing_if = "BTreeMap::is_empty")] |
There was a problem hiding this comment.
i am reviewing this change only. i think we can revisit the schema we actually want later, but as pointed out in private conversations, the metrics indexer currently does not handle the absence of tags correctly.
jjbayer
left a comment
There was a problem hiding this comment.
The PR looks good to me. Let's make sure Search and Storage is aware that we are adding a counter metric before we deploy this.
c932995 to
c727b43
Compare
63432ac to
0849c6e
Compare
0849c6e to
ddcb598
Compare
relay-server/src/actors/processor.rs
Outdated
| // requires recomputation of the context. | ||
| state.envelope_context.update(&state.envelope); | ||
|
|
||
| let has_metrics = state.extracted_metrics.project_metrics.is_empty(); |
There was a problem hiding this comment.
| let has_metrics = state.extracted_metrics.project_metrics.is_empty(); | |
| let has_metrics = !state.extracted_metrics.project_metrics.is_empty(); |
There was a problem hiding this comment.
I am surprised that no test case caught this.
|
|
||
| assert metrics.keys() == { | ||
| "d:transactions/duration@millisecond", | ||
| "c:transactions/count_per_root_project@none", |
There was a problem hiding this comment.
As far as I can tell, these test cases all add this metric to current project, because there is no DSC involved. Please also add tests for the following:
- a test with a DSC where the DSC's project ID is different from the event's project ID.
- a test where we assert that the DSC's
transactionfield is used as a tag on the metric.
CHANGELOG.md
Outdated
| - Add error and sample rate fields to the replay event parser. ([#1745](https://github.com/getsentry/relay/pull/1745)) | ||
| - Add `instruction_addr_adjustment` field to `RawStacktrace`. ([#1716](https://github.com/getsentry/relay/pull/1716)) | ||
| - Add SSL support to `relay-redis` crate. It is possible to use `rediss` scheme to connnect to Redis cluster using TLS. ([#1772](https://github.com/getsentry/relay/pull/1772)) | ||
| - Add count transactions toward root project. ([#1734](https://github.com/getsentry/relay/pull/1734)) |
There was a problem hiding this comment.
Please, move this to Unreleased section.
relay-server/src/actors/processor.rs
Outdated
| fn process_sessions(&self, state: &mut ProcessEnvelopeState) { | ||
| let received = state.envelope_context.received_at(); | ||
| let extracted_metrics = &mut state.extracted_metrics; | ||
| let extracted_metrics = &mut &mut state.extracted_metrics.project_metrics; |
There was a problem hiding this comment.
Doesn't this work just with taking one mutable reference?
| let extracted_metrics = &mut &mut state.extracted_metrics.project_metrics; | |
| let extracted_metrics = &mut state.extracted_metrics.project_metrics; |
There was a problem hiding this comment.
good catch! yeah it works without second &mut
| metrics: &mut Vec<Metric>, // output parameter | ||
| metrics: &mut Vec<Metric>, // output parameter | ||
| sampling_metrics: &mut Vec<Metric>, // output parameter | ||
| transaction_from_dsc: Option<&str>, |
There was a problem hiding this comment.
nit: Because metrics and sampling_metrics are output parameters, I would put them at the end of the argument list.
| "name": "c:transactions/count_per_root_project@none", | ||
| "type": "c", | ||
| "value": 3.0, | ||
| } |
There was a problem hiding this comment.
Isn't this use case already tested by test_transaction_metrics? Can we alter this test so that this metric is emitted with project_id: 41, where 41 is the root project. We could alter the signature of send_transaction to accept an explicit trace header:
relay.send_transaction(42, transaction, trace_header={"public_key": public_key_of_project_41, "transaction": "root_transaction"})
It would also be good to have one test case where count_per_root_project is emitted without a transaction tag (maybe I missed it).
There was a problem hiding this comment.
agree with you suggestion,
regarding:
It would also be good to have one test case where count_per_root_project is emitted without a transaction tag (maybe I missed it).
I think we cover it separately in other tests where we can see tags = {}, but I can write separate test for that.
There was a problem hiding this comment.
Thanks, no need to add a separate test if it's already covered.
tests/integration/test_metrics.py
Outdated
| return metrics | ||
|
|
||
|
|
||
| def metrics_by_name_group_by_project(metrics_consumer, count, timeout=None): |
There was a problem hiding this comment.
This helper function does not seem to be using count at all. I feel like this function could call metrics_by_name internally, or vice versa?
There was a problem hiding this comment.
count is huge antipattern IMO, I'll remove it from args :)
There was a problem hiding this comment.
let me explain, this helper has small problem:
def metrics_by_name(metrics_consumer, count, timeout=None):
metrics = {}
for _ in range(count):
metric = metrics_consumer.get_metric(timeout)
metrics[metric["name"]] = metric
metrics_consumer.assert_empty()
return metrics
if you have 2 projects and 2 metrics with the same name you will see only the last one, so that's why I decided to create a new helper function.
There was a problem hiding this comment.
And the second problem, you need manually to set the count - I prefer to get all metrics - since we have this assert_empty and then return data.
| "name": "c:transactions/count_per_root_project@none", | ||
| "type": "c", | ||
| "value": 3.0, | ||
| } |
There was a problem hiding this comment.
Thanks, no need to add a separate test if it's already covered.
tests/integration/test_metrics.py
Outdated
| "foo": {"value": 1.2}, | ||
| "bar": {"value": 1.3}, | ||
| } | ||
| relay.send_transaction(41, transaction, transaction_from_dsc="test") |
There was a problem hiding this comment.
Instead of sending this transaction to project 41, I would send it to project 42 with project_id=41 in the DSC. And then verify that the metric has project 41, not 42. This is what I meant originally with
a test with a DSC where the DSC's project ID is different from the event's project ID.
jjbayer
left a comment
There was a problem hiding this comment.
I approve this PR. Please notify SnS when this change goes live, so they can observe the increase in stored metrics / tags.
* master: feat(metrics): Count transactions toward root project [TET-627] (#1734)
#42939) This PR implements prioritize by project bias. In detail: We run celery task every 24 at 8:00AM (UTC randomly selected) for every ORG (we call it *prioritise by project snuba query* ) and all projects inside this org, and for a given combination of org and projects run an adjustment model to recalculate sample rates if necessary. Then we cache sample rate using redis cluster -> `SENTRY_DYNAMIC_SAMPLING_RULES_REDIS_CLUSTER` using this pattern for key: `f"ds::o:{org_id}:p:{project_id}:prioritise_projects"`. When relay fetches `projectconfig` endpoint we run `generate_rules` functions to generate all dynamic sampling biases, so and we check if we have adjusted sample rate for this project in the cache, so we apply it as **uniform bias**, otherwise we use default one. Regarding *prioritize by project snuba query* is cross org snuba query that utilizes a new generic counter metric, which was introduced in [relay]( getsentry/relay#1734) `c:transactions/count_per_root_project@none`. TODO: - [x] Provision infrastructure to run clickhouse clusters for the counters tables. This is primarily dependent on ops - [x] Start running the snuba consumers to read and write to the counters table. SnS can work on this - [x] Add unit-tests; - [x] Update snuba query using new metric - [x] Hide behind feature flag related PRs: - Implement new metric in relay: getsentry/relay#1734 - Add org generic counters [TET-695] getsentry/snuba#3708 - Introduce new storages for counters in snuba getsentry/snuba#3679 - Add feature flag: https://github.com/getsentry/getsentry/pull/9323 - Add cross organization methods for the string indexer #45076 #45076 [TET-695]: https://getsentry.atlassian.net/browse/TET-695?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ --------- Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com> Co-authored-by: Nar Saynorath <nar.saynorath@sentry.io>
This PR introduce new metric
count_per_root_project, similar toduration,but we associate it with root_project id and store as
sampling_metricsvector (similar name we do in another places).So we can use it in outcomes to know for example:
Frontend generates 8M (direct) + 120M (backend), so
count_per_root_project = 128Mfor dependent projects.or fallback to
count(duration)==count_per_root_projectfor independent projects.The metric is tagged with the transaction name from the trace header, which should be low cardinality (SDKs should not set it when transaction source is
url).TODO:
related PR getsentry/sentry#43167