Skip to content

feat(metrics): Count transactions toward root project [TET-627]#1734

Merged
jjbayer merged 27 commits intomasterfrom
feat/metrics-count-tx-for-root
Feb 2, 2023
Merged

feat(metrics): Count transactions toward root project [TET-627]#1734
jjbayer merged 27 commits intomasterfrom
feat/metrics-count-tx-for-root

Conversation

@jjbayer
Copy link
Member

@jjbayer jjbayer commented Jan 10, 2023

This PR introduce new metric count_per_root_project, similar to duration,
but we associate it with root_project id and store as sampling_metrics vector (similar name we do in another places).

So we can use it in outcomes to know for example:

Frontend generates 8M (direct) + 120M (backend), so count_per_root_project = 128M for dependent projects.

or fallback to count(duration) == count_per_root_project for independent projects.

The metric is tagged with the transaction name from the trace header, which should be low cardinality (SDKs should not set it when transaction source is url).

TODO:

  • Make it compile
  • Fix rust tests
  • Fix python integration tests

related PR getsentry/sentry#43167

@andriisoldatenko andriisoldatenko changed the title feat(metrics): Count transactions toward root project feat(metrics): Count transactions toward root project [TET-627] Jan 10, 2023
@andriisoldatenko andriisoldatenko marked this pull request as ready for review January 11, 2023 15:33
@andriisoldatenko andriisoldatenko requested review from a team and untitaker January 11, 2023 15:33
#[serde(flatten)]
value: BucketValue,
timestamp: UnixTimestamp,
#[serde(skip_serializing_if = "BTreeMap::is_empty")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am reviewing this change only. i think we can revisit the schema we actually want later, but as pointed out in private conversations, the metrics indexer currently does not handle the absence of tags correctly.

Copy link
Member Author

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good to me. Let's make sure Search and Storage is aware that we are adding a counter metric before we deploy this.

@andriisoldatenko andriisoldatenko force-pushed the feat/metrics-count-tx-for-root branch from c932995 to c727b43 Compare January 16, 2023 08:35
@andriisoldatenko andriisoldatenko self-requested a review January 16, 2023 12:25
@andriisoldatenko andriisoldatenko force-pushed the feat/metrics-count-tx-for-root branch from 63432ac to 0849c6e Compare January 26, 2023 10:05
@andriisoldatenko andriisoldatenko force-pushed the feat/metrics-count-tx-for-root branch from 0849c6e to ddcb598 Compare January 30, 2023 12:25
// requires recomputation of the context.
state.envelope_context.update(&state.envelope);

let has_metrics = state.extracted_metrics.project_metrics.is_empty();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let has_metrics = state.extracted_metrics.project_metrics.is_empty();
let has_metrics = !state.extracted_metrics.project_metrics.is_empty();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am surprised that no test case caught this.


assert metrics.keys() == {
"d:transactions/duration@millisecond",
"c:transactions/count_per_root_project@none",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, these test cases all add this metric to current project, because there is no DSC involved. Please also add tests for the following:

  • a test with a DSC where the DSC's project ID is different from the event's project ID.
  • a test where we assert that the DSC's transaction field is used as a tag on the metric.

CHANGELOG.md Outdated
- Add error and sample rate fields to the replay event parser. ([#1745](https://github.com/getsentry/relay/pull/1745))
- Add `instruction_addr_adjustment` field to `RawStacktrace`. ([#1716](https://github.com/getsentry/relay/pull/1716))
- Add SSL support to `relay-redis` crate. It is possible to use `rediss` scheme to connnect to Redis cluster using TLS. ([#1772](https://github.com/getsentry/relay/pull/1772))
- Add count transactions toward root project. ([#1734](https://github.com/getsentry/relay/pull/1734))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, move this to Unreleased section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olksdr done.

fn process_sessions(&self, state: &mut ProcessEnvelopeState) {
let received = state.envelope_context.received_at();
let extracted_metrics = &mut state.extracted_metrics;
let extracted_metrics = &mut &mut state.extracted_metrics.project_metrics;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this work just with taking one mutable reference?

Suggested change
let extracted_metrics = &mut &mut state.extracted_metrics.project_metrics;
let extracted_metrics = &mut state.extracted_metrics.project_metrics;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! yeah it works without second &mut

metrics: &mut Vec<Metric>, // output parameter
metrics: &mut Vec<Metric>, // output parameter
sampling_metrics: &mut Vec<Metric>, // output parameter
transaction_from_dsc: Option<&str>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Because metrics and sampling_metrics are output parameters, I would put them at the end of the argument list.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed afa6576

"name": "c:transactions/count_per_root_project@none",
"type": "c",
"value": 3.0,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this use case already tested by test_transaction_metrics? Can we alter this test so that this metric is emitted with project_id: 41, where 41 is the root project. We could alter the signature of send_transaction to accept an explicit trace header:

relay.send_transaction(42, transaction, trace_header={"public_key": public_key_of_project_41, "transaction": "root_transaction"})

It would also be good to have one test case where count_per_root_project is emitted without a transaction tag (maybe I missed it).

Copy link
Contributor

@andriisoldatenko andriisoldatenko Feb 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with you suggestion,

regarding:

It would also be good to have one test case where count_per_root_project is emitted without a transaction tag (maybe I missed it).

I think we cover it separately in other tests where we can see tags = {}, but I can write separate test for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, no need to add a separate test if it's already covered.

return metrics


def metrics_by_name_group_by_project(metrics_consumer, count, timeout=None):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This helper function does not seem to be using count at all. I feel like this function could call metrics_by_name internally, or vice versa?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count is huge antipattern IMO, I'll remove it from args :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me explain, this helper has small problem:

def metrics_by_name(metrics_consumer, count, timeout=None):
    metrics = {}

    for _ in range(count):
        metric = metrics_consumer.get_metric(timeout)
        metrics[metric["name"]] = metric

    metrics_consumer.assert_empty()
    return metrics

if you have 2 projects and 2 metrics with the same name you will see only the last one, so that's why I decided to create a new helper function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the second problem, you need manually to set the count - I prefer to get all metrics - since we have this assert_empty and then return data.

"name": "c:transactions/count_per_root_project@none",
"type": "c",
"value": 3.0,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, no need to add a separate test if it's already covered.

"foo": {"value": 1.2},
"bar": {"value": 1.3},
}
relay.send_transaction(41, transaction, transaction_from_dsc="test")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of sending this transaction to project 41, I would send it to project 42 with project_id=41 in the DSC. And then verify that the metric has project 41, not 42. This is what I meant originally with

a test with a DSC where the DSC's project ID is different from the event's project ID.

Andrii Soldatenko added 2 commits February 2, 2023 09:33
Copy link
Member Author

@jjbayer jjbayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this PR. Please notify SnS when this change goes live, so they can observe the increase in stored metrics / tags.

Copy link
Contributor

@iambriccardo iambriccardo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@olksdr olksdr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jjbayer jjbayer merged commit 57aacf0 into master Feb 2, 2023
@jjbayer jjbayer deleted the feat/metrics-count-tx-for-root branch February 2, 2023 12:46
jan-auer added a commit that referenced this pull request Feb 2, 2023
* master:
  feat(metrics): Count transactions toward root project [TET-627] (#1734)
andriisoldatenko pushed a commit to getsentry/sentry that referenced this pull request Feb 28, 2023
#42939)

This PR implements prioritize by project bias.

In detail: 
We run celery task every 24 at 8:00AM (UTC randomly selected) for every
ORG (we call it *prioritise by project snuba query* ) and all projects
inside this org, and for a given combination of org and projects run an
adjustment model to recalculate sample rates if necessary.

Then we cache sample rate using redis cluster ->
`SENTRY_DYNAMIC_SAMPLING_RULES_REDIS_CLUSTER` using this pattern for
key: `f"ds::o:{org_id}:p:{project_id}:prioritise_projects"`.

When relay fetches `projectconfig` endpoint we run `generate_rules`
functions to generate all dynamic sampling biases, so and we check if we
have adjusted sample rate for this project in the cache, so we apply it
as **uniform bias**, otherwise we use default one.

Regarding *prioritize by project snuba query* is cross org snuba query
that utilizes a new generic counter metric, which was introduced in
[relay]( getsentry/relay#1734)
`c:transactions/count_per_root_project@none`.

TODO:
- [x] Provision infrastructure to run clickhouse clusters for the
counters tables. This is primarily dependent on ops
- [x] Start running the snuba consumers to read and write to the
counters table. SnS can work on this
- [x] Add unit-tests;
- [x] Update snuba query using new metric
- [x] Hide behind feature flag 


related PRs:
- Implement new metric in relay:
getsentry/relay#1734
- Add org generic counters [TET-695]
getsentry/snuba#3708
- Introduce new storages for counters in snuba
getsentry/snuba#3679
- Add feature flag: https://github.com/getsentry/getsentry/pull/9323
- Add cross organization methods for the string indexer #45076
#45076

[TET-695]:
https://getsentry.atlassian.net/browse/TET-695?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

---------

Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
Co-authored-by: Nar Saynorath <nar.saynorath@sentry.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants