feat(dynamic-sampling): Implement prioritize by project bias [TET-574] by andriisoldatenko · Pull Request #42939 · getsentry/sentry

andriisoldatenko · 2023-01-09T09:43:52Z

This PR implements prioritize by project bias.

In detail:
We run a Celery task every 24 hours at 8:00 AM UTC (time chosen arbitrarily). It fetches data for every org and all projects inside that org (we call this the prioritise by project Snuba query), and for each combination of org and projects runs an adjustment model to recalculate sample rates if necessary.

Then we cache the sample rate in the Redis cluster configured by SENTRY_DYNAMIC_SAMPLING_RULES_REDIS_CLUSTER, using this key pattern: f"ds::o:{org_id}:p:{project_id}:prioritise_projects".
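A minimal sketch of the key pattern above. The function name and the dict standing in for the Redis cluster are hypothetical and only keep the example runnable; the stored value is illustrative.

```python
def prioritise_projects_cache_key(org_id, project_id):
    """Build the cache key using the pattern quoted in the description."""
    return f"ds::o:{org_id}:p:{project_id}:prioritise_projects"


# Stand-in for the Redis cluster (assumption): a plain dict.
fake_cache = {}

# Store an illustrative adjusted sample rate for org 1, project 8.
fake_cache[prioritise_projects_cache_key(1, 8)] = 0.5

print(prioritise_projects_cache_key(1, 8))
```

Later commits in this PR mention switching to Redis HSET and adding a key expiry, so the final storage layout may differ from this flat key-per-project sketch.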

When Relay fetches the projectconfig endpoint, we run the generate_rules function to generate all dynamic sampling biases. At that point we check whether the cache holds an adjusted sample rate for this project: if so, we apply it as the uniform bias; otherwise we use the default one.
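The cache lookup with fallback can be sketched as follows. This is not the actual generate_rules code: the helper name, the dict-like cache, and the default of 0.25 are assumptions. Only the rule shape (id 1000, type uniformRule) is taken from the worker logs in this PR.

```python
DEFAULT_SAMPLE_RATE = 0.25  # assumption: matches the rate seen in the local test logs


def uniform_rule(cache, org_id, project_id, default_rate=DEFAULT_SAMPLE_RATE):
    """Return a uniform rule, preferring the adjusted rate from the cache."""
    key = f"ds::o:{org_id}:p:{project_id}:prioritise_projects"
    cached = cache.get(key)
    # Cached values may come back as strings, so normalise to float.
    rate = float(cached) if cached is not None else default_rate
    return {
        "id": 1000,
        "type": "uniformRule",
        "samplingValue": {"type": "sampleRate", "value": rate},
    }


# No cached value: the default rate is used.
print(uniform_rule({}, org_id=1, project_id=1))
# Cached value present: it overrides the default.
print(uniform_rule({"ds::o:1:p:1:prioritise_projects": "0.2"}, 1, 1))
```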

The prioritize by project Snuba query is a cross-org query that uses a new generic counter metric introduced in Relay: c:transactions/count_per_root_project@none.

TODO:

Provision infrastructure to run clickhouse clusters for the counters tables. This is primarily dependent on ops
Start running the snuba consumers to read and write to the counters table. SnS can work on this
Add unit-tests;
Update snuba query using new metric
Hide behind feature flag

related PRs:

Implement new metric in relay: feat(metrics): Count transactions toward root project [TET-627] relay#1734
Add org generic counters TET-695 feat: add org generic counters [TET-695] snuba#3708
Introduce new storages for counters in snuba feat(gen-metrics-counters) Introduce new storages for counters snuba#3679
Add feature flag: https://github.com/getsentry/getsentry/pull/9323
Add cross organization methods for the string indexer: feat(indexer): Add cross organisation methods for the string indexer #45076

github-actions · 2023-01-16T08:04:20Z

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

andriisoldatenko · 2023-02-27T16:23:21Z

Tested locally with 1 org 1 project - no changes:

15:33:18 worker                                | 15:33:18 [INFO] sentry.dynamic_sampling: rules_generator.generate_rules (org_id=1 project_id=1 rules=[{'id': 1002, 'type': 'ignoreHealthChecks', 'samplingValue': {'type': 'sampleRate', 'value': 0.05}, 'healthChecks': ['*healthcheck*', '*healthy*', '*live*', '*ready*', '*heartbeat*', '*/health', '*/healthz']}, {'id': 1000, 'type': 'uniformRule', 'samplingValue': {'type': 'sampleRate', 'value': 0.25}}])

With 1 org and 2 projects (blended rate - 0.25):

16:18:01 worker                                | 16:18:01 [INFO] sentry: monitor.missed-checkin (monitor_id=4)
16:18:01 worker                                | 16:18:01 [INFO] sentry: monitor.missed-checkin (monitor_id=1)
16:18:01 worker                                | 16:18:01 [INFO] sentry.auth: !!!!!!!!!!!!!!!!!!!!!!!!!!check_auth
16:18:01 worker                                | 16:18:01 [INFO] sentry.dynamic_sampling.tasks: !!! start prioritise_projects
16:18:01 worker                                | 16:18:01 [INFO] sentry.dynamic_sampling.tasks: !!! 1 [(1, 1452.0), (8, 100.0)]
16:18:01 worker                                | 16:18:01 [INFO] sentry.dynamic_sampling.tasks: !!! start process_projects_sample_rates
16:18:01 worker                                | 16:18:01 [WARNING] sentry.tasks.release_registry: Release registry URL is not specified, skipping the task.
16:18:04 worker                                | 16:18:04 [INFO] sentry.tasks.groupowner: process_suspect_commits.skipped (reason='no_release')
16:18:04 worker                                | 16:18:04 [INFO] sentry.tasks.groupowner: process_suspect_commits.skipped (reason='no_release')
16:18:07 worker                                | 16:18:07 [INFO] sentry.dynamic_sampling: rules_generator.generate_rules (org_id=1 project_id=1 rules=[{'id': 1002, 'type': 'ignoreHealthChecks', 'samplingValue': {'type': 'sampleRate', 'value': 0.03966942148760331}, 'healthChecks': ['*healthcheck*', '*healthy*', '*live*', '*ready*', '*heartbeat*', '*/health', '*/healthz']}, {'id': 1000, 'type': 'uniformRule', 'samplingValue': {'type': 'sampleRate', 'value': 0.19834710743801653}}])
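As a sanity check on the logged numbers, the arithmetic below reproduces the rates for project 1. It is not the PR's adjustment model. It assumes that the total sampled volume at the blended rate is preserved, that the smaller project (id 8) is kept at a rate of 1.0, and that the health-check rule is the uniform rate divided by 5 (inferred from 0.05 against 0.25 in the single-project log).

```python
import math

# (project_id, count) pairs from the "prioritise_projects" log line.
counts = {1: 1452.0, 8: 100.0}
blended_rate = 0.25

# Total number of transactions to keep across the org at the blended rate.
budget = blended_rate * sum(counts.values())  # 388.0

# Assumption: project 8 keeps all 100 of its transactions (rate 1.0).
remaining = budget - counts[8]  # 288.0

# Whatever is left of the budget goes to project 1.
rate_project_1 = remaining / counts[1]

# Assumption: the ignoreHealthChecks rule is the uniform rate divided by 5.
health_check_rate = rate_project_1 / 5

print(rate_project_1, health_check_rate)
print(math.isclose(rate_project_1, 0.19834710743801653, rel_tol=1e-12))
```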

onewland · 2023-02-28T00:35:12Z

src/sentry/dynamic_sampling/prioritise_projects.py

+                ],
+                groupby=[Column("org_id"), Column("project_id")],
+                where=[
+                    Condition(Function("modulo", [Column("org_id"), 100]), Op.LT, sample_rate),


This might work fine, but I'm not sure how efficient WHERE org_id % 100 < 45 (or some other n) will be at scanning the table.

@nikhars do you have an opinion on this? Given the data size and the filtering by metric_id it would probably work, but I wonder whether they'd get better ClickHouse performance by enumerating the org_ids to check in batches of 5k or 10k and filtering before sending the query.

I can run the query and provide information whether this sort of WHERE clause would be helpful or not.
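The two options in this thread can be sketched in plain Python. The helper names are hypothetical and nothing here touches Snuba or ClickHouse: the first function mirrors the modulo predicate from the diff, the second shows the alternative of selecting org IDs up front and splitting them into fixed-size batches.

```python
from typing import Iterable, Iterator, List


def in_modulo_sample(org_id: int, sample_rate: int) -> bool:
    """Same predicate as the WHERE clause: org_id % 100 < sample_rate."""
    return org_id % 100 < sample_rate


def batched(org_ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `size` org IDs."""
    batch: List[int] = []
    for org_id in org_ids:
        batch.append(org_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


# Illustrative org IDs 1..20000, filtered client-side and split into 5k batches.
selected = [o for o in range(1, 20001) if in_modulo_sample(o, 45)]
batches = list(batched(selected, 5000))
print(len(selected), [len(b) for b in batches])
```

With batching, each query would filter on an explicit list of org IDs instead of evaluating the modulo expression per row. Whether that is faster in practice is the open question of this thread.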

github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jan 9, 2023

andriisoldatenko changed the title ~~feat(sampling): Implement prioritize by project bias [TET-569]~~ feat(dynamic-sampling): Implement prioritize by project bias [TET-569] Jan 9, 2023

vercel bot deployed to Preview – storybook January 9, 2023 09:49 View deployment

vercel bot deployed to Preview – sentry January 9, 2023 09:50 View deployment

github-actions bot added the Scope: Frontend Automatically applied to PRs that change frontend components label Jan 16, 2023

vercel bot deployed to Preview – storybook January 16, 2023 08:07 View deployment

vercel bot deployed to Preview – sentry January 16, 2023 08:07 View deployment

andriisoldatenko force-pushed the andrii/implement-cronjob-to-fetch-orgs-projects branch from 77f6033 to b074f9a Compare January 16, 2023 08:10

vercel bot deployed to Preview – sentry January 16, 2023 08:15 View deployment

vercel bot deployed to Preview – storybook January 16, 2023 08:18 View deployment

vercel bot deployed to Preview – sentry January 16, 2023 10:27 View deployment

vercel bot deployed to Preview – storybook January 16, 2023 10:28 View deployment

vercel bot deployed to Preview – sentry January 19, 2023 12:23 View deployment

vercel bot deployed to Preview – storybook January 19, 2023 12:23 View deployment

andriisoldatenko force-pushed the andrii/implement-cronjob-to-fetch-orgs-projects branch from 8532860 to 6071553 Compare January 22, 2023 16:55

vercel bot deployed to Preview – sentry January 22, 2023 17:00 View deployment

vercel bot deployed to Preview – storybook January 22, 2023 17:01 View deployment

andriisoldatenko force-pushed the andrii/implement-cronjob-to-fetch-orgs-projects branch from 6071553 to e4ed330 Compare January 26, 2023 17:01

vercel bot deployed to Preview – storybook January 26, 2023 17:11 View deployment

vercel bot deployed to Preview – sentry January 26, 2023 17:13 View deployment

vercel bot deployed to Preview – sentry January 26, 2023 17:33 View deployment

vercel bot deployed to Preview – storybook January 26, 2023 17:35 View deployment

andriisoldatenko force-pushed the andrii/implement-cronjob-to-fetch-orgs-projects branch from b81e3bd to 955c3bf Compare February 7, 2023 12:44

vercel bot deployed to Preview – sentry February 7, 2023 12:47 View deployment

vercel bot deployed to Preview – storybook February 7, 2023 12:47 View deployment

andriisoldatenko force-pushed the andrii/implement-cronjob-to-fetch-orgs-projects branch from 955c3bf to 1c0bbcf Compare February 12, 2023 17:36

vercel bot deployed to Preview – sentry February 12, 2023 17:40 View deployment

vercel bot deployed to Preview – storybook February 12, 2023 17:41 View deployment

vercel bot deployed to Preview – sentry February 13, 2023 09:18 View deployment

Andrii Soldatenko and others added 26 commits February 24, 2023 12:56

add more tests

0cfbc82

add more tests

cfa011c

add more tests

10af365

add more tests

6a0822d

fix mypy

2e017e3

add more tests

fb7414a

remove second utils.py

4ad92cf

rename to projects_with_tx_count

74876cf

fix MOCK_DATETIME

86e8bbe

chore(view-hierarchy): Add proguard file size to span (#44826)

eb970b5

Tracking the file size so we can understand its impact on timing

switch to redis HSET, rename test

baef1cf

undo incorrect change

e8f066f

add pexpire key

a4af4f4

refactor celery task

a5a81c0

add timeouts to celery tasks

441d6a5

refactor using redis pipelines

49ec087

remove "c:transactions/count_per_root_project@none" from relay list

6019c9b

add option to filter amount of orgs

bbdd985

add more tests

365f3d4

fix tests

c4f28ae

Merge branch 'master' into andrii/implement-cronjob-to-fetch-orgs-pro…

3647afc

…jects

Merge branch 'master' into andrii/implement-cronjob-to-fetch-orgs-pro…

9f7ef73

…jects

Merge branch 'master' into andrii/implement-cronjob-to-fetch-orgs-pro…

bfa978c

…jects

rewrite using indexer.resolve_shared_org()

921be06

fix tests

b3ce62e

update referrer for prioritise by project bias

08b8fa5

onewland reviewed Feb 28, 2023

Andrii Soldatenko added 2 commits February 28, 2023 10:30

Merge branch 'master' into andrii/implement-cronjob-to-fetch-orgs-pro…

1693b83

…jects

run prioritise_by_projects cron job every 1 hour

382f62e

andriisoldatenko merged 43 commits into master from andrii/implement-cronjob-to-fetch-orgs-projects

10 participants
