feat(occurrences on eap): Implement double reads of bucketed group counts #105691

shashjar · 2026-01-05T23:00:39Z

Adds EAP double-reading support for the query_groups_past_counts function used in escalating issue detection.

Changes:

Implemented _query_groups_past_counts_eap that queries hourly event counts grouped by project_id and group_id
Updated query_groups_past_counts to use the double-read pattern with EAPOccurrencesComparator

Ran some test queries to compare results on local devserver:

============================================================
EAP Query Groups Past Counts - Manual Test
============================================================

============================================================
Testing organization: Sentry (id=1)
============================================================

Found 19 groups to test
Group IDs: [19, 18, 17, 16, 15, 14, 13, 12, 11, 10]...
Error groups: 19, Other escalatable groups: 19

--- Testing Snuba implementation ---
Snuba returned 4 results
Sample result: {'project_id': 3, 'group_id': 8, 'hourBucket': '2026-01-05T23:00:00+00:00', 'count()': 11}

--- Testing EAP implementation ---
23:53:57 [INFO] sentry.utils.snuba_rpc: Running a EndpointTimeSeries RPC query (rpc_query={'meta': {'organizationId': '1', 'referrer': 'sentry.issues.escalating', 'projectIds': ['1', '2', '3'], 'startTimestamp': '2025-12-29T23:00:00Z', 'endTimestamp': '2026-01-06T00:00:00Z', 'traceItemType': 'TRACE_ITEM_TYPE_OCCURRENCE', 'downsampledStorageConfig': {'mode': 'MODE_NORMAL'}}, 'filter': {'andFilter': {'filters': [{'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '19'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '18'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '17'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '12'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '11'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '10'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '9'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '8'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '16'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '15'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '14'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '13'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '2'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '7'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '6'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '5'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '4'}}}, {'orFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '3'}}}, {'comparisonFilter': {'key': {'type': 'TYPE_INT', 'name': 'group_id'}, 'op': 'OP_EQUALS', 'value': {'valInt': '1'}}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}]}}, {'andFilter': {'filters': [{'comparisonFilter': {'key': {'type': 'TYPE_DOUBLE', 'name': 'sentry.timestamp'}, 'op': 'OP_GREATER_THAN_OR_EQUALS', 'value': {'valInt': '1767052437'}}}, {'comparisonFilter': {'key': {'type': 'TYPE_DOUBLE', 'name': 'sentry.timestamp'}, 'op': 'OP_LESS_THAN', 'value': {'valInt': '1767657237'}}}]}}]}}, 'granularitySecs': '3600', 'groupBy': [{'type': 'TYPE_INT', 'name': 'sentry.project_id'}, {'type': 'TYPE_INT', 'name': 'group_id'}], 'expressions': [{'aggregation': {'aggregate': 'FUNCTION_COUNT', 'key': {'type': 'TYPE_INT', 'name': 'sentry.project_id'}, 'label': 'count()', 'extrapolationMode': 'EXTRAPOLATION_MODE_SAMPLE_WEIGHTED'}, 'label': 'count()'}]} referrer='sentry.issues.escalating' organization_id=1 trace_item_type=7)
23:53:57 [INFO] sentry.utils.snuba_rpc: Timeseries RPC query response (rpc_rows=169 organization_id=1 meta=request_id: "e080d91837454cce94a73d6b9934fe2b"
query_info {
  stats {
    progress_bytes: 1488
  }
  metadata {
  }
}
downsampled_storage_meta {
}
)
EAP returned 4 results
Sample result: {'project_id': 3, 'group_id': 8, 'hourBucket': '2026-01-05T23:00:00+00:00', 'count()': 11}

--- Comparison ---
Total Snuba results: 4
Total EAP results: 4
Matching entries: 4
Mismatching entries: 0
Only in Snuba: 0
Only in EAP: 0

--- Ordering Check ---
✅ Results are in the same order (exact match)

--- Summary ---
✅ SUCCESS: Results match perfectly (data and ordering)!

…currences

…nto double-read-query-groups-past-counts-eap

src/sentry/issues/escalating/escalating.py

shashjar · 2026-01-05T23:56:53Z

bugbot run

wedamija

Does it matter much that we're adding a couple more queries here and slowing down processing? If so, it could be worth delegating the extra queries to a separate thread

wedamija · 2026-01-07T20:14:23Z

src/sentry/issues/escalating/escalating.py

+        results = EAPOccurrencesComparator.check_and_choose(
+            snuba_results,
+            eap_results,
+            "issues.escalating.query_groups_past_counts",
+            is_experimental_data_a_null_result=len(eap_results) == 0,
+        )


It might be helpful to include a reasonable match comparator or something along those lines. I think that we're likely to have small differences between the two systems just due to the fact that we don't ingest data at the same rate. It'd be good to have some kind of similarity % to compare even better

I didn't include a reasonable match comparator in this case (since it's a little more involved), but we have that on some of the other read paths we've migrated so far. Right now we're only issuing reads in S4S (where we've been writing to EAP for almost 3 months at this point, so we should be almost at the point where the two data stores are in parity for groups in retention).

Re increased latency by adding more queries: yep this is on my todo list. We've only migrated a handful of read paths over to double reads so far, but as we do more we'll just generally be adding latency across the app which is not ideal. I'm thinking that the best solution here is to switch to a sampling approach (where we double read only a portion of the time) rather than double reading on every query, so that we'll still get representative metrics for comparison but reduce the latency impact.

Yeah, the sampling approach sounds decent. It's also not too hard to delegate to threads if you want to go down that path

In terms of data accuracy - historically it'll probably match up, but recently ingested data can be out of sync since there's no guarantee that writes to EAP are synced up with writes to snuba. It'll be a small discrepancy, but might be worth taking into account if you're seeing some queries not match up. Something you can handle later on, anyway

src/sentry/issues/escalating/escalating.py

thetruecpaul

Generally LGTM, one Q

thetruecpaul · 2026-01-12T21:23:41Z

src/sentry/issues/escalating/escalating.py

+    all_results += _query_groups_eap_by_org(error_groups)
+    all_results += _query_groups_eap_by_org(other_groups)


Q here: why not make a single call with error_groups + other_groups?

This was only to keep parity with the Snuba implementation, since that query orders the results such that error groups come first and then issue platform groups. Of course we don't have the same distinction in EAP, so we could definitely combine the queries, but then there's no guarantee that we get results in the same order as Snuba and therefore have exact_match on the comparator be informative.

Maybe we could keep this double query for the rollout to make sure we maintain parity and I can make a ticket to eventually condense this into one query?

Sure, that sounds good! Definitely not a blocker.

shashjar added 2 commits January 5, 2026 14:05

Implement timeseries query support for Occurrences EAP RPCs

eb7ecdd

Implement double reads of bucketed group counts from EAP

82806b7

vercel bot deployed to Preview January 5, 2026 23:00 View deployment

github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jan 5, 2026

shashjar added 2 commits January 5, 2026 15:07

Correctly merge multiple aggregates in grouped timeseries query on oc…

11a5d32

…currences

Merge branch 'implement-timeseries-query-logic-for-eap-occurrences' i…

03540b9

…nto double-read-query-groups-past-counts-eap

shashjar requested review from a team and thetruecpaul January 5, 2026 23:09

shashjar marked this pull request as ready for review January 5, 2026 23:10

shashjar requested a review from a team as a code owner January 5, 2026 23:10

shashjar removed the request for review from a team January 5, 2026 23:10

vercel bot deployed to Preview January 5, 2026 23:12 View deployment

cursor bot reviewed Jan 5, 2026

View reviewed changes

src/sentry/issues/escalating/escalating.py Show resolved Hide resolved

Append error and other results separately for sorting

34ae32a

vercel bot deployed to Preview January 5, 2026 23:59 View deployment

wedamija approved these changes Jan 7, 2026

View reviewed changes

Base automatically changed from implement-timeseries-query-logic-for-eap-occurrences to master January 10, 2026 00:08

Merge branch 'master' into double-read-query-groups-past-counts-eap

5559fb0

vercel bot deployed to Preview January 10, 2026 00:12 View deployment

Merge branch 'master' into double-read-query-groups-past-counts-eap

d812d30

vercel bot deployed to Preview January 12, 2026 21:00 View deployment

sentry bot reviewed Jan 12, 2026

View reviewed changes

src/sentry/issues/escalating/escalating.py Show resolved Hide resolved

thetruecpaul approved these changes Jan 12, 2026

View reviewed changes

shashjar merged commit 9e24416 into master Jan 13, 2026
66 checks passed

shashjar deleted the double-read-query-groups-past-counts-eap branch January 13, 2026 20:51

github-actions bot mentioned this pull request Jan 13, 2026

ci(release): Switch from action-prepare-release to Craft #106005

Open

		all_results += _query_groups_eap_by_org(error_groups)
		all_results += _query_groups_eap_by_org(other_groups)

Uh oh!

feat(occurrences on eap): Implement double reads of bucketed group counts #105691

feat(occurrences on eap): Implement double reads of bucketed group counts #105691

Uh oh!

Conversation

shashjar commented Jan 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

shashjar commented Jan 5, 2026

Uh oh!

wedamija left a comment

Choose a reason for hiding this comment

Uh oh!

wedamija Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

shashjar Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

wedamija Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

wedamija Jan 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

thetruecpaul left a comment

Choose a reason for hiding this comment

Uh oh!

thetruecpaul Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

shashjar Jan 12, 2026

Choose a reason for hiding this comment

Uh oh!

thetruecpaul Jan 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

shashjar commented Jan 5, 2026 •

edited

Loading