Conversation
Refactor group model to always use events from Snuba instead of Postgres
762f5a1 to
338616d
Compare
| }, { | ||
| 'default': ('Sentry Issue: [%s](%s)\n\n```\n' | ||
| 'Stacktrace (most recent call last):\n\n ' | ||
| 'Stacktrace (most recent call first):\n\n ' |
There was a problem hiding this comment.
I'm not actually sure, @tkaemming - do you know?
There was a problem hiding this comment.
Hm... seems like that comes from here. Seems like the changes from create_group/create_event maybe triggered something like the platform to be saved differently? (I don't know if that's the case, I didn't follow the fixture creation down to the source.)
| issue_id=self.id, | ||
| project_id=self.project_id) | ||
|
|
||
| def get_oldest_event(self): |
There was a problem hiding this comment.
Could plugins in sentry-plugins or other plugin repos be using this?
There was a problem hiding this comment.
Good point. I checked the whole Sentry org on github and there doesn't seem to be any other usages of this
5070c60 to
d941aa6
Compare
d941aa6 to
3d3aa8b
Compare
| # Look up event_id in both Event and EventMapping, | ||
| # and bail when it matches one of them, prioritizing | ||
| # Event since it contains more history. | ||
| for model in Event, EventMapping: |
There was a problem hiding this comment.
I don't think we can remove the EventMapping check yet.
EventMapping is saved if the event is sampled (see
sentry/src/sentry/event_manager.py
Lines 717 to 731 in 618bb0c
Even after removing events from postgres, eventmapping is going to stay.
There was a problem hiding this comment.
Same as below, won't all the events be in Snuba?
| break | ||
| except IndexError: | ||
| pass | ||
| event = SnubaEvent.objects.from_event_id(event_id, project.id) |
There was a problem hiding this comment.
Before merging this, could you please check who is using the method and ensure we won't increase the load on snuba too much. Which means ensure this does not happen in the ingestion flow.
There was a problem hiding this comment.
Could you please provide some details about the research you have done ? What did you look into ?
There was a problem hiding this comment.
Yeah sorry, this function seems to be just used in these 3 places, none of them are related to ingestion
src/sentry/web/frontend/error_page_embed.py: report.group = Group.objects.from_event_id(report.project, report.event_id)
src/sentry/api/endpoints/project_group_index.py: matching_group = Group.objects.from_event_id(project, event_id)
src/sentry/api/endpoints/project_user_reports.py: report.group = Group.objects.from_event_id(project, report.event_id)
| return Group.objects.get(id=group_id) | ||
|
|
||
| def filter_by_event_id(self, project_ids, event_id): | ||
| from sentry.models import EventMapping, Event |
There was a problem hiding this comment.
Same issue, I think the check on EventMapping is still valid because it is the only way to fetch sampled events.
There was a problem hiding this comment.
I was under the impression that events are not sampled in Snuba (unlike Postgres) so we would not need to additionally check EventMapping.. is that not correct?
There was a problem hiding this comment.
This is correct, to my knowledge.
I guess if we're going to be introducing a hard dependency on Snuba for events we should probably just go ahead and get rid of all vestiges of sampling (incl. EventMapping.)
| 'event_id': [event_id], | ||
| 'project_id': project_ids, | ||
| }, | ||
| limit=1000, |
There was a problem hiding this comment.
Any reason for this limit specifically. And same question as before. Please ensure we do not go through this code during event ingestion.
There was a problem hiding this comment.
It's just some arbitrarily large number
| @@ -390,21 +374,7 @@ def get_score(self): | |||
| return type(self).calculate_score(self.times_seen, self.last_seen) | |||
|
|
|||
| def get_latest_event(self): | |||
There was a problem hiding this comment.
This may be a little tricker to remove.
I saw it is used in the plugin code:
src/sentry/plugins/base/v1.py: event = group.get_latest_event() or Event()
src/sentry/plugins/bases/issue.py: event = group.get_latest_event()
src/sentry/plugins/bases/issue2.py: event = group.get_latest_event()
src/sentry/plugins/bases/issue2.py: event = group.get_latest_event()
src/sentry/tasks/email.py: event = group.get_latest_event()
I can think about two possible issues:
- load on snuba
- the event may have not been stored in snuba while it would already been in postgres. postgres write happens before we execute plugins in post_process.py while at that point the event has just been produced in kafka. so it may or may not have been consumed yet. This should be verified.
There was a problem hiding this comment.
Re 2: We might get away with this since these all seem to be view related functions - the event object is passed to any post process plugin code so we don't need to do .get_latest_event() there
There was a problem hiding this comment.
Hmm, this might come up in slack unfurls
There was a problem hiding this comment.
Actually post_process is synchronized on clickhouse storage. So it should be fine https://github.com/getsentry/sentry/blob/master/src/sentry/eventstream/kafka/consumer.py#L119-L145
| def get_latest_event(self): | ||
| from sentry.models import Event | ||
|
|
||
| if not hasattr(self, '_latest_event'): |
There was a problem hiding this comment.
We dropped this local cache here. Is this intentional ?
There was a problem hiding this comment.
Hmm, I think this was an oversight, will bring this back
fpacifici
left a comment
There was a problem hiding this comment.
I'd suggest to split this PR to reduce the risk and make it easier to review considering the number of possible side effects this code can have.
You could take one method on the Group class per PR so you can focus on a reduced set of possible issues when monitoring production. Also this would make a revert easier in case something goes wrong.
| break | ||
| except IndexError: | ||
| pass | ||
| event = SnubaEvent.objects.from_event_id(event_id, project.id) |
There was a problem hiding this comment.
Could you please provide some details about the research you have done ? What did you look into ?
| return Group.objects.get(id=group_id) | ||
|
|
||
| def filter_by_event_id(self, project_ids, event_id): | ||
| from sentry.models import EventMapping, Event |
| @@ -390,21 +374,7 @@ def get_score(self): | |||
| return type(self).calculate_score(self.times_seen, self.last_seen) | |||
|
|
|||
| def get_latest_event(self): | |||
There was a problem hiding this comment.
Actually post_process is synchronized on clickhouse storage. So it should be fine https://github.com/getsentry/sentry/blob/master/src/sentry/eventstream/kafka/consumer.py#L119-L145
This is the first part of #13905, which updates the group model to use events from Snuba instead of Postgres. This PR just updates the from_event_id function.
This is the second part of #13905, which updates the group model to use events from Snuba instead of Postgres. This PR updates the filter_by_event_id function.
|
Splitting this into a number of smaller PRs: |
Removes the "snuba.events-queries.enabled" option and legacy code paths that were not being used in production. Also removes the unused Group.get_oldest_event method. Part 4 of #13905
This is the fifth (and final) part of #13905 Group model now uses Snuba Event instead of Postgres Event everywhere.
|
thanks for splitting it |
This is the first part of #13905, which updates the group model to use events from Snuba instead of Postgres. This PR just updates the from_event_id function.
This is the second part of #13905, which updates the group model to use events from Snuba instead of Postgres. This PR updates the filter_by_event_id function.
This is the fifth (and final) part of #13905 Group model now uses Snuba Event instead of Postgres Event everywhere.
Removes the "snuba.events-queries.enabled" option and legacy code paths that were not being used in production. Also removes the unused Group.get_oldest_event method. Part 4 of #13905
This is the fifth (and final) part of #13905 Group model now uses Snuba Event instead of Postgres Event everywhere.
This is the second part of #13905, which updates the group model to use events from Snuba instead of Postgres. This PR updates the filter_by_event_id function.
This is the fifth (and final) part of #13905 Group model now uses Snuba Event instead of Postgres Event everywhere.
Refactor group model to always use events from Snuba instead of Postgres.
Rewrite a bunch of tests to use
store_eventinstead ofcreate_eventasthis writes the events into Snuba.