-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
feat(code_review): Support sending webhook events to Seer #105393
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3b5f6de to
e9d2e29
Compare
❌ 1 Tests Failed:
View the top 1 failed test(s) by shortest run time
To view more test analytics, go to the Test Analytics Dashboard |
suejung-sentry
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These staged changes make sense to me and behind those flags sound like a good way to test it out little by little
src/sentry/seer/code_review/utils.py
Outdated
|
|
||
| # Determine request_type based on event_type | ||
| # For now, we only support pr-review for these webhook types | ||
| request_type: Literal["pr-review", "unit-tests", "pr-closed"] = "pr-review" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use this opportunity to sunset unit-tests feature. This used to be the test generation feature that we will not take to GA
|
|
||
| # Build RepoDefinition | ||
| repo_definition = { | ||
| "provider": "github", # All GitHub webhooks use "github" provider |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A warning/heads up there can be confusion about when this value should be github or integrations:github (the latter being the key we need to query within various Integrations team related endpoints). Looks like in the case of the payload to seer either is acceptable and there is some stripping that happens here.
So fine as you have it, but a word of warning this could be a source of bugs
src/sentry/seer/code_review/utils.py
Outdated
| "provider": "github", # All GitHub webhooks use "github" provider | ||
| "owner": repository["owner"]["login"], | ||
| "name": repository["name"], | ||
| "external_id": str(repository["id"]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An important field here is base_commit_sha which should be the commit we are analyzing or head SHA in the case of a pull request (here)
| "repo": repo_definition, | ||
| "pr_id": pr_number, | ||
| "codecov_status": None, | ||
| "more_readable_repos": [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An important field is also bug_prediction_specific_information.organization_id (here) and maybe also organization_slug. We can get rid of the rest of the fields in that bug_prediction_specific_information object.
Would be nice to clean up this payload and bring that field up to the top level.
The other one we need is config (here). We can clean this up a bunch too based on everything that's changed for GA.
"config": {
"bug_prediction_enabled": true
"trigger": "on_new_commit"
}
These were also added for metrics - trigger_comment_id, trigger_comment_type, trigger_user
Maybe we'd consider nesting that under a "trigger_metadata" key or something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's an example payload of the minimum fields afaik
{
"request_type": "pr-review",
"external_owner_id": "176330896",
"data": {
"repo": {
"provider": "github",
"owner": "suejung-org",
"name": "hello-vite",
"external_id": "176330896",
"base_commit_sha": "8fac7540fe09c9041fa9f254fc3af1a72300dfb8"
},
"pr_id": 4,
"bug_prediction_specific_information": {
"organization_id": 1
},
"config": {
"features": {
"bug_prediction": true
},
"trigger": "on_new_commit",
"trigger_comment_id": .., // for metrics
"trigger_comment_type": .., // for metrics
"trigger_user": ... // for metrics
}
}
}
| "owner": repository["owner"]["login"], | ||
| "name": repository["name"], | ||
| "external_id": str(repository["id"]), | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing required base_commit_sha field in repo definition
The repo_definition dictionary in _transform_webhook_to_codegen_request is missing the base_commit_sha field, which was explicitly identified as an important required field in the PR discussion. According to the reviewer, this should be "the commit we are analyzing or head SHA in the case of a pull request." For pull request events, this value is available at event_payload["pull_request"]["head"]["sha"]. Without this field, Seer won't know which commit to analyze, causing the code review feature to fail or produce incorrect results.
Changes included: * `_handle` to process registered webhook event processors * Declare `EVENT_TYPE` when subclassing
| logger = logging.getLogger("sentry.webhooks") | ||
|
|
||
|
|
||
| class WebhookProcessor(Protocol): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signature all GitHub webhook handlers need to have.
| return user_type != "Bot" | ||
|
|
||
|
|
||
| def _handle_pr_webhook_for_autofix_processor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we would port this change to handle_github_pr_webhook_for_autofix itself.
|
|
||
| if integration is not None: | ||
| self._handle(integration, event, org_integrations=org_integrations) | ||
| self._handle_organization_deletion( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it is not a true subclassing of the _handle we can rename it.
| String values preserve original GitHub naming for backward compatibility with existing metrics. | ||
| """ | ||
|
|
||
| CI_CHECK = "ci_check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We rename the value from check_run since we have just implemented this feature in the last two weeks, it's barely used and I don't care if we have a discontinuity somewhere in the GitHub metrics.
| String values preserve original GitHub naming for backward compatibility with existing metrics. | ||
| """ | ||
|
|
||
| CI_CHECK = "ci_check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Metrics value change contradicts backward compatibility comment
The docstring states "String values preserve original GitHub naming for backward compatibility with existing metrics," but CI_CHECK has value "ci_check" instead of the original GitHub value "check_run". The old code used CHECK_RUN = "check_run". This breaking change will cause existing metrics dashboards tracking "check_run" events to stop receiving data, contradicting the documented backward compatibility guarantee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intentional.
| String values preserve original GitHub naming for backward compatibility with existing metrics. | ||
| """ | ||
|
|
||
| CI_CHECK = "ci_check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm renaming this metric because is so new and I don't care about breaking continuity.
| String values preserve original GitHub naming for backward compatibility with existing metrics. | ||
| """ | ||
|
|
||
| CI_CHECK = "ci_check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's intentional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the tests into test_webhooks.py.
| # These are always forwarded | ||
| assert GithubWebhookType.INSTALLATION in events | ||
| assert GithubWebhookType.INSTALLATION_REPOSITORIES in events | ||
| assert len(events) == 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test assertions have incorrect event count values
The test test_default_events_included asserts 6 specific items are in events (4 configurable + 2 installation events), then asserts len(events) == 5. This is mathematically impossible. Similarly, test_disabling_options_excludes_events with all options disabled leaves only 2 installation events, yet also asserts len(events) == 5. The first test likely needs len(events) == 6, and the second needs len(events) == 2. These tests will fail when executed, or if they pass, indicate the implementation has unexpected behavior.
Additional Locations (1)
| String values preserve original GitHub naming for backward compatibility with existing metrics. | ||
| """ | ||
|
|
||
| CI_CHECK = "ci_check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Metric value change contradicts backward compatibility guarantee
The docstring at line 432 explicitly states "String values preserve original GitHub naming for backward compatibility with existing metrics." However, CI_CHECK = "ci_check" does NOT match GitHub's event type naming of "check_run" (as defined in GithubWebhookType.CHECK_RUN). This is inconsistent with other values like MERGE_REQUEST = "pull_request" which DO preserve GitHub naming. The diff shows this was previously CHECK_RUN = "check_run", so changing to "ci_check" breaks the documented backward compatibility and could cause existing metrics dashboards monitoring event_type=check_run to stop working.
| # These are always forwarded | ||
| assert GithubWebhookType.INSTALLATION in events | ||
| assert GithubWebhookType.INSTALLATION_REPOSITORIES in events | ||
| assert len(events) == 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test assertion expects wrong count for default events
The test_default_events_included test asserts that 6 distinct event types are present in the returned set (PULL_REQUEST, PULL_REQUEST_REVIEW, PULL_REQUEST_REVIEW_COMMENT, ISSUE_COMMENT, INSTALLATION, INSTALLATION_REPOSITORIES), but then incorrectly claims len(events) == 5. Given _INSTALLATION_EVENTS contains 2 items and _WEBHOOK_TYPE_TO_OPTION contains 4 items (all defaulting to True), the set contains 6 items.
Similarly, test_disabling_options_excludes_events disables all 4 configurable options, leaving only the 2 installation events, but asserts len(events) == 5 instead of len(events) == 2. Both tests will fail with AssertionError.
Additional Locations (1)
src/sentry/seer/code_review/utils.py
Outdated
| # Build RepoDefinition | ||
| repo_definition = { | ||
| "provider": "github", # All GitHub webhooks use "github" provider | ||
| "owner": repo.owner, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repository model lacks owner attribute accessed in transform function
The _transform_webhook_to_codegen_request function accesses repo.owner at line 101, but the Repository model has no owner field or property. The model only has name (which stores the full repository name like "owner/repo"), external_id, organization_id, and other fields. This will cause an AttributeError when handle_other_webhook_event is called. While currently not reached (the WEBHOOK_EVENT_PROCESSORS for ISSUE_COMMENT, PULL_REQUEST_REVIEW, etc. are empty), this will crash when those handlers are enabled. The owner likely needs to be extracted from repo.name.split("/")[0] or from the repository config.
| a task to forward the original run ID to Seer so it can rerun the PR review. | ||
| Args: | ||
| event_type: The type of the webhook event (as string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: The EventType enum value for CHECK_RUN is "ci_check", but GitHub sends "check_run", causing a ValueError during event parsing and breaking check_run webhook handling.
Severity: CRITICAL | Confidence: High
🔍 Detailed Analysis
The EventType enum defines CHECK_RUN with the value "ci_check". However, GitHub sends webhook events with the type "check_run". When the application receives a check_run event, the EventType.from_string("check_run") call in handle_webhook_event fails with a ValueError because no enum member has the value "check_run". This will cause the webhook handler to crash, preventing any check_run events from being processed.
💡 Suggested Fix
Change the value of EventType.CHECK_RUN in src/sentry/seer/code_review/webhooks/types.py back to "check_run" to match the string sent by the GitHub webhook API. This will ensure EventType.from_string("check_run") resolves correctly.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/seer/code_review/webhooks/check_run.py#L90
Potential issue: The `EventType` enum defines `CHECK_RUN` with the value `"ci_check"`.
However, GitHub sends webhook events with the type `"check_run"`. When the application
receives a `check_run` event, the `EventType.from_string("check_run")` call in
`handle_webhook_event` fails with a `ValueError` because no enum member has the value
`"check_run"`. This will cause the webhook handler to crash, preventing any `check_run`
events from being processed.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 7995728
| """ | ||
| # Determine request_type based on event_type | ||
| # For now, we only support pr-review for these webhook types | ||
| request_type: Literal["pr-review", "pr-closed"] = "pr-review" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: The _transform_webhook_to_codegen_request function hardcodes request_type to "pr-review", ignoring the actual event_type parameter, which breaks logic for events like closing a pull request.
Severity: CRITICAL | Confidence: High
🔍 Detailed Analysis
The function _transform_webhook_to_codegen_request is intended to determine the request_type based on the event_type parameter it receives. However, the implementation hardcodes the request_type to "pr-review" and ignores the event_type parameter. This means that events like a pull request closing, which should generate a "pr-closed" request type, will instead incorrectly generate a "pr-review" request. This causes the wrong information to be sent to the Seer service, breaking the intended functionality for handling different pull request events.
💡 Suggested Fix
Update _transform_webhook_to_codegen_request to use the event_type parameter to conditionally set the request_type. For example, check if the event is a pull request close event and set request_type to "pr-closed" accordingly.
🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/sentry/seer/code_review/utils.py#L83
Potential issue: The function `_transform_webhook_to_codegen_request` is intended to
determine the `request_type` based on the `event_type` parameter it receives. However,
the implementation hardcodes the `request_type` to `"pr-review"` and ignores the
`event_type` parameter. This means that events like a pull request closing, which should
generate a `"pr-closed"` request type, will instead incorrectly generate a `"pr-review"`
request. This causes the wrong information to be sent to the Seer service, breaking the
intended functionality for handling different pull request events.
Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 7995728
This fixes the `check_run` re-run case since I regressed it in #105393 when we switched the `event_type` to `ci_check`. Standardizes webhook event type handling by replacing the custom `EventType` enum with the existing `GithubWebhookType` from the GitHub integration package. Changes: - Pass `github_event` from `X-GitHub-Event` header through webhook chain - Remove custom `EventType` enum and use `GithubWebhookType` throughout - Update webhook processor signatures to accept `github_event` parameter - Simplify event routing logic in `process_github_webhook_event` task - Update all webhook processors and tests to use new signature This makes the code more consistent with the GitHub integration package and removes duplicate event type definitions.
Currently, we send GitHub webhook events to Overwatch which are processed and then Seer's called from it.
The changes included are preparatory changes to switch from sending the events from Overwatch to Seer.
This will enable porting one feature at a time.
This adds webhook handling for:
Fixes CW-86.