[Pyrefly][Github actions] Add Two-pass LLM classification with PR diff attribution for primer classification#2539

Closed
migeed-z wants to merge 6 commits into main from two-pass-llm-classification

@migeed-z
Contributor

@migeed-z migeed-z commented Feb 24, 2026

This is another iteration on our mypy primer classifier work. It addresses one bug and adds two improvements:

  • The verdict sometimes contradicts the message.
    Solution: separate the concerns. One pass analyzes the diff and writes the message; a second, lightweight pass reads the message and determines the verdict.
  • Include the pyrefly PR diff so the classifier can explain how the PR's code changes contributed to the error changes.
  • Linkify and improve the formatting of messages. The comment now has a table that describes the errors per project, as well as a high-level overall comment suggesting next steps.

…ifier

Split LLM classification into two passes to fix verdict-reasoning
contradictions (4/26 in PR #2493). Pass 1 produces reasoning and
PR attribution without a verdict. Pass 2 reads the reasoning and
assigns the verdict. This separates code analysis (hard) from
labeling (easy), eliminating cases where the LLM commits to a
verdict early and writes contradictory reasoning.
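
A minimal sketch of the two-pass flow described above, not the code in this PR. The `LLM` callable, the `Classification` dataclass, the prompt wording, the verdict labels, and the fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical LLM interface: takes a prompt, returns the model's text reply.
LLM = Callable[[str], str]

# Assumed label set; the PR description does not list the actual verdict names.
VERDICTS = ("regression", "improvement", "neutral")


@dataclass
class Classification:
    reasoning: str
    verdict: str


def classify_two_pass(error_diff: str, llm: LLM) -> Classification:
    # Pass 1 (hard): analyze the error diff and explain it, without a verdict.
    reasoning = llm(
        "Explain what changed in these type-checker errors and why. "
        "Do not state a verdict.\n\n" + error_diff
    )
    # Pass 2 (easy): label the reasoning only. The raw diff is not resent,
    # so the verdict is derived from the written explanation.
    reply = llm(
        "Read the analysis below and answer with one word: "
        + ", ".join(VERDICTS) + ".\n\n" + reasoning
    )
    verdict = reply.strip().lower()
    if verdict not in VERDICTS:
        verdict = "neutral"  # assumed fallback for unparseable replies
    return Classification(reasoning=reasoning, verdict=verdict)
```

Because pass 2 only sees the pass 1 text, a mismatch between verdict and reasoning would have to come from the labeling step alone, which is the easier of the two tasks.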

Also adds --pyrefly-diff CLI flag to include the pyrefly PR code
diff in each LLM call, enabling per-project attribution of which
code change caused errors to appear or disappear.
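
As a rough illustration of how such a flag could feed the prompt: the flag name `--pyrefly-diff` comes from the commit message, but treating its value as a file path, the `build_prompt` helper, and the section headings are assumptions.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Classify primer results (sketch)")
    # Flag name is from the PR; taking a file path as its value is an assumption.
    parser.add_argument(
        "--pyrefly-diff",
        metavar="PATH",
        default=None,
        help="file containing the pyrefly PR code diff",
    )
    return parser


def build_prompt(project_errors: str, pr_diff: Optional[str]) -> str:
    # Hypothetical helper: the PR diff is appended only when it was provided.
    sections = ["## Error changes for this project", project_errors]
    if pr_diff:
        sections += [
            "## Pyrefly PR diff",
            pr_diff,
            "Attribute each error change to the code change that caused it.",
        ]
    return "\n\n".join(sections)
```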
@meta-cla meta-cla bot added the cla signed label Feb 24, 2026
@migeed-z migeed-z marked this pull request as draft February 24, 2026 22:35
@migeed-z migeed-z changed the title Two-pass LLM classification with PR diff attribution for primer class… [Pyrefly][Github actions] Add Two-pass LLM classification with PR diff attribution for primer classification Feb 24, 2026
@meta-codesync

meta-codesync bot commented Feb 24, 2026

@migeed-z has imported this pull request. If you are a Meta employee, you can view this in D94280120.

Restructure format_markdown() to show an overview table with linked
function names and file paths, collapsible detailed analysis, and a
suggested fix section. Add helpers for function-name linkification
and root cause extraction from PR attribution text.
Add --suggest CLI flag, Suggestion/SuggestionResult dataclasses, and
generate_suggestions() LLM client that produces actionable source code
fix suggestions from classification results and the PR diff.
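
The overview table and collapsible detailed analysis mentioned in the format_markdown() commit above could be rendered roughly as follows. This is a sketch only: both helper names and the column set are hypothetical, and the real format_markdown() may be organized differently.

```python
from typing import List, Tuple


def format_overview_table(rows: List[Tuple[str, str, str]]) -> str:
    # Each row is (project, error kinds, root cause); the columns are assumed.
    lines = ["| Project | Error Kinds | Root Cause |", "| --- | --- | --- |"]
    for project, kinds, cause in rows:
        lines.append(f"| {project} | {kinds} | {cause} |")
    return "\n".join(lines)


def format_details(project: str, analysis: str) -> str:
    # GitHub renders <details> as a collapsible section; the blank lines
    # around the body let Markdown inside it render normally.
    return f"<details>\n<summary>{project}</summary>\n\n{analysis}\n\n</details>"
```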
@migeed-z migeed-z force-pushed the two-pass-llm-classification branch 4 times, most recently from b97a4f2 to 60290de Compare February 26, 2026 01:02
Use a stricter regex (_INTERNAL_FUNCTION_PATTERN) that requires
underscores to distinguish pyrefly internal function names like
check_for_imported_final_reassignment() from common Python method
names like get(), match(), set() that appear in error messages.
@migeed-z migeed-z force-pushed the two-pass-llm-classification branch from 60290de to 2ea9d73 Compare February 26, 2026 01:10
Asks reviewers to react with 👍 or 👎 so we can track
classifier accuracy over time.
…e display

- Collect error_kinds from both project.added and project.removed so
  improvement-only diffs (all removals) still populate the Error Kinds column
- Add file path fallback in _extract_root_cause when no function name is found
- Add high-level summary paragraph aggregating patterns across projects
- Rename table header from "Error Kind" to "Error Kinds"
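
A sketch of the first bullet, collecting kinds from both lists. The attribute names `added`, `removed`, and `error_kinds` come from the commit message; the `ErrorEntry` and `ProjectDiff` dataclasses and the `collect_error_kinds` helper are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ErrorEntry:
    kind: str
    message: str


@dataclass
class ProjectDiff:
    name: str
    added: List[ErrorEntry] = field(default_factory=list)
    removed: List[ErrorEntry] = field(default_factory=list)


def collect_error_kinds(project: ProjectDiff) -> List[str]:
    # Union over added and removed errors, so a diff made up only of
    # removals still yields a non-empty Error Kinds column.
    return sorted({entry.kind for entry in project.added + project.removed})
```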
Contributor

@yangdanny97 yangdanny97 left a comment

Review automatically exported from Phabricator review in Meta.

@meta-codesync

meta-codesync bot commented Feb 26, 2026

@migeed-z merged this pull request in ad25a9a.

3 participants