feat(skills): Add codegraph skill for PR review risk analysis and conflict detection #3910
@BingqingLyu Thanks for your contribution! I checked out your branch and tested this skill, but it is currently stuck downloading from the Hugging Face mirror.
| def flush(self): | ||
| original_stdout.flush() | ||
| fout.flush() | ||
| @ |
[Critical] The stdout Tee pattern snippet includes a standalone @, so anyone copying this example will get a Python SyntaxError before the analysis runs.
| @ | |
| sys.stdout = Tee() |
— gpt-5.5 via Qwen Code /review
Fixed — the stray @ has been removed from the stdout Tee pattern snippet.
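For reference, a minimal sketch of the stdout Tee pattern this thread discusses. Only `flush`, `original_stdout`, `fout`, and `sys.stdout = Tee()` appear in the quoted snippet; the constructor, `write`, and the `run_with_tee` helper are assumptions added here so the example runs on its own.

```python
import io
import sys


class Tee:
    """Duplicate every write to several text streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


def run_with_tee(fout, func):
    """Run func() with sys.stdout mirrored into fout, then restore stdout."""
    original_stdout = sys.stdout
    sys.stdout = Tee(original_stdout, fout)
    try:
        return func()
    finally:
        # Always restore, even if func() raises.
        sys.stdout = original_stdout


if __name__ == "__main__":
    report = io.StringIO()
    run_with_tee(report, lambda: print("PR analysis output"))
```

Restoring `sys.stdout` in a `finally` block keeps a failed analysis run from leaving the interpreter with a redirected stdout.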
| ```cypher | ||
| -- usecase1: PR → hunk functions | ||
| MATCH (pr:PR) | ||
| OPTIONAL MATCH (pr)-[r:CHANGES]->(f:Function) |
[Critical] These copy-paste Cypher examples use OPTIONAL MATCH, but the schema reference added in this PR says the pip package does not support OPTIONAL MATCH. Users following this PR-analysis workflow against the documented package will hit query syntax failures. Please rewrite these examples with supported MATCH patterns, or clearly mark them as unsupported pseudocode and provide runnable alternatives.
— gpt-5.5 via Qwen Code /review
Acknowledged. OPTIONAL MATCH is now supported in the current pip package version — I've removed the outdated limitation from schema.md so the examples are consistent with the released package.
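To make the stakes of this thread concrete, the plain-Python sketch below mimics the rows the two clauses would return. It does not call codegraph-ai or any graph database; the PR ids and function names are made-up sample data.

```python
# Hypothetical in-memory data: PR ids and (pr, function) CHANGES edges.
prs = ["101", "102", "103"]
changes = [("101", "parse_config"), ("101", "load_index"), ("103", "render")]


def match(prs, changes):
    """Like MATCH (pr:PR)-[:CHANGES]->(f:Function): PRs without edges vanish."""
    return [(pr, fn) for pr in prs for (src, fn) in changes if src == pr]


def optional_match(prs, changes):
    """Like MATCH (pr:PR) OPTIONAL MATCH (pr)-[:CHANGES]->(f): keep every PR."""
    rows = []
    for pr in prs:
        functions = [fn for (src, fn) in changes if src == pr]
        if functions:
            rows.extend((pr, fn) for fn in functions)
        else:
            rows.append((pr, None))  # corresponds to null in Cypher
    return rows
```

In this sample, PR 102 changes no indexed function, so it appears only in the `optional_match` result. A PR report built on plain `MATCH` would silently omit such PRs.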
| The workflow covers: fetching all open PRs via GitHub API → fetching branches → building the code graph index → computing multi-dimensional risk signals per PR → writing a prioritized report to a file. | ||
| For the decision framework on how to interpret results and prioritize PRs, see `PRReview.md` in the project root. |
[Suggestion] This points users to PRReview.md, but that file is not present in the project. The PR-analysis workflow depends on this decision framework for interpreting risk levels, so this becomes a dead reference for users. Please add the missing document, update the reference to an existing file, or inline the framework here.
— gpt-5.5 via Qwen Code /review
Fixed — the dead reference to PRReview.md has been removed. The risk level thresholds and decision framework are already documented in the "Risk level thresholds" section of this same file.
| list(cs.conn.execute("MATCH (c:Commit) WHERE c.version_tag = 'bf' RETURN count(c)")) | ||
| ``` | ||
| If no commits exist, evolution methods will return empty results — guide the user to run `codegraph ingest` first. If commits exist but aren't backfilled, `TOUCHES` (file-level) queries still work but `MODIFIES` (function-level) queries won't. |
[Suggestion] This recovery path tells users to run codegraph ingest, but the new skill docs otherwise document commit ingestion through codegraph init --repo ... --commits ..., and this PR does not document an ingest command. That makes the troubleshooting path likely to send users to an unverified or stale command. Please switch this to the documented init/backfill command, or add verified codegraph ingest documentation if that command is valid.
— gpt-5.5 via Qwen Code /review
Fixed — I've now documented the codegraph ingest command in the Getting Started section (with --commits and --backfill-limit flags), so the troubleshooting reference is no longer stale.
| @@ -0,0 +1,632 @@ | |||
| --- | |||
| name: codegraph-qa | |||
[Suggestion] The skill lives under .qwen/skills/codegraph/ and the PR advertises a codegraph skill, but the frontmatter name is codegraph-qa and the eval file uses the same skill_name. This mismatch makes the skill harder to discover/invoke under the advertised name and can validate the wrong routing name in evals. Please make the skill name consistent with the advertised codegraph name, or update the docs/PR to use codegraph-qa consistently.
— gpt-5.5 via Qwen Code /review
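A naming mismatch like this one could be caught mechanically. The sketch below compares the skill directory name with the frontmatter `name` and the eval file's `skill_name`. It assumes `skill_name` is a top-level key of a JSON object in `evals.json`, which this thread does not confirm, and it parses only simple `key: value` frontmatter lines.

```python
import json
import re


def frontmatter_name(skill_md_text):
    """Return the `name:` value from a leading '---' frontmatter block, or None."""
    match = re.match(r"\A---\s*\n(.*?)\n---\s*(\n|\Z)", skill_md_text, re.DOTALL)
    if not match:
        return None
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "name":
            return value.strip().strip("'\"")
    return None


def check_skill_names(dir_name, skill_md_text, evals_json_text):
    """List mismatches between directory name, frontmatter name, and evals skill_name."""
    problems = []
    name = frontmatter_name(skill_md_text)
    if name != dir_name:
        problems.append(f"frontmatter name {name!r} != directory {dir_name!r}")
    # Assumption: evals.json is an object with a top-level "skill_name" key.
    eval_name = json.loads(evals_json_text).get("skill_name")
    if eval_name != dir_name:
        problems.append(f"evals skill_name {eval_name!r} != directory {dir_name!r}")
    return problems
```

An empty list means the three names agree; anything else can be printed or used to fail a CI step.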
| @@ -0,0 +1,632 @@ | |||
| --- | |||
| name: codegraph-qa | |||
| description: Analyze indexed codebases via graph database (neug) and vector index (zvec). Covers call graphs, dependencies, dead code, hotspots, module coupling, architecture reports, semantic search, impact analysis, bug root cause from GitHub issues, class diagrams (UML), and PR review (risk scoring, conflict detection, auto-merge candidates, labeling). Use for: code structure, who calls what, why something changed, similar functions, module boundaries, bug tracing, class relationships, PR risk/conflicts, or any question benefiting from a code knowledge graph. Applies when a `.codegraph` index exists in the workspace. | |||
[Suggestion] The frontmatter says this applies when a .codegraph index already exists, but the skill body also covers first-time index creation and index management. A user asking to create the initial index in a repo without .codegraph may not route to this skill even though the body contains the intended instructions. Please broaden the description to include creating, inspecting, repairing, or using a CodeGraph/CodeScope index.
— gpt-5.5 via Qwen Code /review
Fixed — the description now explicitly covers creating, inspecting, and repairing a CodeScope index, and the trigger condition includes repos where the user wants to create one (not just existing .codegraph directories).
| DB_DIR = '/path/to/repo/.codegraph' # adjust to your repo | ||
| cs = CodeScope(DB_DIR) | ||
| cross = CrossPRAnalyzer(cs) |
[Suggestion] Please verify this constructor call against the released codegraph-ai API. If CrossPRAnalyzer requires a repository path, this example will fail with a missing repo_dir argument before any PR graph insertion can run. In that case, pass the repo path explicitly, for example CrossPRAnalyzer(cs, repo_dir='/path/to/repo').
— gpt-5.5 via Qwen Code /review
Fixed — both CrossPRAnalyzer examples now pass repo_dir=REPO_DIR explicitly.
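One way to do the verification the reviewer asks for is to inspect the constructor signature of the installed class with the standard library's `inspect` module. The sketch below uses a hypothetical stand-in class, because the real `CrossPRAnalyzer` signature is not shown in this thread. In practice the real class would be imported and passed in instead.

```python
import inspect


def required_parameters(cls):
    """Names of constructor parameters that have no default value."""
    skip = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in inspect.signature(cls).parameters.items()
        if param.default is inspect.Parameter.empty and param.kind not in skip
    ]


class FakeCrossPRAnalyzer:
    """Hypothetical stand-in; not the real codegraph-ai class."""

    def __init__(self, cs, repo_dir):
        self.cs = cs
        self.repo_dir = repo_dir


if __name__ == "__main__":
    print(required_parameters(FakeCrossPRAnalyzer))
```

If `repo_dir` shows up in the returned list for the real class, every documented example has to pass it.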
| codegraph explore --db .codegraph --type pr-review --role reviewer | ||
| # Query a specific PR's conflicts via Cypher: | ||
| codegraph query "MATCH (pr1:PR {id: '439'})-[c1:CHANGES {info: 'hunk'}]->(f:Function)<-[c2:CHANGES {info: 'hunk'}]-(pr2:PR) WHERE pr1.id <> pr2.id RETURN pr2.id, f.name, f.file_path" |
[Suggestion] Please verify that codegraph query is actually a raw-Cypher executor. If the released CLI treats query as a natural-language QA entry point, this command will not reliably execute the intended conflict query and users may get retrieval output instead of graph rows. If raw Cypher is only available through Python, replace this with a cs.conn.execute(...) snippet or document the correct raw-query interface.
— gpt-5.5 via Qwen Code /review
Fixed — codegraph query is indeed a natural-language QA entry point, not a raw Cypher executor. The routing table and follow-up exploration section have been updated to use the PRReview Python API (conflict_prs_of(), cs.conn.execute()) for programmatic Cypher queries instead.
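A minimal sketch of running the conflict query from Python rather than through `codegraph query`. It reuses the Cypher text quoted above and the `list(cs.conn.execute(...))` call shape shown elsewhere in this PR. The `conflicting_rows` helper and the `FakeConn` stand-in are hypothetical, and whether the real connection supports query parameters is not stated here, so the PR id is validated and interpolated instead.

```python
CONFLICT_QUERY = (
    "MATCH (pr1:PR {{id: '{pr_id}'}})-[c1:CHANGES {{info: 'hunk'}}]->(f:Function)"
    "<-[c2:CHANGES {{info: 'hunk'}}]-(pr2:PR) "
    "WHERE pr1.id <> pr2.id "
    "RETURN pr2.id, f.name, f.file_path"
)


def conflicting_rows(conn, pr_id):
    """Run the hunk-overlap query for one PR and return the rows as a list."""
    pr_id = str(pr_id)
    if not pr_id.isdigit():  # the id is interpolated, so refuse anything odd
        raise ValueError(f"unexpected PR id: {pr_id!r}")
    return list(conn.execute(CONFLICT_QUERY.format(pr_id=pr_id)))


class FakeConn:
    """Hypothetical stand-in for cs.conn that records the query it receives."""

    def __init__(self, rows):
        self.rows = rows
        self.last_query = None

    def execute(self, query):
        self.last_query = query
        return iter(self.rows)
```

With a real index, `conn` would be `cs.conn` from an opened `CodeScope`; the fake connection only makes the sketch runnable without one.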
901afa1 to ad8336a
- Rename skill from codegraph-qa to codegraph for consistency
- Broaden description to cover index creation and inspection
- Add HuggingFace/ModelScope model download tips
- Document codegraph ingest command for adding git history
- Fix stray '@' in stdout Tee pattern code snippet
- Remove dead reference to PRReview.md
- Add repo_dir parameter to CrossPRAnalyzer examples
- Update CHANGES edge types (hunk/deleted/related/new) and resolve_pr_functions internals
- Add Conflict Detection Dimensions table
- Replace codegraph query with PRReview Python API examples
- Remove OPTIONAL MATCH limitation from schema (now supported)
- Update codegraph-ai link to PyPI
- Add HF model download entry to Troubleshooting table
…flict detection
Add .qwen/skills/codegraph/ with PR analysis, bug analysis, schema, patterns, and eval support. Enables per-PR risk scoring (blast radius, interface changes, test coverage), cross-PR conflict detection, and automated GitHub labeling via the codegraph-ai pip package.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…el downloads
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
fe243d2 to 2daa6ff
Just for the Qwen Code project to help classify PRs. Also fixed a few things from the reviews.
…flict detection (#3910)
* feat(skills): add codegraph skill for PR review risk analysis and conflict detection
  Add .qwen/skills/codegraph/ with PR analysis, bug analysis, schema, patterns, and eval support. Enables per-PR risk scoring (blast radius, interface changes, test coverage), cross-PR conflict detection, and automated GitHub labeling via the codegraph-ai pip package.
* chore: ignore .venv and .codegraph directories
  Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* docs(skills): add HF mirror and ModelScope fallback for embedding model downloads
  Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
* fix(skills): address PR #3910 review feedback for codegraph skill
  - Rename skill from codegraph-qa to codegraph for consistency
  - Broaden description to cover index creation and inspection
  - Add HuggingFace/ModelScope model download tips
  - Document codegraph ingest command for adding git history
  - Fix stray '@' in stdout Tee pattern code snippet
  - Remove dead reference to PRReview.md
  - Add repo_dir parameter to CrossPRAnalyzer examples
  - Update CHANGES edge types (hunk/deleted/related/new) and resolve_pr_functions internals
  - Add Conflict Detection Dimensions table
  - Replace codegraph query with PRReview Python API examples
  - Remove OPTIONAL MATCH limitation from schema (now supported)
  - Update codegraph-ai link to PyPI
  - Add HF model download entry to Troubleshooting table
---------
Co-authored-by: pomelo-nwu <czynwu@outlook.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>


Related Issues
Closes #3906
What does this PR do?
Introduces a new Qoder skill, `codegraph`, that enables automated PR risk assessment and cross-PR conflict detection using the `codegraph-ai` pip package. This helps maintainers and reviewers prioritize open PRs and identify hidden semantic conflicts between them.
What changes in this PR?
New Files in `.qwen/skills/codegraph/`
- `SKILL.md`
- `pr-analysis.md`
- `bug-analysis.md`
- `patterns.md`
- `schema.md`
- `evals/evals.json`
Key Capabilities
- Per-PR Risk Scoring: Computes multi-dimensional risk scores (blast radius, interface changes, test coverage gaps, config modifications, dead code) with levels: CRITICAL / HIGH / MEDIUM / LOW.
- Cross-PR Conflict Detection: Identifies PRs that modify the same functions or share call paths via graph connected-component analysis.
- Automated Labeling: Applies GitHub labels (`auto-merge-candidate`, `independent-review`, `conflicting-pr`, `conflicting-group-N`) based on analysis results.
Usage
Prerequisites
- `codegraph-ai` pip package (installed in an isolated venv)
- `gh auth login` for GitHub API access
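The connected-component grouping mentioned under Cross-PR Conflict Detection can be sketched with a small union-find. This is an illustration and not the codegraph-ai implementation: it covers only the "modify the same functions" case (not shared call paths), and the input shape, a mapping from PR id to the set of functions it touches, is an assumption.

```python
from collections import defaultdict


def conflict_groups(pr_functions):
    """Group PRs that share at least one function, directly or transitively."""
    parent = {pr: pr for pr in pr_functions}

    def find(pr):
        while parent[pr] != pr:
            parent[pr] = parent[parent[pr]]  # path halving
            pr = parent[pr]
        return pr

    first_pr_for = {}  # function name -> first PR seen touching it
    for pr, functions in pr_functions.items():
        for fn in functions:
            if fn in first_pr_for:
                # Same function touched by two PRs: merge their components.
                parent[find(pr)] = find(first_pr_for[fn])
            else:
                first_pr_for[fn] = pr

    groups = defaultdict(list)
    for pr in pr_functions:
        groups[find(pr)].append(pr)
    # Only components with more than one PR represent a conflict.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

Each returned group could then map to one `conflicting-group-N` label, while PRs outside every group would be candidates for the other labels. That mapping is a guess at the intent and is not taken from the skill's code.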