feat: enhance GitHub PR reading with code context and suggestions by TimeToBuildBob · Pull Request #689 · gptme/gptme

TimeToBuildBob · 2025-10-13T09:50:43Z

Implements all features from #688 to enhance GitHub PR reading capability.

Implemented Features

1. Code Context Snippets

Extracts diff_hunk from review comments via GitHub API
Displays 3-10 lines of code context around commented lines
Removes diff markers (@@, +, -) for clean display
Shows file path and line number references

2. Code Suggestions

Detects and extracts ```suggestion blocks from comment bodies
Formats suggestions with proper syntax highlighting
Displays as "Suggested change:" with code block

3. Better Organization (Already Exists)

Filters resolved comments automatically via GraphQL
Shows GitHub Actions status with emoji indicators
Groups unresolved comments clearly

Testing

Replaced all mocking with real GitHub API tests:

Tests against actual PR feat: implement basic lesson system with metadata and auto-inclusion #687 from gptme/gptme
Validates code context extraction
Validates suggestion extraction
All 5 tests pass with real API calls

Example Output

Review comments now show:

@Reviewer on test.py:10:
Please fix this

Referenced code in test.py:10:
Context:

def example():
    old_code
    new_code

Suggested change:

def example():
    fixed_code

Closes #688

Important

Enhances GitHub PR reading by adding code context and suggestion extraction, with comprehensive testing.

Features:
- Extracts diff_hunk from review comments via GitHub API in gh.py.
- Displays 3-10 lines of code context around commented lines, removing diff markers.
- Detects and extracts suggestion blocks from comment bodies, formatting them with syntax highlighting.
Functions:
- Adds _wait_for_checks() in gh.py to wait for GitHub Actions checks to complete.
- Implements execute_gh() in gh.py to handle pr view and pr checks commands.
Testing:
- Replaces mocking with real GitHub API tests in test_util_gh.py.
- Adds comprehensive mocked tests in test_util_gh_mocked.py for code context and suggestion extraction.
Misc:
- Updates instructions and examples in gh.py to reflect new functionality.

^{This description was created by}^{for 2097971. You can customize this summary. It will automatically update as commits are pushed.}

Enhances get_github_pr_content() to include code context from diff_hunk in review comments, making it easier to understand what code is being commented on. - Extract diff_hunk from review comment API responses - Format and display code context with syntax highlighting - Remove diff markers (@@, +, -) for cleaner display - Add comprehensive tests with mocked gh CLI responses Closes #688

- Extract and display code suggestions from review comment bodies - Rewrite tests to use real GitHub API calls instead of mocking - Test against actual PR #687 with review comments and suggestions - Remove mock dependencies from tests All tests pass with real API validation.

ellipsis-dev

Caution

Changes requested ❌

Reviewed everything up to af48db0 in 1 minute and 53 seconds. Click for details.

Reviewed 151 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 2 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/util/gh.py:349

Draft comment:
The suggestion extraction logic works but could be improved for clarity and robustness. For example, if a suggestion block is not properly closed, it will be silently ignored. Consider adding documentation here or using a regular expression to extract suggestion blocks more robustly, ensuring that edge cases (like unclosed suggestion fences) are handled gracefully.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment raises a valid point about edge cases in parsing suggestion blocks. The current implementation could silently fail if a suggestion block isn't properly closed. However, this is new code that handles a relatively simple parsing task, and the current implementation is clear and straightforward. The suggested improvements feel like premature optimization. The comment identifies a real edge case that could cause issues. Silent failures in parsing could lead to confusing behavior. While the edge case is real, the current implementation is simple and maintainable. GitHub's markdown is well-structured, making unclosed blocks rare, and the failure mode (silently skipping) is acceptable for this use case. The comment should be deleted as it suggests complexity for edge cases that are unlikely to occur, and the current implementation is sufficiently robust for its purpose.

2. tests/test_util_gh.py:41

Draft comment:
The integration tests in this file rely on live GitHub PR data. While this confirms end-to-end functionality, it may lead to flaky or slower CI runs. Consider adding mocked tests or a fallback mechanism to provide more stable and faster test execution in environments with network constraints.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the suggestion about mocked tests is reasonable, the current implementation already has mitigations in place: @pytest.mark.slow decorator for CI optimization and skip mechanism for network issues. The tests are specifically designed to test real GitHub API responses, and mocks might hide real integration issues. The current implementation seems intentional and well-thought-out. The comment raises valid concerns about test stability and CI performance. Mocked tests could provide faster feedback for basic functionality. The existing implementation already addresses these concerns through pytest markers and skip mechanisms. The real API tests are valuable for catching integration issues. Delete the comment. The current implementation is intentionally testing real GitHub API responses with appropriate safeguards in place.

Workflow ID: wflow_zCJaTyoqbyJuJCzS

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev · 2025-10-13T09:52:40Z

gptme/util/gh.py

+                        # Add code context if available
+                        if diff_hunk:
+                            content += f"\nReferenced code in {path}:{line}:\n"
+                            content += "Context:\n```" + path.split(".")[-1] + "\n"


In the code context snippet block, consider handling cases where the file path does not include a standard extension. Currently, the language hint for the syntax highlighting is derived using 'path.split(".")[-1]', which might not produce the desired language identifier (e.g. for files like 'Makefile'). Consider providing a default language or additional logic to determine the correct language.

Suggested change

content += "Context:\n```" + path.split(".")[-1] + "\n"

content += "Context:\n```" + (path.split(".")[-1] if "." in path else "text") + "\n"

codecov · 2025-10-13T09:54:50Z

Codecov Report

❌ Patch coverage is 40.00000% with 84 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
gptme/tools/gh.py	19.80%	81 Missing ⚠️
gptme/util/gh.py	92.30%	3 Missing ⚠️

📢 Thoughts on this report? Let us know!

…ead_pr tool Comprehensive enhancement of GitHub PR reading capabilities: 1. Code Context Snippets (gh.py) - Extract diff_hunk from review comments - Display 3-10 lines of code context around commented lines - Remove diff markers for clean display 2. Code Suggestion Extraction (gh.py) - Detect and parse ```suggestion blocks from comment bodies - Format with syntax highlighting 3. New read_pr Tool Command (tools/gh.py) - Single command for agents to get full PR context - Syntax: gh read_pr <pr_url> - Returns all PR info: description, comments, review comments, suggestions - Proper error handling for invalid URLs 4. Comprehensive Testing - Real GitHub API tests (no mocking) - Tests against actual PR #687 - 7 tests covering all functionality - Proper type handling for ExecuteFunc Closes #688

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 803477a in 1 minute and 43 seconds. Click for details.

Reviewed 166 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 6 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/tools/gh.py:22

Draft comment:
Consider adding an explicit check for an empty URL after retrieving it from args or kwargs. Right now, if the URL is an empty string, the code will proceed to call get_github_pr_content('') and then yield an error later. A clear check (e.g. if not url:) immediately after line 25 might improve readability and avoid unnecessary processing.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The current error handling is actually quite robust. An empty URL will fail parse_github_url() and result in a clear error message. Adding an early check would just duplicate error handling. The current flow provides more specific error messages by letting the parsing functions handle validation. The comment has a point about avoiding unnecessary processing, and early validation is often good practice. However, the current error handling is already well-structured and provides better user feedback by letting the URL parsing functions handle validation. The "unnecessary processing" is negligible. The comment suggests a change that would not meaningfully improve the code and would actually reduce the quality of error messages.

2. gptme/tools/gh.py:36

Draft comment:
The error message for an invalid GitHub URL is clear, but you might consider including a brief note on the expected URL format for additional clarity. This helps reinforce the URL requirements directly in the error message.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50% None

3. gptme/tools/gh.py:43

Draft comment:
For the case where fetching PR content fails (lines 43–46), the error message advises the user to check that the 'gh' CLI is installed and authenticated. This is good; however, in the future as more commands might be added, consider including a list of available subcommands in the error message instead of hard-coding just 'read_pr'.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50% None

4. gptme/tools/gh.py:51

Draft comment:
The instructions and examples are very well documented. You might also want to include a brief note about ensuring that the gh CLI is properly authenticated to avoid confusion when the PR content fails to load.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50% None

5. tests/test_util_gh.py:42

Draft comment:
Tests that use real GitHub API calls are useful for integration testing. However, be aware that these tests may become flaky if external conditions change. Consider marking these tests clearly as integration tests or exploring ways to isolate API calls in future iterations.
Reason this comment was not posted:
Confidence changes required: 30% <= threshold 50% None

6. tests/test_util_gh.py:137

Draft comment:
The invalid URL test is well-constructed. It checks both for the presence of an error message and the expected error detail. Good job ensuring that error paths are tested appropriately.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_o1Mxae3heo0G885J

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

Makes the command more intuitive by following the standard gh CLI naming: - Old: gh read_pr <url> - New: gh pr view <url> This matches the native gh CLI command while providing enhanced output with code context, suggestions, and comprehensive review information.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 42d615f in 1 minute and 42 seconds. Click for details.

Reviewed 67 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 4 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/tools/gh.py:20

Draft comment:
The updated command check using 'pr view' is clear and aligns with the new design. Consider adding a deprecation notice or fallback for the old 'read_pr' command if backward compatibility is a concern.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This appears to be a deliberate API change to align with standard GitHub CLI commands. The old command was likely internal-only and not part of a public API. The examples and documentation are all updated to use the new format. Adding backward compatibility would add complexity for questionable benefit. I might be wrong about the API stability requirements. There could be existing users relying on the old command format. The code appears to be part of an internal tool (gptme/tools), and the change is comprehensive across docs and examples. Breaking changes are acceptable in internal tools. The comment should be deleted as it's making a speculative suggestion about backward compatibility without clear evidence that it's needed.

2. gptme/tools/gh.py:22

Draft comment:
The logic checking for a third argument (URL) using 'if len(args) > 2:' is appropriate for the new 'gh pr view ' syntax. You might consider trimming the URL to avoid issues with extra whitespace.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% The first part of the comment just acknowledges the correct implementation. The second part about trimming URLs is speculative - there's no evidence of whitespace issues in the code or tests. The URL handling seems to work fine as is, and any URL parsing issues would be caught by parse_github_url() later. Maybe there could be edge cases where users copy-paste URLs with whitespace that could cause issues? The parse_github_url() function's behavior with whitespace isn't visible here. Without evidence of actual whitespace-related bugs or seeing parse_github_url()'s implementation, this is exactly the kind of speculative "what if" comment we should avoid. Delete this comment. The first part is just acknowledging correct code, and the URL trimming suggestion is speculative without evidence of actual issues.

3. tests/test_util_gh.py:107

Draft comment:
The test 'test_gh_tool_read_pr' now correctly uses the updated command syntax ['pr', 'view', '']. This ensures consistency across the CLI usage examples.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is purely informative, as it only states that a test now uses an updated command syntax. It doesn't provide any actionable feedback or suggestions for improvement.

4. tests/test_util_gh.py:137

Draft comment:
The invalid URL test has been updated to use the 'pr view' command and expects an error message containing 'Invalid GitHub URL'. This is correct. You might also consider tests for cases where the URL argument is completely omitted.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_RtQbUKijLWt4u7q1

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

Adds a new command to wait for all GitHub Actions checks to complete: - gh pr checks <pr_url> Features: - Polls check status every 10 seconds - Shows status updates only when status changes (less spammy) - Displays counts with emoji indicators (✅ passed, ❌ failed, 🔄 in progress) - Exits when all checks complete - Shows final summary of results Much cleaner than "gh run watch" and monitors all checks at once.

…uggestions - Created 8 mocked tests covering all new functionality - Tests code context extraction from diff_hunk - Tests code suggestion extraction from comment bodies - Tests edge cases (empty diff_hunk, multiple suggestions, malformed blocks) - Tests file extension extraction for syntax highlighting - Combined with existing tests achieves 73% coverage (up from 0% patch coverage) Addresses Codecov feedback on PR #689

ellipsis-dev

Caution

Changes requested ❌

Reviewed 19840c9 in 2 minutes and 3 seconds. Click for details.

Reviewed 220 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 4 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/tools/gh.py:45

Draft comment:
The poll interval is hardcoded to 10 seconds. It might be beneficial to make this interval configurable (e.g. via an optional parameter) so that developers and users can adjust it based on their environment or testing needs.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

2. gptme/tools/gh.py:147

Draft comment:
The try/except block (lines 147–152) rightly catches exceptions from subprocess calls and JSON parsing. Consider whether a timeout or cancellation option should be added to prevent indefinite waiting if the check run service becomes unresponsive.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

3. gptme/tools/gh.py:193

Draft comment:
In the 'pr view' branch (lines 183–196), the function get_github_pr_content is used to display PR content. Please ensure that this function has been updated to extract the new code context snippets and code suggestions as described in the PR details, since these features are core to the enhancement.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

4. gptme/tools/gh.py:217

Draft comment:
The updated instructions and examples (lines 217–266) provide clear usage guidance for both the 'pr view' and the new 'pr checks' commands. This clarity should help users understand the new features effectively.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_DtCXUm5gj06VaMjr

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev · 2025-10-13T10:10:57Z

gptme/tools/gh.py

+                conclusion = run.get("conclusion")
+
+                if status == "completed":
+                    state = conclusion or "completed"


When processing check run statuses (lines 78–86), the code sets the state for completed runs using conclusion or 'completed'. Note that if conclusion is None (or another unmapped status such as 'neutral' or 'timed_out'), it defaults to 'completed', which isn’t defined in the status_counts dictionary. Consider explicitly handling such cases (for example, defaulting to 'success' or mapping additional statuses) to ensure all outcomes are correctly reflected.

Suggested change

state = conclusion or "completed"

state = conclusion or "success"

ellipsis-dev

Caution

Changes requested ❌

Reviewed 338b762 in 1 minute and 59 seconds. Click for details.

Reviewed 521 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. tests/test_util_gh_mocked.py:509

Draft comment:
In test_malformed_suggestion_block, the assertion accepting either the raw suggestion marker or the unclosed block seems ambiguous. Clarifying the expected behavior for malformed suggestion blocks would make the test more robust and the behavior more predictable.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

2. tests/test_util_gh_mocked.py:210

Draft comment:
The command matching logic in the mock_subprocess_run function relies on substring checks (e.g., '--comments' in cmd). Although functional for the tests, refactoring this logic to use more explicit flag checks might make the intent clearer and reduce potential brittleness if command formats change.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_luc51DlMdDwRPbtm

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev · 2025-10-13T10:11:47Z

tests/test_util_gh_mocked.py

+            for line in lines:
+                # Check that code lines don't start with diff markers
+                if line and not line.startswith("```"):
+                    assert not line.startswith(


In test_code_context_extraction, while the test correctly checks that '@@' markers are removed from the diff context, it would improve coverage to also assert that '+' and '-' diff markers are stripped from the output. This would ensure the context snippet is fully cleaned as per the requirements.

Suggested change

assert not line.startswith(

assert not line.startswith(("@@", "+", "-")), f"Line should not have diff marker: {line}"

- Handle files without extensions (e.g., Makefile) by defaulting to "text" lang - Map unmapped check conclusions to "success" instead of undefined "completed" Addresses review comments from ellipsis-dev bot.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 3c9390e in 1 minute and 43 seconds. Click for details.

Reviewed 29 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 3 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/tools/gh.py:82

Draft comment:
In _wait_for_checks, the mapping for completed check runs now defaults to 'success' when the conclusion is not found in status_counts. Since GitHub can return additional conclusion values (e.g., 'neutral', 'timed_out', 'action_required'), consider expanding status_counts or adjusting the mapping logic to handle these cases explicitly, ensuring consistency across the tool.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment raises a valid point about GitHub's API potentially returning other conclusion values. However, the current code's approach of defaulting to "success" seems intentional - if a check run is completed and its conclusion isn't one we explicitly track, treating it as a success is reasonable. Changing to "skipped" would be more conservative but could lead to misrepresenting actually successful checks. Without more context about the tool's requirements or evidence that this is causing issues, the current behavior seems acceptable. I might be underestimating the importance of accurately representing non-standard conclusion states. There could be cases where treating an unknown conclusion as "success" leads to false positives. While being conservative with unknown states has merit, the current code's behavior is explicit and reasonable. The comment doesn't provide strong evidence that the current approach is problematic. Delete the comment. While it raises an interesting point, it's more of a suggestion for potential improvement rather than a clear issue that needs fixing.

2. gptme/util/gh.py:335

Draft comment:
When determining the language for the code context snippet, the current approach uses path.split('.')[-1] if a dot exists in the path. This can be fragile if directory names contain dots. Consider using a more robust method (e.g., using os.path.splitext) to extract the file extension and default to 'text' when none is available.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% While the suggestion is technically correct that os.path.splitext() is more robust, this is a very low-risk context - we're just trying to get a language hint for markdown code blocks. If it fails, the worst case is incorrect syntax highlighting. The current code is simple and works for 99% of cases. The suggested change adds complexity without meaningful benefit. The comment correctly identifies a potential edge case with dots in directory names. The suggested fix would handle those cases more robustly. Given this is just for markdown syntax highlighting hints, the edge cases don't matter enough to justify the more complex code. The current simple approach is good enough for this use case. Delete this comment as it suggests a more complex solution for a simple use case where the current code is good enough and edge cases don't meaningfully impact functionality.

3. gptme/util/gh.py:350

Draft comment:
The new block for extracting and displaying code suggestions by scanning for '```suggestion' is a clear start. To improve robustness, consider using a regex-based approach to capture the suggestion blocks more reliably, handling potential extra text on the opening delimiter and multiple suggestion blocks within a single comment.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_jSkD5KNrEeIFzLag

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

Display "path:start-end" for multi-line comments instead of just "path:end" to accurately reflect the lines being commented on, matching GitHub UI. Example: gptme/tools/lessons.py:138-155 instead of :155

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 3c1485f in 49 seconds. Click for details.

Reviewed 27 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/util/gh.py:327

Draft comment:
The updated multi-line reference formatting improves clarity. Consider explicitly checking if start_line is not None (e.g. if start_line is not None and start_line != line:) instead of using a truthy check. This ensures that valid values like 0 are handled correctly and that start_line and line are compared consistently (potentially after type conversion if needed).
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_WtgNMP86eyyfY5Kq

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

Change type from 'str' to 'string' to match JSON schema specification. Fixes OpenAI API schema validation error.

Skip test_gh_tool_read_pr when gh CLI is not authenticated, which is common in CI environments. This prevents false failures when gh is installed but not configured with credentials.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed d8e6f19 in 54 seconds. Click for details.

Reviewed 13 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/tools/gh.py:280

Draft comment:
Change the url parameter type from str to string. This update improves consistency with our tool parameter conventions. Ensure that using 'string' aligns with how type descriptors are handled elsewhere in the project and won't affect components expecting Python's built-in str.
Reason this comment was not posted:
Confidence changes required: 0% <= threshold 50% None

Workflow ID: wflow_IMivVsXssDYX6xH5

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev

Important

Looks good to me! 👍

Reviewed c74ceeb in 42 seconds. Click for details.

Reviewed 19 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. tests/test_util_gh.py:121

Draft comment:
The newly added conditional check to skip the test when the gh CLI isn’t authenticated is a practical solution for CI environments. However, relying on specific error message substrings ("Failed to fetch PR content" and "Make sure 'gh' CLI is installed") may be fragile if these messages ever change. Consider using a more robust mechanism (for example, checking an environment variable or a dedicated error code) to determine CLI authentication status for better long-term maintainability.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 50% None

Workflow ID: wflow_jNSxDe4HZSkir6Lf

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

GitHub API returns null for 'line' and 'start_line' when comments are on old commits. Use 'original_line' and 'original_start_line' as fallbacks to show the line numbers where the comment was originally made.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 2097971 in 1 minute and 58 seconds. Click for details.

Reviewed 30 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 2 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. gptme/util/gh.py:323

Draft comment:
Improvement: The updated code now correctly falls back to 'original_line' and 'original_start_line' when 'line' or 'start_line' are missing, which is a good enhancement. Consider ensuring that these values are of the expected type (e.g. int) and not inadvertently falsy (for example, a valid line number 0) when using truthy checks.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50% None

2. gptme/util/gh.py:334

Draft comment:
Suggestion: The conditional now checks 'if line and start_line and start_line != line' which improves handling for multi-line comments. It might be clearer to explicitly check for None (e.g. 'if line is not None and start_line is not None') so that valid numeric values like 0 aren’t misinterpreted as missing.
Reason this comment was not posted:
Confidence changes required: 50% <= threshold 50% None

Workflow ID: wflow_l6wolROF7Amsp4ed

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

openhands-ai · 2025-10-13T10:37:31Z

Looks like there are a few issues preventing this PR from being merged!

GitHub Actions are failing:
- Test

If you'd like me to help, just leave a comment, like

@OpenHands please fix the failing actions on PR #689 at branch `feat/gh-pr-code-context`

Feel free to include any additional details that might help me get this PR into a better state.

_{^{You can manage your notification settings}}

…tme#689) * feat: add code context snippets to GitHub PR review comments Enhances get_github_pr_content() to include code context from diff_hunk in review comments, making it easier to understand what code is being commented on. - Extract diff_hunk from review comment API responses - Format and display code context with syntax highlighting - Remove diff markers (@@, +, -) for cleaner display - Add comprehensive tests with mocked gh CLI responses Closes gptme#688 * feat: add code suggestion extraction and real API tests - Extract and display code suggestions from review comment bodies - Rewrite tests to use real GitHub API calls instead of mocking - Test against actual PR gptme#687 with review comments and suggestions - Remove mock dependencies from tests All tests pass with real API validation. * feat: enhance GitHub PR reading with code context, suggestions, and read_pr tool Comprehensive enhancement of GitHub PR reading capabilities: 1. Code Context Snippets (gh.py) - Extract diff_hunk from review comments - Display 3-10 lines of code context around commented lines - Remove diff markers for clean display 2. Code Suggestion Extraction (gh.py) - Detect and parse ```suggestion blocks from comment bodies - Format with syntax highlighting 3. New read_pr Tool Command (tools/gh.py) - Single command for agents to get full PR context - Syntax: gh read_pr <pr_url> - Returns all PR info: description, comments, review comments, suggestions - Proper error handling for invalid URLs 4. Comprehensive Testing - Real GitHub API tests (no mocking) - Tests against actual PR gptme#687 - 7 tests covering all functionality - Proper type handling for ExecuteFunc Closes gptme#688 * refactor: rename read_pr to "gh pr view" to match gh CLI convention Makes the command more intuitive by following the standard gh CLI naming: - Old: gh read_pr <url> - New: gh pr view <url> This matches the native gh CLI command while providing enhanced output with code context, suggestions, and comprehensive review information. * feat: add "gh pr checks" to wait for CI completion Adds a new command to wait for all GitHub Actions checks to complete: - gh pr checks <pr_url> Features: - Polls check status every 10 seconds - Shows status updates only when status changes (less spammy) - Displays counts with emoji indicators (✅ passed, ❌ failed, 🔄 in progress) - Exits when all checks complete - Shows final summary of results Much cleaner than "gh run watch" and monitors all checks at once. * test: add comprehensive mocked tests for GitHub PR code context and suggestions - Created 8 mocked tests covering all new functionality - Tests code context extraction from diff_hunk - Tests code suggestion extraction from comment bodies - Tests edge cases (empty diff_hunk, multiple suggestions, malformed blocks) - Tests file extension extraction for syntax highlighting - Combined with existing tests achieves 73% coverage (up from 0% patch coverage) Addresses Codecov feedback on PR gptme#689 * fix: address review comments from ellipsis-dev - Handle files without extensions (e.g., Makefile) by defaulting to "text" lang - Map unmapped check conclusions to "success" instead of undefined "completed" Addresses review comments from ellipsis-dev bot. * fix: show line ranges for multi-line review comments Display "path:start-end" for multi-line comments instead of just "path:end" to accurately reflect the lines being commented on, matching GitHub UI. Example: gptme/tools/lessons.py:138-155 instead of :155 * fix: use correct JSON schema type for gh tool parameter Change type from 'str' to 'string' to match JSON schema specification. Fixes OpenAI API schema validation error. * fix: handle gh CLI authentication errors in tests Skip test_gh_tool_read_pr when gh CLI is not authenticated, which is common in CI environments. This prevents false failures when gh is installed but not configured with credentials. * fix: use original_line as fallback for review comment line numbers GitHub API returns null for 'line' and 'start_line' when comments are on old commits. Use 'original_line' and 'original_start_line' as fallbacks to show the line numbers where the comment was originally made. --------- Co-authored-by: Erik Bjäreholt <erik@bjareho.lt>

ErikBjare added 2 commits October 13, 2025 11:43