Unwrap 72-character wrapped commit message lines when creating PRs#8255
Unwrap 72-character wrapped commit message lines when creating PRs#8255
Conversation
Co-authored-by: alexr00 <38270282+alexr00@users.noreply.github.com>
- Add unwrapCommitMessageBody function to join wrapped lines - Preserve blank lines, list items, quotes, and indented content - Add comprehensive tests for various unwrapping scenarios - Matches GitHub's behavior when converting commit messages to PR descriptions Co-authored-by: alexr00 <38270282+alexr00@users.noreply.github.com>
- Extract regex pattern to named constant PRESERVE_LINE_PATTERN - Replace deprecated substr() with slice() - Improve regex to match only spaces/tabs explicitly instead of \s - Add detailed comments explaining the pattern Co-authored-by: alexr00 <38270282+alexr00@users.noreply.github.com>
- Place hyphen at end of character class for clarity - Extract hasBody condition to avoid duplication Co-authored-by: alexr00 <38270282+alexr00@users.noreply.github.com>
justingrant
left a comment
There was a problem hiding this comment.
Great to see this PR, but I see a few problems.
| if (!body) { | ||
| return body; | ||
| } | ||
|
|
||
| // Pattern to detect lines that should be preserved (not joined): | ||
| // - Lines starting with whitespace (indented/code blocks) | ||
| // - Lines starting with list markers (*, -, +, >) | ||
| // - Lines starting with numbered list items (e.g., "1. ") | ||
| const PRESERVE_LINE_PATTERN = /^[ \t*+>\-]|^\d+\./; | ||
|
|
||
| const lines = body.split('\n'); | ||
| const result: string[] = []; | ||
| let i = 0; | ||
|
|
||
| while (i < lines.length) { | ||
| const line = lines[i]; | ||
|
|
||
| // Preserve blank lines | ||
| if (line.trim() === '') { | ||
| result.push(line); | ||
| i++; | ||
| continue; | ||
| } | ||
|
|
||
| // Check if this line should NOT be joined with the next | ||
| // Lines that start with special formatting characters should be preserved | ||
| const shouldPreserveLine = PRESERVE_LINE_PATTERN.test(line); | ||
|
|
||
| if (shouldPreserveLine) { | ||
| result.push(line); | ||
| i++; | ||
| continue; | ||
| } | ||
|
|
||
| // Start accumulating lines that should be joined | ||
| let joinedLine = line; | ||
| i++; | ||
|
|
||
| // Keep joining lines until we hit a blank line or a line that shouldn't be joined | ||
| while (i < lines.length) { | ||
| const nextLine = lines[i]; | ||
|
|
||
| // Stop at blank lines | ||
| if (nextLine.trim() === '') { | ||
| break; | ||
| } | ||
|
|
||
| // Stop at lines that start with special formatting | ||
| if (PRESERVE_LINE_PATTERN.test(nextLine)) { | ||
| break; | ||
| } | ||
|
|
||
| // Join this line with a space | ||
| joinedLine += ' ' + nextLine; | ||
| i++; | ||
| } | ||
|
|
||
| result.push(joinedLine); | ||
| } | ||
|
|
||
| return result.join('\n'); |
There was a problem hiding this comment.
A few problems with your code:
- Having two nested loops is unnecessarily complicated.
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
Here's a revised version:
| if (!body) { | |
| return body; | |
| } | |
| // Pattern to detect lines that should be preserved (not joined): | |
| // - Lines starting with whitespace (indented/code blocks) | |
| // - Lines starting with list markers (*, -, +, >) | |
| // - Lines starting with numbered list items (e.g., "1. ") | |
| const PRESERVE_LINE_PATTERN = /^[ \t*+>\-]|^\d+\./; | |
| const lines = body.split('\n'); | |
| const result: string[] = []; | |
| let i = 0; | |
| while (i < lines.length) { | |
| const line = lines[i]; | |
| // Preserve blank lines | |
| if (line.trim() === '') { | |
| result.push(line); | |
| i++; | |
| continue; | |
| } | |
| // Check if this line should NOT be joined with the next | |
| // Lines that start with special formatting characters should be preserved | |
| const shouldPreserveLine = PRESERVE_LINE_PATTERN.test(line); | |
| if (shouldPreserveLine) { | |
| result.push(line); | |
| i++; | |
| continue; | |
| } | |
| // Start accumulating lines that should be joined | |
| let joinedLine = line; | |
| i++; | |
| // Keep joining lines until we hit a blank line or a line that shouldn't be joined | |
| while (i < lines.length) { | |
| const nextLine = lines[i]; | |
| // Stop at blank lines | |
| if (nextLine.trim() === '') { | |
| break; | |
| } | |
| // Stop at lines that start with special formatting | |
| if (PRESERVE_LINE_PATTERN.test(nextLine)) { | |
| break; | |
| } | |
| // Join this line with a space | |
| joinedLine += ' ' + nextLine; | |
| i++; | |
| } | |
| result.push(joinedLine); | |
| } | |
| return result.join('\n'); | |
| const rawLines = body.split(/\r?\n/); | |
| const result: string[] = []; | |
| let inFencedCodeBlock = false; | |
| let prevWasCode = false; | |
| for (const rawLine of rawLines) { | |
| const line = rawLine.trimEnd(); // Remove trailing space | |
| const trimmed = line.trim(); | |
| const prev = result[result.length - 1]; | |
| const isList = (s: string) => /^\s*[>\*\-\+]\s/.test(s); | |
| const indentSpaces = line.match(/^( *)/)?.[1].length ?? 0; | |
| const isListItemContinuation: boolean = | |
| !prevWasCode && isList(prev) && indentSpaces >= 2; | |
| const isIndentedCode: boolean = | |
| !isListItemContinuation && (indentSpaces >= 4 || line.startsWith('\t')); | |
| const isCode: boolean = inFencedCodeBlock || isIndentedCode; | |
| if (line.startsWith('```')) { | |
| // Start or end of fenced code block is never joined | |
| inFencedCodeBlock = !inFencedCodeBlock; | |
| result.push(line); | |
| } else if (!line || isCode || isList(line) || !prev || !result.length) { | |
| // Empty/code/list lines never join with the previous line. | |
| // Also don't join the first line, or if previous line is blank. | |
| result.push(line); | |
| } else { | |
| // Otherwise, join with previous line | |
| result[result.length - 1] = prev + ' ' + trimmed; | |
| } | |
| prevWasCode = isCode; | |
| } | |
| return result.join('\n'); | |
alexr00
left a comment
There was a problem hiding this comment.
@copilot These notes from @justingrant are good, address them:
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
- Support fenced code blocks with ``` markers - Fix indented code detection to distinguish between: - Actual code blocks (4+ spaces, not in list context) - List item continuations (2+ spaces in list context) - Nested list items (detected by list markers with indentation) - Track list context state to handle list continuations correctly - Add comprehensive tests for all new scenarios Co-authored-by: alexr00 <38270282+alexr00@users.noreply.github.com>
Addressed both issues in commit 1072b95:
Added 8 new tests covering fenced code blocks, nested lists, list continuations, and complex mixed content scenarios. |
alexr00
left a comment
There was a problem hiding this comment.
@justingrant, thanks for pointing these out!
- You don't support code blocks that are fenced with ```
- Detection of indented code is flawed because it doesn't account for nested list items or list item continuations that have 4+ spaces at the start of the line.
|
@alexr00 @copilot - After testing this PR, there's a problem: line breaks are not removed inside list items. For example, I'd expect the following commit markdown to end up as 2 lines in the resulting PR markdown. But it remains as 4 lines:
|
|
@justingrant thanks for trying it out! Opened #8335 to address this. |
Thanks @alexr00 for improving this! Unfortunately I think the follow-up PR might have some problems too, and may need some more tests. See #8335 (comment) |
When creating a PR from a single commit, the extension now unwraps commit message body lines that were wrapped at 72 characters, matching GitHub's native behavior.
Changes
Added
unwrapCommitMessageBodyfunction: Joins consecutive non-formatted lines while preserving:-,*,+, numbered)>)```)Updated
titleAndBodyFrom: Applies unwrapping to commit message bodies before setting PR descriptionTest coverage: Added 17 test cases for wrapped paragraphs, lists, quotes, indented blocks, fenced code blocks, nested lists, list continuations, and mixed content
Example
Before:
After:
Implementation Details
The function uses state tracking to correctly handle:
```markers is preserved without modificationOriginal prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.