Fix XML comment removal in imported workflows and add comprehensive test suite #1503
Merged
pelikhan merged 4 commits into genaiscript-2 on Oct 11, 2025
Fixed bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks. The issue was in the XML comment removal logic which was checking for code blocks before processing XML comments, causing code blocks within HTML comments to be preserved instead of removed.

Changed the logic in xml_comments.go to:
1. Check if we're in a code block first (code blocks take precedence)
2. If in a code block, preserve everything (ignore XML comment processing)
3. If NOT in a code block, process XML comments and skip lines that are within comments
4. This ensures code blocks INSIDE XML comments are removed, while XML comments INSIDE code blocks are preserved

This fix ensures that documentation in imported files (like shared/genaiscript.md and shared/simonw-llm.md) is not included in the generated prompts.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
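The precedence rules in the commit message above can be sketched roughly as follows. This is a minimal illustration written for this page, not the code from xml_comments.go: the names `removeXMLComments`, `stripComments`, `fenceMarker`, and `closesFence`, and the sample input, are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceMarker returns the leading run of three or more backticks or tildes
// on a trimmed line, or "" when the line does not start a fence.
func fenceMarker(trimmed string) string {
	for _, ch := range []string{"`", "~"} {
		n := len(trimmed) - len(strings.TrimLeft(trimmed, ch))
		if n >= 3 {
			return trimmed[:n]
		}
	}
	return ""
}

// closesFence reports whether a trimmed line closes a fence opened with open:
// same character, at least as long, and nothing after the marker.
func closesFence(trimmed, open string) bool {
	m := fenceMarker(trimmed)
	return m != "" && m[0] == open[0] && len(m) >= len(open) &&
		strings.TrimSpace(trimmed[len(m):]) == ""
}

// stripComments removes <!-- ... --> spans from one line. inComment carries
// the multi-line comment state in and out.
func stripComments(line string, inComment bool) (string, bool) {
	var kept strings.Builder
	for {
		if inComment {
			end := strings.Index(line, "-->")
			if end < 0 {
				return kept.String(), true
			}
			line, inComment = line[end+3:], false
		}
		start := strings.Index(line, "<!--")
		if start < 0 {
			kept.WriteString(line)
			return kept.String(), false
		}
		kept.WriteString(line[:start])
		line, inComment = line[start+4:], true
	}
}

// removeXMLComments (hypothetical) keeps fenced code blocks verbatim and drops
// comments, including any code fences that sit inside a comment.
func removeXMLComments(content string) string {
	var out []string
	inFence, fence, inComment := false, "", false
	for _, line := range strings.Split(content, "\n") {
		trimmed := strings.TrimSpace(line)
		if inFence { // steps 1-2: inside a code block, preserve the line as-is
			out = append(out, line)
			inFence = !closesFence(trimmed, fence)
			continue
		}
		if !inComment { // a fence only opens when we are outside any comment
			if m := fenceMarker(trimmed); m != "" {
				out = append(out, line)
				inFence, fence = true, m
				continue
			}
		}
		wasInComment := inComment
		kept, now := stripComments(line, inComment) // step 3
		inComment = now
		// Skip lines that held nothing but comment text.
		if strings.TrimSpace(kept) == "" && (wasInComment || trimmed != "") {
			continue
		}
		out = append(out, kept)
	}
	return strings.Join(out, "\n")
}

func main() {
	// Made-up sample resembling an imported shared file.
	input := strings.Join([]string{
		"# Instructions",
		"<!--",
		"Usage:",
		"```yaml",
		"imports:",
		"  - shared/genaiscript.md",
		"```",
		"-->",
		"Summarize the issue.",
		"```sh",
		"echo \"<!-- kept -->\"",
		"```",
	}, "\n")
	fmt.Println(removeXMLComments(input))
}
```

In this sketch the commented usage block, including its YAML fence, is dropped, while the `sh` block keeps its comment-like text because the fence was opened outside any comment.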
pelikhan reviewed on Oct 11, 2025

    }

    - // If we're in a code block, preserve the line as-is
    + // If we're in a code block, preserve the line as-is (ignore XML comment processing)
Added extensive test suite in commit 8e4d5fb with 6 new test functions covering:
Core functionality tests:
- TestRemoveXMLCommentsCodeBlocksInComments - Validates the bug fix (code blocks inside XML comments are removed)
- TestRemoveXMLCommentsEdgeCases - 11 edge cases including empty inputs, unclosed comments, very long comments
- TestRemoveXMLCommentsComplexNesting - 4 complex nesting scenarios
Helper function tests:
- TestExtractCodeBlockMarker - 12 test cases for marker extraction (backticks, tildes, languages)
- TestIsValidCodeBlockMarker - 12 validation scenarios
- TestIsMatchingCodeBlockMarker - 11 matching logic tests
- TestRemoveXMLCommentsFromLine - 8 single-line processing cases
Total: 572 lines of test code with 60+ test cases covering main functions, helper functions, edge cases, and the specific bug that was fixed.
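To make the helper tests above more concrete, here is a rough sketch of what marker helpers of this kind might look like. The function names are inferred from the test names; the signatures and bodies are assumptions, written to follow the usual Markdown fence rule that a closing fence uses the same character as the opening fence and is at least as long.

```go
package main

import (
	"fmt"
	"strings"
)

// extractCodeBlockMarker (assumed signature) returns the leading run of
// backticks or tildes from a trimmed line, or "" if the run is shorter
// than three characters or the line starts with something else.
func extractCodeBlockMarker(trimmed string) string {
	if trimmed == "" {
		return ""
	}
	ch := trimmed[0]
	if ch != '`' && ch != '~' {
		return ""
	}
	n := 0
	for n < len(trimmed) && trimmed[n] == ch {
		n++
	}
	if n < 3 {
		return ""
	}
	return trimmed[:n]
}

// isValidCodeBlockMarker (assumed signature) accepts only a bare marker:
// three or more identical backticks or tildes and nothing else.
func isValidCodeBlockMarker(marker string) bool {
	return marker != "" && extractCodeBlockMarker(marker) == marker
}

// isMatchingCodeBlockMarker (assumed signature) reports whether a trimmed line
// closes a fence opened with open: same character, at least as many of them,
// and no trailing text such as a language specifier.
func isMatchingCodeBlockMarker(trimmed, open string) bool {
	closing := extractCodeBlockMarker(trimmed)
	if closing == "" || !isValidCodeBlockMarker(open) {
		return false
	}
	if closing[0] != open[0] || len(closing) < len(open) {
		return false
	}
	return strings.TrimSpace(trimmed[len(closing):]) == ""
}

func main() {
	fmt.Printf("%q\n", extractCodeBlockMarker("```yaml"))  // marker without the language
	fmt.Println(isMatchingCodeBlockMarker("````", "```")) // longer closing fence
	fmt.Println(isMatchingCodeBlockMarker("~~~", "```"))  // different marker type
}
```

The split into extract, validate, and match mirrors the three helper tests listed above; the real helpers may divide the work differently.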
Added comprehensive test coverage for xml_comments.go with 6 new test functions and 60+ test cases:

1. TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix where code blocks inside XML comments should be removed (not preserved)
2. TestRemoveXMLCommentsEdgeCases - Tests boundary conditions including:
   - Empty/whitespace inputs
   - Unclosed/unmatched comment markers
   - Very long comments
   - Multiple consecutive comments
   - Incomplete code blocks in comments
3. TestExtractCodeBlockMarker - Tests code block marker extraction with various scenarios:
   - Different marker types (backticks, tildes)
   - Different marker counts (3, 4, 5, etc.)
   - With/without language specifiers
   - Invalid markers
4. TestIsValidCodeBlockMarker - Tests validation of code block markers
5. TestIsMatchingCodeBlockMarker - Tests matching logic for opening/closing markers:
   - Same type matching
   - Different type non-matching
   - Marker count requirements
6. TestRemoveXMLCommentsFromLine - Tests single-line processing:
   - Complete/partial comments
   - Multiline comment boundaries
   - Multiple comments per line
7. TestRemoveXMLCommentsComplexNesting - Tests complex scenarios:
   - Interleaved comments and code blocks
   - Code blocks containing comment-like content
   - Various nesting patterns

Total: 572 lines of test code covering main functions and helper functions.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
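The single-line cases in item 6 could be exercised with a table of inputs along these lines. This is only a sketch: the signature of `removeXMLCommentsFromLine`, the `lineCase` type, and the cases themselves are assumptions for illustration and are not taken from xml_comments_test.go.

```go
package main

import (
	"fmt"
	"strings"
)

// removeXMLCommentsFromLine (assumed signature) strips comment spans from one
// line. inComment says whether the line starts inside a multi-line comment;
// the second result says whether the line ends inside one.
func removeXMLCommentsFromLine(line string, inComment bool) (string, bool) {
	var kept strings.Builder
	for {
		if inComment {
			_, after, found := strings.Cut(line, "-->")
			if !found {
				return kept.String(), true // comment continues on the next line
			}
			line, inComment = after, false
		}
		before, after, found := strings.Cut(line, "<!--")
		kept.WriteString(before)
		if !found {
			return kept.String(), false
		}
		line, inComment = after, true
	}
}

// lineCase is a hypothetical table entry, one per scenario.
type lineCase struct {
	name      string
	line      string
	inComment bool
	wantText  string
	wantIn    bool
}

func main() {
	cases := []lineCase{
		{"complete comment", "a <!-- b --> c", false, "a  c", false},
		{"partial comment opens", "text <!-- start", false, "text ", true},
		{"multiline boundary closes", "end --> tail", true, " tail", false},
		{"multiple comments per line", "<!-- a -->x<!-- b -->y", false, "xy", false},
		{"line fully inside comment", "still inside", true, "", true},
	}
	for _, c := range cases {
		text, in := removeXMLCommentsFromLine(c.line, c.inComment)
		ok := text == c.wantText && in == c.wantIn
		fmt.Printf("%-28s ok=%v\n", c.name, ok)
	}
}
```

Returning the comment state alongside the text lets a caller feed one line's result into the next line, which is what the multiline boundary cases check.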
Copilot AI changed the title from "[WIP] Remove the Issue Summarizer workflow and update GenAI prompt generation" to "Fix XML comment removal in imported workflows and add comprehensive test suite" on Oct 11, 2025
pelikhan added a commit that referenced this pull request on Oct 11, 2025
…on (#1502)

* Remove the Issue Summarizer workflow and update GenAI prompt generation
  - Deleted the `issue-summarizer-genaiscript.md` workflow file.
  - Refactored the prompt generation logic in `shared/genaiscript.md` to use echo commands instead of sed for better readability and maintainability.
  - Updated the `smoke-genaiscript.lock.yml` and `smoke-genaiscript.md` files to trigger on pushes to the lock file, ensuring workflows are executed on relevant changes.
* Enhance repository tree map workflow to trigger on push events and specify paths for lock file updates
* Update repository tree map documentation and workflow to include additional shell commands and tips
* Refine repository tree map documentation by removing summary statistics section and adding exclusion for package manager directories in visualization guidelines
* Simplify allowed shell tools in repository tree map documentation and workflow
* Reduce timeout duration for repository tree map workflow from 10 to 5 minutes
* Fix XML comment removal in imported workflows and add comprehensive test suite (#1503)
  * Initial plan
  * Fix: Remove imports frontmatter from rendered prompt

    Fixed bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks. The issue was in the XML comment removal logic which was checking for code blocks before processing XML comments, causing code blocks within HTML comments to be preserved instead of removed.

    Changed the logic in xml_comments.go to:
    1. Check if we're in a code block first (code blocks take precedence)
    2. If in a code block, preserve everything (ignore XML comment processing)
    3. If NOT in a code block, process XML comments and skip lines that are within comments
    4. This ensures code blocks INSIDE XML comments are removed, while XML comments INSIDE code blocks are preserved

    This fix ensures that documentation in imported files (like shared/genaiscript.md and shared/simonw-llm.md) is not included in the generated prompts.

    Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
  * Add extensive test suite for XML comment removal

    Added comprehensive test coverage for xml_comments.go with 6 new test functions and 60+ test cases:
    1. TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix where code blocks inside XML comments should be removed (not preserved)
    2. TestRemoveXMLCommentsEdgeCases - Tests boundary conditions including:
       - Empty/whitespace inputs
       - Unclosed/unmatched comment markers
       - Very long comments
       - Multiple consecutive comments
       - Incomplete code blocks in comments
    3. TestExtractCodeBlockMarker - Tests code block marker extraction with various scenarios:
       - Different marker types (backticks, tildes)
       - Different marker counts (3, 4, 5, etc.)
       - With/without language specifiers
       - Invalid markers
    4. TestIsValidCodeBlockMarker - Tests validation of code block markers
    5. TestIsMatchingCodeBlockMarker - Tests matching logic for opening/closing markers:
       - Same type matching
       - Different type non-matching
       - Marker count requirements
    6. TestRemoveXMLCommentsFromLine - Tests single-line processing:
       - Complete/partial comments
       - Multiline comment boundaries
       - Multiple comments per line
    7. TestRemoveXMLCommentsComplexNesting - Tests complex scenarios:
       - Interleaved comments and code blocks
       - Code blocks containing comment-like content
       - Various nesting patterns

    Total: 572 lines of test code covering main functions and helper functions.

    Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

  ---------

  Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
  Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
  Co-authored-by: Peli de Halleux <pelikhan@users.noreply.github.com>
* Add changeset for XML comment fix and GenAI prompt updates

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>
* Reduce timeout duration in smoke GenAIScript workflow from 10 to 5 minutes and update safety prompt configuration.
* Remove outdated workflows and configurations for issue triage and LLM processing
  - Deleted the issue triage workflow for LLM from `.github/workflows/issue-triage-llm.md`.
  - Removed the lock file for the Genaiscript workflow from `.github/workflows/shared/genaiscript.lock.yml`.
  - Removed the lock file for the Simonw LLM workflow from `.github/workflows/shared/simonw-llm.lock.yml`.
  - Deleted the shared configuration for Simonw LLM from `.github/workflows/shared/simonw-llm.md`.
  - Updated the smoke test workflow for Genaiscript to include allowed tools for GitHub.
* Refactor XML comment test cases for consistency and readability

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: Changeset Generator <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>
Fixed a bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks, causing documentation (including frontmatter examples) to appear in generated prompts.
Problem
When workflows imported shared files (like shared/genaiscript.md or shared/simonw-llm.md), HTML comments containing YAML code blocks showing usage examples were being included in the final prompt sent to AI agents. This happened because the XML comment removal logic checked for code blocks before processing XML comments, causing code blocks within HTML comments to be preserved instead of removed.
Changes Made
Bug Fix
pkg/workflow/xml_comments.go: Modified the comment removal logic to properly handle nested structures.
Test Suite
pkg/workflow/xml_comments_test.go (572 lines):
- TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix with real-world examples
- TestRemoveXMLCommentsEdgeCases - 11 boundary condition tests
- TestExtractCodeBlockMarker - 12 marker extraction tests
- TestIsValidCodeBlockMarker - 12 validation tests
- TestIsMatchingCodeBlockMarker - 11 matching logic tests
- TestRemoveXMLCommentsFromLine - 8 single-line processing tests
- TestRemoveXMLCommentsComplexNesting - 4 complex nesting scenario tests
Workflow Updates
smoke-genaiscript.lock.yml, issue-triage-llm.lock.yml) to remove the documentation code blocks from generated prompts
Impact
The generated prompts now contain only the actual workflow instructions without the frontmatter documentation from imported files. This ensures AI agents receive clean, focused prompts without extraneous documentation.
Testing