Fix XML comment removal in imported workflows and add comprehensive test suite #1503

Merged
pelikhan merged 4 commits into genaiscript-2 from copilot/remove-issue-summarizer-workflow
Oct 11, 2025

Copilot AI commented Oct 10, 2025

Fixed a bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks, causing documentation (including frontmatter examples) to appear in generated prompts.

Problem

When workflows imported shared files (such as shared/genaiscript.md or shared/simonw-llm.md), HTML comments containing YAML code blocks with usage examples were included in the final prompt sent to AI agents. The XML comment removal logic checked for code blocks before processing XML comments, so a code block inside an HTML comment was preserved instead of being removed along with the rest of the comment.

Changes Made

Bug Fix

  • Updated pkg/workflow/xml_comments.go: Modified the comment removal logic to properly handle nested structures:
    • Code blocks that start BEFORE XML comments are preserved (existing behavior)
    • Code blocks that are INSIDE XML comments are now removed (bug fix)
    • XML comments that are INSIDE code blocks are still preserved (existing behavior)

Test Suite

  • Added extensive test coverage in pkg/workflow/xml_comments_test.go (572 lines):
    • TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix with real-world examples
    • TestRemoveXMLCommentsEdgeCases - 11 boundary condition tests
    • TestExtractCodeBlockMarker - 12 marker extraction tests
    • TestIsValidCodeBlockMarker - 12 validation tests
    • TestIsMatchingCodeBlockMarker - 11 matching logic tests
    • TestRemoveXMLCommentsFromLine - 8 single-line processing tests
    • TestRemoveXMLCommentsComplexNesting - 4 complex nesting scenario tests
    • Total: 60+ test cases covering all functions and edge cases

Workflow Updates

  • Recompiled affected workflows (smoke-genaiscript.lock.yml, issue-triage-llm.lock.yml) to remove the documentation code blocks from generated prompts

Impact

The generated prompts now contain only the actual workflow instructions without the frontmatter documentation from imported files. This ensures AI agents receive clean, focused prompts without extraneous documentation.

Testing

  • ✅ All unit tests pass (6,000+ tests)
  • ✅ Workflows compile successfully
  • ✅ New test suite provides comprehensive coverage of the fix and prevents regressions

Fixed bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks. The issue was in the XML comment removal logic which was checking for code blocks before processing XML comments, causing code blocks within HTML comments to be preserved instead of removed.

Changed the logic in xml_comments.go to:
1. Check if we're in a code block first (code blocks take precedence)
2. If in a code block, preserve everything (ignore XML comment processing)
3. If NOT in a code block, process XML comments and skip lines that are within comments
4. This ensures code blocks INSIDE XML comments are removed, while XML comments INSIDE code blocks are preserved

This fix ensures that documentation in imported files (like shared/genaiscript.md and shared/simonw-llm.md) is not included in the generated prompts.
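
The ordering described in the steps above can be illustrated with a minimal Go sketch. This is not the code from xml_comments.go: the names `openingFence`, `closesFence`, `stripComments` and `removeComments` are hypothetical, fence detection is simplified to "three or more backticks or tildes", and dropping lines that contain only comment text is an assumption about the intended behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// openingFence returns the run of three or more backticks or tildes that
// starts the line, or "" when the line does not open a fenced code block.
func openingFence(line string) string {
	t := strings.TrimSpace(line)
	if t == "" || (t[0] != '`' && t[0] != '~') {
		return ""
	}
	n := 0
	for n < len(t) && t[n] == t[0] {
		n++
	}
	if n < 3 {
		return ""
	}
	return t[:n]
}

// closesFence reports whether line consists only of the opening fence's
// character, repeated at least as many times as in the opening fence.
func closesFence(line, open string) bool {
	t := strings.TrimSpace(line)
	return len(t) >= len(open) && strings.Trim(t, open[:1]) == ""
}

// stripComments removes <!-- ... --> spans from a single line and returns the
// remaining text plus whether a comment is still open at the end of the line.
func stripComments(line string, inComment bool) (string, bool) {
	var out strings.Builder
	for line != "" {
		if inComment {
			end := strings.Index(line, "-->")
			if end < 0 {
				return out.String(), true
			}
			line, inComment = line[end+3:], false
			continue
		}
		start := strings.Index(line, "<!--")
		if start < 0 {
			out.WriteString(line)
			break
		}
		out.WriteString(line[:start])
		line, inComment = line[start+4:], true
	}
	return out.String(), inComment
}

// removeComments applies the ordering from the steps above, line by line.
func removeComments(content string) string {
	var kept []string
	inComment := false
	fence := "" // non-empty while inside a fenced code block

	for _, line := range strings.Split(content, "\n") {
		// Steps 1-2: inside a code block, keep the line untouched.
		if fence != "" {
			kept = append(kept, line)
			if closesFence(line, fence) {
				fence = ""
			}
			continue
		}
		// Step 3: outside a code block, comments are processed first.
		wasInComment := inComment
		var cleaned string
		cleaned, inComment = stripComments(line, inComment)
		touched := wasInComment || cleaned != line
		if touched && strings.TrimSpace(cleaned) == "" {
			continue // assumption: drop lines holding only comment text
		}
		kept = append(kept, cleaned)
		// A fence opens a code block only when it is outside any comment.
		if !inComment {
			fence = openingFence(cleaned)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	input := strings.Join([]string{
		"Intro", "<!--", "```yaml", "engine: custom", "```", "-->",
		"Body", "```md", "<!-- kept -->", "```",
	}, "\n")
	fmt.Println(removeComments(input))
}
```

In this sketch the YAML block between `<!--` and `-->` is dropped together with the comment, while the `<!-- kept -->` line survives because it sits inside a fenced block that opened outside any comment.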

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan October 11, 2025 00:17
}

-// If we're in a code block, preserve the line as-is
+// If we're in a code block, preserve the line as-is (ignore XML comment processing)

@copilot add extensive test suite

Added extensive test suite in commit 8e4d5fb with 7 new test functions covering:

Core functionality tests:

  • TestRemoveXMLCommentsCodeBlocksInComments - Validates the bug fix (code blocks inside XML comments are removed)
  • TestRemoveXMLCommentsEdgeCases - 11 edge cases including empty inputs, unclosed comments, very long comments
  • TestRemoveXMLCommentsComplexNesting - 4 complex nesting scenarios

Helper function tests:

  • TestExtractCodeBlockMarker - 12 test cases for marker extraction (backticks, tildes, languages)
  • TestIsValidCodeBlockMarker - 12 validation scenarios
  • TestIsMatchingCodeBlockMarker - 11 matching logic tests
  • TestRemoveXMLCommentsFromLine - 8 single-line processing cases
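
To make the helper tests above more concrete, here is a hedged Go sketch of what marker extraction, validation and matching could look like. The function names `extractMarker`, `isValidMarker` and `isMatchingMarker` and their signatures are hypothetical stand-ins, not the helpers in xml_comments.go; the matching rule (same character, closing run at least as long as the opening run) follows the CommonMark convention and is assumed here rather than confirmed by the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// extractMarker returns the leading run of backticks or tildes on a line and
// the trimmed text that follows it (a language specifier, if present).
func extractMarker(line string) (marker, lang string) {
	t := strings.TrimSpace(line)
	if t == "" || (t[0] != '`' && t[0] != '~') {
		return "", ""
	}
	n := 0
	for n < len(t) && t[n] == t[0] {
		n++
	}
	return t[:n], strings.TrimSpace(t[n:])
}

// isValidMarker reports whether marker is three or more identical backticks
// or tildes.
func isValidMarker(marker string) bool {
	if len(marker) < 3 || (marker[0] != '`' && marker[0] != '~') {
		return false
	}
	return strings.Trim(marker, marker[:1]) == ""
}

// isMatchingMarker reports whether closing can end a block opened by opening:
// same marker character and at least as many repetitions (assumed rule).
func isMatchingMarker(opening, closing string) bool {
	return isValidMarker(opening) && isValidMarker(closing) &&
		opening[0] == closing[0] && len(closing) >= len(opening)
}

func main() {
	for _, line := range []string{"```yaml", "~~~~", "``", "plain text"} {
		marker, lang := extractMarker(line)
		fmt.Printf("%q -> marker=%q lang=%q valid=%v\n",
			line, marker, lang, isValidMarker(marker))
	}
	fmt.Println(isMatchingMarker("```", "````")) // longer closing run
	fmt.Println(isMatchingMarker("````", "```")) // shorter closing run
	fmt.Println(isMatchingMarker("```", "~~~"))  // different marker types
}
```

Keeping extraction, validation and matching as separate small functions is what makes table-driven tests over marker types, counts and language specifiers straightforward to write.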

Total: 572 lines of test code with 60+ test cases covering main functions, helper functions, edge cases, and the specific bug that was fixed.

Added comprehensive test coverage for xml_comments.go with 7 new test functions and 60+ test cases:

1. TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix where code blocks inside XML comments should be removed (not preserved)

2. TestRemoveXMLCommentsEdgeCases - Tests boundary conditions including:
   - Empty/whitespace inputs
   - Unclosed/unmatched comment markers
   - Very long comments
   - Multiple consecutive comments
   - Incomplete code blocks in comments

3. TestExtractCodeBlockMarker - Tests code block marker extraction with various scenarios:
   - Different marker types (backticks, tildes)
   - Different marker counts (3, 4, 5, etc.)
   - With/without language specifiers
   - Invalid markers

4. TestIsValidCodeBlockMarker - Tests validation of code block markers

5. TestIsMatchingCodeBlockMarker - Tests matching logic for opening/closing markers:
   - Same type matching
   - Different type non-matching
   - Marker count requirements

6. TestRemoveXMLCommentsFromLine - Tests single-line processing:
   - Complete/partial comments
   - Multiline comment boundaries
   - Multiple comments per line

7. TestRemoveXMLCommentsComplexNesting - Tests complex scenarios:
   - Interleaved comments and code blocks
   - Code blocks containing comment-like content
   - Various nesting patterns
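
The single-line cases listed under item 6 (complete and partial comments, multiline boundaries, multiple comments per line) can be pictured with a small table-driven Go sketch. The helper `removeFromLine` and its signature are hypothetical and only approximate what a function exercised by TestRemoveXMLCommentsFromLine might do; the case names are illustrative, not taken from the test file.

```go
package main

import (
	"fmt"
	"strings"
)

// removeFromLine strips <!-- ... --> spans from one line. inComment says
// whether a comment opened on an earlier line is still open; the second
// result says whether a comment is still open after this line.
func removeFromLine(line string, inComment bool) (string, bool) {
	var out strings.Builder
	for line != "" {
		if inComment {
			end := strings.Index(line, "-->")
			if end < 0 {
				return out.String(), true // rest of the line is comment text
			}
			line, inComment = line[end+3:], false
			continue
		}
		start := strings.Index(line, "<!--")
		if start < 0 {
			out.WriteString(line)
			break
		}
		out.WriteString(line[:start])
		line, inComment = line[start+4:], true
	}
	return out.String(), inComment
}

func main() {
	cases := []struct {
		name      string
		line      string
		inComment bool
	}{
		{"complete comment", "a <!-- x --> b", false},
		{"comment opens", "a <!-- x", false},
		{"comment closes", "x --> b", true},
		{"multiple comments", "a<!-- 1 -->b<!-- 2 -->c", false},
		{"still inside", "no markers here", true},
	}
	for _, c := range cases {
		out, still := removeFromLine(c.line, c.inComment)
		fmt.Printf("%-18s %q inComment=%v\n", c.name, out, still)
	}
}
```

Returning the open-comment flag alongside the cleaned text lets a caller carry comment state from one line to the next, which is how the multiline boundary cases can be tested one line at a time.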

Total: 572 lines of test code covering main functions and helper functions.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Remove the Issue Summarizer workflow and update GenAI prompt generation" to "Fix XML comment removal in imported workflows and add comprehensive test suite" Oct 11, 2025
Copilot AI requested a review from pelikhan October 11, 2025 00:43
@pelikhan pelikhan marked this pull request as ready for review October 11, 2025 00:44
@pelikhan pelikhan merged commit 1e34050 into genaiscript-2 Oct 11, 2025
10 of 12 checks passed
@pelikhan pelikhan deleted the copilot/remove-issue-summarizer-workflow branch October 11, 2025 00:44
pelikhan added a commit that referenced this pull request Oct 11, 2025
…on (#1502)

* Remove the Issue Summarizer workflow and update GenAI prompt generation

- Deleted the `issue-summarizer-genaiscript.md` workflow file.
- Refactored the prompt generation logic in `shared/genaiscript.md` to use echo commands instead of sed for better readability and maintainability.
- Updated the `smoke-genaiscript.lock.yml` and `smoke-genaiscript.md` files to trigger on pushes to the lock file, ensuring workflows are executed on relevant changes.

* Enhance repository tree map workflow to trigger on push events and specify paths for lock file updates

* Update repository tree map documentation and workflow to include additional shell commands and tips

* Refine repository tree map documentation by removing summary statistics section and adding exclusion for package manager directories in visualization guidelines

* Simplify allowed shell tools in repository tree map documentation and workflow

* Reduce timeout duration for repository tree map workflow from 10 to 5 minutes

* Fix XML comment removal in imported workflows and add comprehensive test suite (#1503)

* Initial plan

* Fix: Remove imports frontmatter from rendered prompt

Fixed bug where HTML comments in imported workflow files were not being properly removed when they contained code blocks. The issue was in the XML comment removal logic which was checking for code blocks before processing XML comments, causing code blocks within HTML comments to be preserved instead of removed.

Changed the logic in xml_comments.go to:
1. Check if we're in a code block first (code blocks take precedence)
2. If in a code block, preserve everything (ignore XML comment processing)
3. If NOT in a code block, process XML comments and skip lines that are within comments
4. This ensures code blocks INSIDE XML comments are removed, while XML comments INSIDE code blocks are preserved

This fix ensures that documentation in imported files (like shared/genaiscript.md and shared/simonw-llm.md) is not included in the generated prompts.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

* Add extensive test suite for XML comment removal

Added comprehensive test coverage for xml_comments.go with 7 new test functions and 60+ test cases:

1. TestRemoveXMLCommentsCodeBlocksInComments - Tests the specific bug fix where code blocks inside XML comments should be removed (not preserved)

2. TestRemoveXMLCommentsEdgeCases - Tests boundary conditions including:
   - Empty/whitespace inputs
   - Unclosed/unmatched comment markers
   - Very long comments
   - Multiple consecutive comments
   - Incomplete code blocks in comments

3. TestExtractCodeBlockMarker - Tests code block marker extraction with various scenarios:
   - Different marker types (backticks, tildes)
   - Different marker counts (3, 4, 5, etc.)
   - With/without language specifiers
   - Invalid markers

4. TestIsValidCodeBlockMarker - Tests validation of code block markers

5. TestIsMatchingCodeBlockMarker - Tests matching logic for opening/closing markers:
   - Same type matching
   - Different type non-matching
   - Marker count requirements

6. TestRemoveXMLCommentsFromLine - Tests single-line processing:
   - Complete/partial comments
   - Multiline comment boundaries
   - Multiple comments per line

7. TestRemoveXMLCommentsComplexNesting - Tests complex scenarios:
   - Interleaved comments and code blocks
   - Code blocks containing comment-like content
   - Various nesting patterns

Total: 572 lines of test code covering main functions and helper functions.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: Peli de Halleux <pelikhan@users.noreply.github.com>

* Add changeset for XML comment fix and GenAI prompt updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Reduce timeout duration in smoke GenAIScript workflow from 10 to 5 minutes and update safety prompt configuration.

* Remove outdated workflows and configurations for issue triage and LLM processing

- Deleted the issue triage workflow for LLM from `.github/workflows/issue-triage-llm.md`.
- Removed the lock file for the Genaiscript workflow from `.github/workflows/shared/genaiscript.lock.yml`.
- Removed the lock file for the Simonw LLM workflow from `.github/workflows/shared/simonw-llm.lock.yml`.
- Deleted the shared configuration for Simonw LLM from `.github/workflows/shared/simonw-llm.md`.
- Updated the smoke test workflow for Genaiscript to include allowed tools for GitHub.

* Refactor XML comment test cases for consistency and readability

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: Changeset Generator <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Claude <noreply@anthropic.com>