fix(jtk): prevent wiki markup conversion from mangling hyphens and tildes #178
…ldes
Markdown descriptions containing compound words like `signal-webapp-frontend` or `three~tier` were being corrupted because:
1. `looksLikeWikiNumberedList` treated `## Heading` lines as wiki numbered-list items, causing markdown input to be routed through wiki-to-markdown conversion.
2. Wiki text formatting patterns for strikethrough (`-text-`) and subscript (`~text~`) matched inside compound words and file paths.
Fixes:
- `looksLikeWikiNumberedList` now returns false when `## ` headings are present (these are always markdown, never wiki). Only counts single-`#` lines.
- Strikethrough, subscript, underline, and superscript patterns now require whitespace or string boundaries around delimiters.
- Extract `replaceWikiFormatting` helper and hoist regex compilation to package level (was recompiling inside closures on every match).
- looksLikeWikiNumberedList now checks for ANY multi-hash heading (##, ###, etc.) and requires consecutive # lines (no blank lines between them) to distinguish wiki numbered lists from markdown headings
- Add edge case tests: multiple h1 headings, h3 headings, tilde with numbers, punctuation-adjacent formatting, tab/newline whitespace
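The detection rule described in the two commit messages above can be sketched roughly as follows. This is not the code from `wiki.go`: the function name `looksLikeWikiNumberedListSketch` is hypothetical, and the threshold of two adjacent `# ` lines is an assumption made for illustration.

```go
package main

import "strings"

// looksLikeWikiNumberedListSketch is a hypothetical illustration of the
// heuristic described above, not the actual implementation.
func looksLikeWikiNumberedListSketch(text string) bool {
	lines := strings.Split(text, "\n")

	// Any line starting with two or more hashes is taken as a markdown
	// heading, so the whole input is treated as markdown.
	for _, line := range lines {
		if strings.HasPrefix(strings.TrimSpace(line), "##") {
			return false
		}
	}

	// Assumption: two single-# lines with no line between them indicate
	// a wiki numbered list. A blank line resets the run.
	prevWasItem := false
	for _, line := range lines {
		isItem := strings.HasPrefix(strings.TrimSpace(line), "# ")
		if isItem && prevWasItem {
			return true
		}
		prevWasItem = isItem
	}
	return false
}
```

Under these assumptions, `"# one\n# two"` is classified as a wiki list, while `"# Title\n\n# Other"` and any input containing a `## ` line are left as markdown.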
Replace goldmark extension.Strikethrough with hugo-goldmark-extensions/extras
which natively parses ~sub~, ^sup^, ~~del~~, and ++ins++ syntax. The ADF
converter now emits spec-compliant marks:
- ~text~ -> {"type": "subsup", "attrs": {"type": "sub"}}
- ^text^ -> {"type": "subsup", "attrs": {"type": "sup"}}
- ~~text~~ -> {"type": "strike"}
- ++text++ -> {"type": "underline"}
Previously, subscript/superscript/underline were converted to HTML tags
(<sub>, <sup>, <u>) by the wiki converter, which goldmark parsed as RawHTML
nodes that the ADF converter silently dropped.
Changes:
- shared/adf/convert.go: swap extension.Strikethrough for extras extension,
add extrasKindToMark() mapping extras AST kinds to ADF marks
- wiki.go: remove sub/sup HTML conversion (goldmark handles natively),
convert wiki +text+ to ++text++ for goldmark extras Insert extension
- Remove unused wikiSub/wikiSup regex patterns
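As a rough illustration of the mark mapping listed in this commit message, the sketch below builds an ADF text node carrying one mark and serializes it. The syntax names passed in (`"subscript"`, `"delete"`, and so on) and the function names are hypothetical; the actual `extrasKindToMark()` works on goldmark AST kinds, which are not reproduced here.

```go
package main

import "encoding/json"

// Mark mirrors the ADF mark objects listed in the commit message.
type Mark struct {
	Type  string            `json:"type"`
	Attrs map[string]string `json:"attrs,omitempty"`
}

// TextNode is a minimal ADF text node with optional marks.
type TextNode struct {
	Type  string `json:"type"`
	Text  string `json:"text"`
	Marks []Mark `json:"marks,omitempty"`
}

// markForSyntaxSketch maps a hypothetical syntax name to an ADF mark.
// The second result is false when the syntax has no mark.
func markForSyntaxSketch(syntax string) (Mark, bool) {
	switch syntax {
	case "subscript": // ~text~
		return Mark{Type: "subsup", Attrs: map[string]string{"type": "sub"}}, true
	case "superscript": // ^text^
		return Mark{Type: "subsup", Attrs: map[string]string{"type": "sup"}}, true
	case "delete": // ~~text~~
		return Mark{Type: "strike"}, true
	case "insert": // ++text++
		return Mark{Type: "underline"}, true
	}
	return Mark{}, false
}

// textNodeJSONSketch serializes a text node with the mark for syntax,
// or with no marks when the syntax is unknown.
func textNodeJSONSketch(text, syntax string) (string, error) {
	node := TextNode{Type: "text", Text: text}
	if m, ok := markForSyntaxSketch(syntax); ok {
		node.Marks = []Mark{m}
	}
	b, err := json.Marshal(node)
	return string(b), err
}
```

Calling `textNodeJSONSketch("2", "subscript")` yields a text node whose `marks` array holds the `subsup` mark with `"type": "sub"`, matching the first mapping in the list above.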
- Document WikiToMarkdown as intended for MarkdownToADF/goldmark-extras consumption, not standalone markdown renderers (finding 2)
- Document punctuation-adjacent formatting limitation in replaceWikiFormatting comment (finding 3)
- Add test: adjacent h1 without blank line is intentionally treated as wiki numbered list (findings 1, 4)
- Add tests: WikiToMarkdown sub/sup passthrough and underline conversion for goldmark extras (finding 2)
monit-reviewer
left a comment
Automated PR Review
Reviewed commit: 4badf05
Summary
| Reviewer | Findings |
|---|---|
| security:code-auditor | 2 |
security:code-auditor (2 findings)
tools/jtk/api/wiki.go:14
Adjacent formatted spans separated by a single space will only partially convert. The regex `(?:^|\s)-...-(?:\s|$)` consumes surrounding whitespace, so `-one- -two-` only converts the first span. The existing 'consecutive strikethrough' test passes because it uses `and` between spans, not a single space. This is a real behavioral bug for single-space-separated adjacent spans.
💡 Suggestion - tools/jtk/api/wiki.go:88
The `##` guard checks `len(trimmed) >= 3`, but a bare `##` token (length 2) would slip past it. The correct bound is `len(trimmed) >= 2`.
1 info-level observations excluded. Run with --verbose to include.
Completed in 2m 29s | $0.32 | sonnet
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, security:code-auditor |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 2m 29s (Reviewers: 2m 17s · Synthesis: 18s) |
| Cost | $0.32 |
| Tokens | 62.2k in / 9.1k out |
| Turns | 2 |
- Fix adjacent single-space-separated spans (-one- -two-) where the first match consumed the shared whitespace boundary. Run replacement twice to catch spans whose leading whitespace was consumed by the prior match.
- Fix ## guard: len >= 2 not >= 3, so bare "##" token is caught.
- Add test for adjacent strikethrough with single space.
Both findings addressed in fddc6f7: adjacent spans fixed with double-pass replacement, ## length guard fixed.
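The double-pass fix can be illustrated with a small sketch. The pattern and names below (`wikiStrikeSketch`, `replaceStrikeSketch`) are hypothetical and use whitespace-only boundaries; the real `replaceWikiFormatting` helper may differ in structure and in the boundary characters it accepts.

```go
package main

import "regexp"

// wikiStrikeSketch matches -text- only when the delimiters sit at the start
// or end of the string or next to whitespace. Group 1 is the leading
// boundary, group 2 the span content, group 3 the trailing boundary.
var wikiStrikeSketch = regexp.MustCompile(`(^|\s)-([^\s-][^-]*[^\s-]|[^\s-])-(\s|$)`)

// replaceStrikeSketch converts wiki -text- to ~~text~~. It runs the
// replacement twice because a match consumes its trailing whitespace,
// which the next span would need as its leading boundary.
func replaceStrikeSketch(text string) string {
	for pass := 0; pass < 2; pass++ {
		text = wikiStrikeSketch.ReplaceAllString(text, "${1}~~${2}~~${3}")
	}
	return text
}
```

With this sketch, `-one- -two-` becomes `~~one~~ ~~two~~` after the second pass, while `signal-webapp-frontend` is left untouched because its hyphens have no whitespace or string boundary next to them.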
- Widen formatting boundary from whitespace-only to whitespace + common punctuation (parens, brackets, quotes). Patterns like (-deleted-) now convert correctly while compound words are still protected.
- Add inline comment explaining consecutive h1 tradeoff in looksLikeWikiNumberedList.
- Add end-to-end TestMarkdownToADF_CompoundWordsEndToEnd covering the original bug: markdown with signal-webapp-frontend, file paths, and three-tier through the full MarkdownToADF pipeline.
- Expand comment on ## heuristic explaining Jira nested numbered list tradeoff
- Document intentional before/after boundary asymmetry (opening vs closing punctuation)
- Add test for superscript in compound word (x^2^y) to match existing tilde test
- Add test locking period-before-delimiter boundary behavior
The standard parser (for plain markdown) now uses extras.Delete only (double-tilde ~~text~~ strikethrough). Subscript, superscript, and insert are only enabled in the wiki parser, used when input is detected as Jira wiki markup. This prevents even-tilde compound words like "signal~webapp~frontend" from being mangled by goldmark subscript processing in non-wiki input.
Also addresses review findings:
- Fix variable shadowing: rename loop var t -> s in end-to-end test
- Add ASCII delimiter assumption comment on replaceWikiFormatting
- Clarify ^ anchor scope in boundary pattern comments
- Add even-tilde and even-caret compound word tests
Coverage gaps filled:
- Shared-package parser split tests: ToDocument vs ToDocumentWiki vs ToJSON proving standard parser omits subsup/underline, wiki parser produces them
- Inline-only wiki formatting (H~2~O, x^2^, +important+) through MarkdownToADF: proves auto-detection correctly falls through to safe parser
- Nested wiki ## under # false negative test lock
- Square bracket boundary test ([-deleted-])
Fixes:
- Add ] to wikiBoundaryAfter so [-deleted-] converts correctly
- Strengthen WikiToMarkdown doc comment: explicitly notes pipeline coupling to adf.ToDocumentWiki and warns callers MUST use wiki parser
Adds contract note: auto prioritizes not corrupting plain markdown over detecting wiki edge cases. Callers that know the input format should bypass heuristics via WikiToMarkdown + adf.ToDocumentWiki.
Rename makes the pipeline coupling explicit: the output is not general-purpose markdown but a dialect tuned for adf.ToDocumentWiki. Strengthen CompoundWordsEndToEnd test to verify compound words appear within single ADF text nodes rather than just checking concatenated text. Also document ToJSON parser choice to match ToDocument/ToDocumentWiki.
monit-reviewer
left a comment
Automated PR Review
Reviewed commit: 2c674e6
Summary
| Reviewer | Findings |
|---|---|
| security:code-auditor | 3 |
security:code-auditor (3 findings)
tools/jtk/api/wiki.go:165
WikiToADFMarkdown is exported but its output is only safe to parse with adf.ToDocumentWiki. If any caller uses adf.ToDocument on the output, ~text~ and ^text^ won't produce ADF marks and will appear as raw characters. There is no compile-time enforcement of this contract. Consider either unexporting the function or returning a typed wrapper that forces the correct parser.
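One way to read the typed-wrapper suggestion is sketched below. Everything here is hypothetical: `WikiDialectMarkdown`, `wikiToADFMarkdownSketch`, and `parseWikiDialectSketch` are illustrative names, the conversion only rewrites a leading `h1. ` as a stand-in for the real converter, and the parser stand-in just splits lines.

```go
package main

import "strings"

// WikiDialectMarkdown is a hypothetical wrapper for converter output.
// Its field is unexported, so code in other packages can only obtain a
// value through the converter.
type WikiDialectMarkdown struct {
	text string
}

// String exposes the text read-only, for example for logging.
func (m WikiDialectMarkdown) String() string { return m.text }

// wikiToADFMarkdownSketch is a stand-in for the real converter. It only
// rewrites a leading "h1. " heading; inline ~sub~ and ^sup^ pass through.
func wikiToADFMarkdownSketch(wiki string) WikiDialectMarkdown {
	if strings.HasPrefix(wiki, "h1. ") {
		wiki = "# " + strings.TrimPrefix(wiki, "h1. ")
	}
	return WikiDialectMarkdown{text: wiki}
}

// parseWikiDialectSketch stands in for the wiki-aware parser. It accepts
// only the wrapper type, so passing a plain string is a compile error.
func parseWikiDialectSketch(md WikiDialectMarkdown) []string {
	return strings.Split(md.text, "\n")
}
```

In this arrangement a call such as `parseWikiDialectSketch(wikiToADFMarkdownSketch(input))` compiles, whereas handing the converter output to a parser that takes a plain `string` requires an explicit `.String()` call, which makes the mismatch visible at the call site.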
💡 Suggestion - tools/jtk/api/wiki.go:47
wikiStrikeInner uses [^-]+ which is more permissive than the outer pattern's [^\s-][^-]*[^\s-]. If the outer regex ever matched a span the inner doesn't expect, FindStringSubmatch could return an unexpected capture. The two patterns should be aligned or the inner derived from the outer to avoid silent mismatches.
💡 Suggestion - tools/jtk/api/wiki_test.go:315
TestConvertWikiTextFormatting_EdgeCases does not cover strikethrough applied to text that is itself a valid markdown construct (e.g., "-`code`-" or "-**bold**-"). These exist in real Jira wiki content and the boundary regex may or may not handle them correctly.
Completed in 3m 12s | $0.30 | sonnet
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, security:code-auditor |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 3m 12s (Reviewers: 3m 02s · Synthesis: 16s) |
| Cost | $0.30 |
| Tokens | 67.0k in / 11.7k out |
| Turns | 2 |
Catches up documentation for recent features and fixes:
- README: document --fields flag, auto-pagination, fields command group, users get subcommand, --assignee none, multi-value --field, and escape sequences in comment --body
- CHANGELOG: add entries for PRs #178, #180, #182, #186-189
- integration-tests: add test cases for auto-pagination, --fields, users get, --assignee none, multi-value --field, and escape sequences
Closes #183
Summary
- Stop markdown `## Heading` lines from causing input to be routed through wiki-to-markdown conversion
- Stop wiki formatting patterns from mangling compound words such as `signal-webapp-frontend` and `three~tier`
- Extract `replaceWikiFormatting` helper and hoist regex compilation to package level
Problem
When creating issues with markdown descriptions containing compound words, the output was mangled:
- `signal-webapp-frontend` → `signal~~webapp~~frontend` → rendered as `signal<s>webapp</s>frontend`
- `three~tier` → `three<sub>tier</sub>`
- `2026-03-12-design.md` → `2026<sub>03</sub><sub>12</sub>~design.md`
Root cause 1: `looksLikeWikiNumberedList` counted markdown `## Heading` lines as wiki `# list-item` lines, triggering wiki-to-markdown conversion on pure markdown input.
Root cause 2: Wiki formatting patterns (`-text-` for strikethrough, `~text~` for subscript) matched inside compound words because they only checked for non-whitespace content, not word boundaries.
Fix
- `looksLikeWikiNumberedList` now returns `false` immediately when any `##` heading is present. Only counts single-`#` lines as potential wiki numbered lists.
- Extracted `replaceWikiFormatting` helper to eliminate code duplication and moved regex compilation to a package-level `var` block.
Test plan
- `TestIsWikiMarkup_MarkdownHeadings`: verifies `## Heading` is not detected as wiki
- `TestWikiToMarkdownPreservesMarkdown` cases: hyphenated words, tildes, file paths
- `TestConvertWikiTextFormatting_EdgeCases`: 9 cases covering start-of-line, end-of-string, consecutive, compound words, file paths
- `go test ./...` passes across all packages