Skip to content

Fix parsing of spaces in code#786

Merged
Martin1887 merged 1 commit intopulldown-cmark:masterfrom
notriddle:notriddle/code-spaces
Jan 15, 2024
Merged

Fix parsing of spaces in code#786
Martin1887 merged 1 commit intopulldown-cmark:masterfrom
notriddle:notriddle/code-spaces

Conversation

@notriddle
Copy link
Collaborator

Fixes #661

The changed case in regression.txt actually aligns our behavior with commonmark.js. Check the "HTML" tab.

This change simplifies everything by having the code span maker read the text directly, instead of deciphering the tree with all the emphasis data and trimmed spaces (that still gets used if it later finds out that MaybeCode isn't code).

The changed case in regression.txt actually aligns our behavior
with [commonmark.js](https://spec.commonmark.org/dingus/?text=%3E%20%60%0A%3E%20%60&smart=1).
Check the "HTML" tab.

This change simplifies everything by having the code span maker
read the text directly, instead of deciphering the tree with
all the emphasis data and trimmed spaces (that still gets used
if it later finds out that MaybeCode isn't code).
@notriddle
Copy link
Collaborator Author

notriddle commented Nov 25, 2023

CC https://talk.commonmark.org/t/code-span-with-leading-newline/4546

While fuzz testing vs commonmark-hs, I found a corner case with the ASCII string 60 0A 20 60:

`
 `

This PR produces two spaces (it converts the newline to a space, and, since the resulting string contains only spaces, it leaves the two spaces alone), while commonmark.js only produces one.

This is apparently ambiguous enough that markdown-it went with the same interpretation that I did, and it's a corner case that well-formatted documents shouldn't hit, but still. Full disclosure: the reference implementation treats newlines differently than this PR does.

@Martin1887
Copy link
Collaborator

I understand the same, it is probably a corner-case bug of the reference implementation.

@Martin1887 Martin1887 merged commit a0ec43e into pulldown-cmark:master Jan 15, 2024
@notriddle notriddle deleted the notriddle/code-spaces branch January 15, 2024 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parsing inline code with "\n " loses both newline and space

2 participants