ensure paragraph start tags begin a paragraph #108
Closed
mirabilos wants to merge 1 commit into matthewwithanm:develop
Collaborator
Hi! This breaks some working code, as there will be a few places with loads of empty lines, for example:
Author
AlexVonB wrote:
Hi! This breaks some working code, as there will be a few places with loads of empty lines, for example:
md('<blockquote><p>Hello</p><p>Hello again</p></blockquote>')
> Hello
>
>
>
> Hello again
Postprocess. It’s trivial, and easier to fix there than in Markdownify.
[…]
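# convert and clean up
text = MarkdownConverter(strip=['img']).convert_soup(html)
# a trailing space followed by a single-space line becomes a blank line
text = re.sub(' \n \n', '\n\n', '\n' + text + '\n')
# collapse a run of empty blockquote lines ("> ") into one
text = re.sub('(\n> )+\n', '\n> \n', '\n' + text + '\n')
# drop spaces before a blank line and merge repeated blank lines
text = re.sub(' *\n\n+', '\n\n', text)
return text.strip()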
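
For readers who want to try the clean-up on its own, the three substitutions can be wrapped in a standalone function. This is only a sketch: `clean_markdown` is a hypothetical name, the regular expressions are copied from the snippet, and the sample input is hand-written on the assumption that empty blockquote lines are emitted as `> ` with a trailing space (the second pattern only matches in that case).

```python
import re


def clean_markdown(text: str) -> str:
    """Apply the three clean-up substitutions from the snippet above."""
    # A trailing space followed by a single-space line becomes a blank line.
    text = re.sub(' \n \n', '\n\n', '\n' + text + '\n')
    # A run of empty blockquote lines ("> ") collapses to a single one.
    text = re.sub('(\n> )+\n', '\n> \n', '\n' + text + '\n')
    # Spaces before a blank line are dropped and repeated blank lines merged.
    text = re.sub(' *\n\n+', '\n\n', text)
    return text.strip()


# Hand-written input mimicking the quoted output (assumes "> " lines).
noisy = "> Hello\n> \n> \n> \n> Hello again"
print(clean_markdown(noisy))
```

With that input the three empty quote lines shrink to one, leaving `> Hello`, an empty `> ` line, and `> Hello again`.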
jsm28 added a commit to jsm28/python-markdownify that referenced this pull request on Apr 9, 2024
There are various cases in which inline text fails to be separated by (sufficiently many) newlines from adjacent block content. A paragraph needs a blank line (two newlines) separating it from prior text, as does an underlined header; an ATX header needs a single newline separating it from prior text. A list needs at least one newline separating it from prior text, but in general two newlines (for an ordered list starting other than at 1, which will only be recognized given a blank line before).
To avoid accumulation of more newlines than necessary, take care when concatenating the results of converting consecutive tags to remove redundant newlines (keeping the greater of the number ending the prior text and the number starting the subsequent text).
This is thus an alternative to matthewwithanm#108 that tries to avoid the excess newline accumulation that was a concern there, as well as fixing more cases than just paragraphs, and updating tests.
Fixes matthewwithanm#92
Fixes matthewwithanm#98
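
The concatenation rule in this commit message (keep the greater of the trailing and leading newline counts rather than their sum) can be illustrated with a small sketch. This is not the code from the referenced commit; `join_converted` is a hypothetical helper written only to show the idea.

```python
def join_converted(prior: str, following: str) -> str:
    """Join two converted chunks, keeping max(trailing, leading) newlines.

    Hypothetical helper, not part of markdownify.
    """
    prior_body = prior.rstrip("\n")
    following_body = following.lstrip("\n")
    trailing = len(prior) - len(prior_body)          # newlines ending the prior text
    leading = len(following) - len(following_body)   # newlines starting the next text
    return prior_body + "\n" * max(trailing, leading) + following_body


# Two paragraphs that each ask for a blank line end up with exactly one.
print(repr(join_converted("Hello\n\n", "\n\nHello again\n\n")))
# Inline text followed by an ATX header gets the single newline it needs.
print(repr(join_converted("text", "\n# Header\n\n")))
```

Summing the counts instead would give four newlines between the two paragraphs, which is the kind of accumulation raised as a concern earlier in this conversation.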
Collaborator
Fixed by #120. Thanks for your PR and patience!
Wuhall pushed a commit to Wuhall/python-markdownify that referenced this pull request on May 21, 2025
Fixes #92 and is the only remaining code change I have (as opposed to wrapping Markdownify)