This is a giant change for remark. It replaces the 5+ year old internals with a new low-level parser: <https://github.com/micromark/micromark>. The old internals have served billions of users well over the years, but markdown has changed over that time. micromark comes with 100% CommonMark compliance (and GFM as an extension), and (WIP) docs on parsing rules for how to tokenize markdown with a state machine: <https://github.com/micromark/common-markup-state-machine>. micromark, and micromark in remark, is a good base for the future.

remark-parse

`remark-parse` now defers its work to [`micromark`][micromark] and [`mdast-util-from-markdown`][from-markdown]. `micromark` is a new, small, complete, and CommonMark compliant low-level markdown parser. `from-markdown` turns its tokens into the previously (and still) used syntax tree: [mdast][]. Extensions to `remark-parse` work differently: they’re a two-part act, with a syntax extension for `micromark` and a utility that turns the new tokens into mdast. See for example [`micromark-extension-footnote`][micromark-footnote] and [`mdast-util-footnote`][from-markdown-footnote].

* change: `commonmark` is no longer an option, it’s the default
* move: `gfm` is no longer an option, it moved to `remark-gfm`
* remove: `pedantic` is no longer an option, as this legacy and buggy flavor of markdown is no longer widely used
* remove: `blocks` is no longer an option, as it’s no longer suggested to change the internal list of HTML “block” tag names

remark-stringify

`remark-stringify` now defers its work to [`mdast-util-to-markdown`][to-markdown]. It’s a new and better serializer with powerful features to ensure serialized markdown represents the syntax tree (mdast), no matter what plugins do. Extensions to it work differently: see for example [`mdast-util-footnote`][to-markdown-footnote].

* change: `commonmark` is no longer an option, it’s the default
* change: `emphasis` now defaults to `*`
* change: `bullet` now defaults to `*`
* move: `gfm` is no longer an option, it moved to `remark-gfm`
* move: `tableCellPadding` moved to `remark-gfm`
* move: `tablePipeAlign` moved to `remark-gfm`
* move: `stringLength` moved to `remark-gfm`
* remove: `pedantic` is no longer an option, as this legacy and buggy flavor of markdown is no longer widely used
* remove: `entities` is no longer an option, as with CommonMark there is almost never a need to use character references: character escapes are preferred
* new: `quote` lets you prefer single quotes (`'`) over double quotes (`"`) in titles

Changes to output / the tree

All of these are for CommonMark compatibility. Most of them are inconsequential.

* **notable**: references (as in, links `[text][id]` and images `![alt][id]`) are no longer present as such in the syntax tree if they don’t have a corresponding definition (`[id]: example.com`). The reason for this is that CommonMark requires `[text *emphasis start][undefined] emphasis end*` to be emphasis.
* **notable**: it is no longer possible to use two blank lines to separate two lists, or a list and indented code. CommonMark prohibits it. As a workaround, use an empty comment (`<!---->`) to end lists
* inconsequential: whitespace at the start and end of lines in paragraphs is now ignored
* inconsequential: autolinks such as `<mailto:foobarbaz>` are now correctly parsed, and the scheme is part of the tree
* inconsequential: indented code can now follow a block quote w/o a blank line
* inconsequential: trailing indented blank lines after indented code are no longer part of that code
* inconsequential: character references and escapes are no longer present as separate text nodes
* inconsequential: character references which HTML allows but CommonMark doesn’t, such as `&copy` w/o the semicolon, are no longer recognized
* inconsequential: the `indent` field is no longer available on `position`
* fix: multiline setext headings
* fix: lazy lists
* fix: attention (emphasis, strong)
* fix: tabs
* fix: empty alt on images is now present as an empty string
* …plus a ton of other minor previous differences from CommonMark

For now

* get folks to use this and report problems!

Up next

* make `remark-gfm`
* start making next branches for plugins
* get types into {from,to}-markdown and use them here

Closes

Closes GH-218.
Closes GH-306.
Closes GH-315.
Closes GH-324.
Closes GH-398.
Closes GH-402.
Closes GH-407.
Closes GH-439.
Closes GH-450.
Closes GH-459.
Closes GH-493.
Closes GH-494.
Closes GH-497.
Closes GH-504.
Closes GH-517.
Closes GH-521.
Closes GH-523.
Closes remarkjs/remark-lint#111.

[micromark]: https://github.com/micromark/micromark
[from-markdown]: https://github.com/syntax-tree/mdast-util-from-markdown
[to-markdown]: https://github.com/syntax-tree/mdast-util-to-markdown
[micromark-footnote]: https://github.com/micromark/micromark-extension-footnote/blob/main/index.js
[to-markdown-footnote]: https://github.com/syntax-tree/mdast-util-footnote/blob/main/to-markdown.js
[from-markdown-footnote]: https://github.com/syntax-tree/mdast-util-footnote/blob/main/from-markdown.js
[mdast]: https://github.com/syntax-tree/mdast
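
As a rough illustration of the `remark-stringify` option changes described above, the following sketch sorts an old options object into what still applies, what moved to `remark-gfm`, and what is no longer an option. `migrateStringifyOptions`, the two lookup lists, and the shape of the returned object are hypothetical names made up for this example; they are not part of remark.

```javascript
// Hypothetical helper, not part of remark: classifies old `remark-stringify`
// options according to the change list in this pull request.
const optionsMovedToGfm = ['gfm', 'tableCellPadding', 'tablePipeAlign', 'stringLength']
const optionsNoLongerAccepted = ['commonmark', 'pedantic', 'entities']

function migrateStringifyOptions(oldOptions) {
  const result = {options: {}, movedToGfm: [], noLongerAccepted: []}

  for (const [key, value] of Object.entries(oldOptions)) {
    if (optionsMovedToGfm.includes(key)) {
      // These now belong to `remark-gfm`.
      result.movedToGfm.push(key)
    } else if (optionsNoLongerAccepted.includes(key)) {
      // CommonMark is the default; `pedantic` and `entities` were removed.
      result.noLongerAccepted.push(key)
    } else {
      // Everything else (such as `bullet`, `emphasis`, `quote`) is kept as is.
      result.options[key] = value
    }
  }

  return result
}

// Example input: an options object written for the previous serializer.
const migrated = migrateStringifyOptions({
  commonmark: true,
  gfm: true,
  tablePipeAlign: false,
  entities: 'escape',
  bullet: '-',
  quote: "'"
})

console.log(migrated)
```

The remaining `options` object is what would still be passed to `remark-stringify`; anything reported under `movedToGfm` needs `remark-gfm` instead.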
ChristianMurphy approved these changes on Oct 1, 2020
Update on the ecosystem

I checked with the community (see the referenced issues above). Most plugins are fine. I’m in contact with the authors of those that aren’t. Here is a breakdown of what is maintained in the remarkjs org.

Changes

These plugins have new versions which work with the new parser/compiler, but not with remark@prev.

Tiny changes

These plugins received a tiny update to match CommonMark, but otherwise work the same with remark@next and remark@prev.

No changes

These plugins did not need any update at all for remark@next.

Archived

Not used a lot, too much time in updating:
fisker added a commit to fisker/prettier that referenced this pull request on Oct 14, 2020
This is now released in
Martii added a commit to Martii/OpenUserJS.org that referenced this pull request on Oct 19, 2020

* Please read their CHANGELOGs
* *remark*, *remark-strip-html*, and *strip-markdown* are on hold since they are interdependent and need in-depth retesting. See craftzdog/remark-strip-html#2, remarkjs/remark#536, and remarkjs/strip-markdown@0ceb371#diff-5a831ea67cf5cf8703b0de46901ab25bd191f56b320053be9332d9a3b0d01d15
* *sanitize-html* CHANGELOG at https://github.com/apostrophecms/sanitize-html/blob/main/CHANGELOG.md#200-2020-09-23. We don't DOM insert, pro *node* is acceptable, and we override `allowedTags` to usually match GH.
* *spdx-license-ids* is going to take some time as a bunch of new ones have been added and need to be cross-checked/restricted. On hold.
* *moment* is in "maintenance mode" and deprecated. Will address this much later.
* Delete op retested
Martii added a commit to OpenUserJS/OpenUserJS.org that referenced this pull request on Oct 19, 2020
Thanks
Thanks to Salesforce, Gatsby, Vercel, and Netlify, and our other backers for sponsoring the work on micromark!
To support our continued work, back us on OpenCollective!