renderMarkdown doesn't do what we all think it does #15285
Description
Astro Info
Astro v5.16.0
Node v22.22.0
System macOS (arm64)
Package Manager pnpm
Output static
Adapter @astrojs/node (v9.5.1)
Integrations @astrojs/starlight (v0.36.2)
If this issue only occurs in one browser, which browser is a problem?
No response
Describe the Bug
I think I found a bug in how renderMarkdown works, or at least in how it's portrayed.
Summary
The RenderedContent object that it returns is supposed to look like
Promise<{
html: string;
metadata?: {
[key: string]: unknown;
imagePaths?: string[] | undefined;
headings?: MarkdownHeading[] | undefined;
frontmatter?: Record<string, any>;
} | undefined;
}>
It does return this structure, but the contents are mangled. For example:
{
"html": "<hr>\n<h2 id=\"title-headless-wordpress-toolkitdescription-a-modern-framework-agnostic-collection-of-plugins-and-packages-for-building-headless-wordpress-applications\">title: “Headless WordPress Toolkit”\ndescription: “A modern, framework-agnostic collection of plugins and packages for building headless [...]",
"metadata": {
"headings": [
{
"depth": 2,
"slug": "title-headless-wordpress-toolkitdescription-a-modern-framework-agnostic-collection-of-plugins-and-packages-for-building-headless-wordpress-applications",
"text": "title: “Headless WordPress Toolkit”\ndescription: “A modern, framework-agnostic collection of plugins and packages for building headless WordPress applications.”"
},
//...
],
//...
"frontmatter": {}
}
}
Important
Notice that the metadata => frontmatter field is empty, and both the html and metadata => headings fields include frontmatter content, which they shouldn't.
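A quick way to detect this symptom in a loader is sketched below. The `RenderedLike` type and the `hasLeakedFrontmatter` helper are hypothetical names used only for illustration; they are not part of Astro. The check assumes the raw source opens with a `---` frontmatter fence.

```typescript
// Hypothetical shapes that mirror only the fields this check needs.
interface HeadingLike {
  depth: number;
  slug: string;
  text: string;
}

interface RenderedLike {
  html: string;
  metadata?: {
    headings?: HeadingLike[];
    frontmatter?: Record<string, unknown>;
  };
}

// Returns true when the raw source had a frontmatter block but the
// rendered result reports no frontmatter keys at all.
function hasLeakedFrontmatter(raw: string, rendered: RenderedLike): boolean {
  const hadFrontmatter = /^---\r?\n[\s\S]*?\r?\n---(\r?\n|$)/.test(raw);
  const parsed = rendered.metadata?.frontmatter ?? {};
  return hadFrontmatter && Object.keys(parsed).length === 0;
}
```

This only flags the empty `frontmatter` field; it does not inspect `html` or `headings` for stray frontmatter text.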
Premise
Due to the promise of "it works like glob" from the docs and the fact that it returns this structured object, I'd expect it to return that structure with correct contents. However, after digging into what glob actually does, I think this function is either not very useful as it stands or inaccurately documented.
- Glob separately parses the MD file with, I believe, parseFrontmatter from @astrojs/markdown-remark.
- Then passes the result of that function + the raw code to the processor generated by createMarkdownProcessor.
- That processor takes 2 arguments, the raw code and the result of parseFrontmatter. It then combines those 2 things into a single structured response in the format of RenderedContent.
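The first step of that flow, separating frontmatter from the body before anything is rendered, can be sketched as follows. This is a simplified stand-in and not Astro's code: `splitFrontmatter` is a hypothetical helper that only understands flat `key: value` lines, whereas real frontmatter is YAML.

```typescript
type Frontmatter = Record<string, string>;

interface SplitResult {
  frontmatter: Frontmatter;
  content: string;
}

// Hypothetical helper: separates a leading `---` block from the Markdown body.
// Only flat `key: value` pairs are handled; this is NOT a YAML parser.
function splitFrontmatter(raw: string): SplitResult {
  const match = /^---\r?\n([\s\S]*?)\r?\n---(?:\r?\n|$)/.exec(raw);
  if (!match) return { frontmatter: {}, content: raw };

  const frontmatter: Frontmatter = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not key/value pairs
    const key = line.slice(0, idx).trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(idx + 1).trim().replace(/^["']|["']$/g, "");
    if (key) frontmatter[key] = value;
  }

  // The body is everything after the closing fence.
  return { frontmatter, content: raw.slice(match[0].length) };
}
```

The body and the parsed frontmatter would then be handed to the renderer as two separate arguments, so the frontmatter never reaches the Markdown-to-HTML step.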
Conclusion
The problem is that in the renderMarkdown implementation, only the raw MD is passed, not the parsed MD resulting from parseFrontmatter. Thus the result is incomplete data and a mangled response.
Some experimenting shows that passing the parsed body (i.e. content) from the parseFrontmatter response as the first argument, and the rest of the parseFrontmatter result as the second, to the remark processor generates a correct response, e.g.,
const rawFile = await fetchFile()
const {content, ...parsedFile} = parseFrontmatter(rawFile);
const rendered = await renderMarkdown({ content, data: parsedFile }); // this is the modified function (via pnpm patch)
Solutions
I see two possible explanations or solutions.
Option 1
I'm wrong, and renderMarkdown was never intended to parse frontmatter, only to convert clean MD content into HTML. While not impossible, I find this unlikely for several reasons:
- The documentation sets the expectation that this works "just like glob". Glob correctly handles frontmatter and more.
- If all this function was designed to do was convert an MD string to an HTML string, it'd return a string, not a complex structured object identical to the one glob produces.
If this is the case, documentation should be updated and more info included on how to actually mimic glob.
Option 2
I'm right, and due to either unnoticed breaking changes or bugs, this function needs fixing.
In that case, I believe we could update the internals of renderMarkdown to correctly execute parseFrontmatter.
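As a rough illustration of that idea, the sketch below composes a frontmatter parser and a renderer so that callers only ever pass the raw file. `withFrontmatter`, `Parsed`, and both injected function signatures are hypothetical; they are not Astro's internals, and the real parseFrontmatter result may carry more fields than shown here.

```typescript
// Hypothetical minimal shape of a frontmatter parse result.
interface Parsed<F> {
  content: string;
  frontmatter: F;
}

// Builds a function that parses frontmatter first, then renders only the body,
// forwarding the parsed frontmatter as a separate option.
// R may be a Promise when the injected renderer is async.
function withFrontmatter<F, R>(
  parse: (raw: string) => Parsed<F>,
  render: (content: string, options: { frontmatter: F }) => R,
): (raw: string) => R {
  return (raw) => {
    const { content, frontmatter } = parse(raw);
    return render(content, { frontmatter });
  };
}
```

In Astro, the injected functions would presumably be parseFrontmatter and the render method of the processor created by createMarkdownProcessor; that wiring is an assumption and is not shown or tested here.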
Outstanding Questions
Answered below: glob handles parsing for "Markdown, MDX, Markdoc, JSON, YAML, and TOML". I'd expect this function to not handle the latter 3. But what about MDX and Markdoc? Are those separate parsers or included within renderMarkdown? If not, how can we solve them?
Related Issues
When looking for existing issues I stumbled across #14620. While it relates to images instead of frontmatter, it seems likely that it's caused by the same or a related bug.
Contribution
I'm happy to contribute something as this is actively hindering our ability to implement remote Markdown docs integration with Starlight.
What's the expected result?
I would expect the RenderedContent response from renderMarkdown to correctly parse my markdown and return a valid response with all relevant data.
Link to Minimal Reproducible Example
https://stackblitz.com/edit/github-fwjqules?file=src%2FexampleLoader.ts
Participation
- I am willing to submit a pull request for this issue.