Markdoc support in content collections #496

@matthewp

Summary

This is a proposal to add Markdoc support to content collections.

Background & Motivation

We've received multiple user requests for Markdoc support since its public launch. In fact, we've seen early community projects bringing Markdoc to robust project themes.

Markdoc is also designed to solve existing limitations of MDX in Astro.

  1. Performance suffers at scale. Unlike plain Markdown that outputs a string of HTML, MDX outputs a JavaScript module of JSX components. This requires Astro and Babel preprocessing for even the simplest MDX documents. Notably, this required a bump to our maximum memory usage when building / deploying docs.astro.build after migrating to MDX.
  2. Your content is tied to your UI. MDX can import styles and components directly, which is convenient from a developer standpoint. However, this causes issues when content needs to be reused in multiple contexts. A common example is RSS, where you may want a component-rich version from your blog and a simplified HTML output for your RSS feed.

Markdoc is built to solve (2) by separating content from the components, styles, and assets you choose to render. You can use an Astro component renderer on your site, Markdoc's own HTML renderer for RSS, or even write your own renderer to traverse Markdoc pages yourself. Problem (1) is something we're excited to test, and it requires a thorough performance benchmark.
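To make the "one piece of content, several renderers" idea concrete, here is a minimal sketch. It is a toy model: the `ContentNode` and `TagRenderer` types and the `render` function are hypothetical names invented for illustration and are not Markdoc's or Astro's API. It only shows how the same tree can produce simplified HTML for RSS and component-rich output for a site.

```typescript
// Toy model for illustration: these types are hypothetical, not Markdoc's API.
type ContentNode =
  | string
  | { name: string; attributes: Record<string, string>; children: ContentNode[] };

type TagRenderer = (attributes: Record<string, string>, children: string) => string;

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Walks the tree once. A tag listed in `components` is handed to that
// renderer; every other tag falls back to a plain HTML element.
function render(node: ContentNode, components: Record<string, TagRenderer> = {}): string {
  if (typeof node === 'string') return escapeHtml(node);
  const children = node.children.map((child) => render(child, components)).join('');
  const component = components[node.name];
  if (component) return component(node.attributes, children);
  const attrs = Object.entries(node.attributes)
    .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
    .join('');
  return `<${node.name}${attrs}>${children}</${node.name}>`;
}

const post: ContentNode = { name: 'h1', attributes: {}, children: ['Hello & welcome'] };

// Simplified output, e.g. for an RSS feed.
const rssHtml = render(post);
// Component-rich output for the site, built from the same content.
const siteHtml = render(post, {
  h1: (_attributes, children) => `<h1 class="title">${children}</h1>`,
});
```

The content tree never references a component directly, which is the property that lets each context choose its own rendering.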

The content collections API was built generically to support this future, choosing format-agnostic naming like data instead of frontmatter and body instead of rawContent. Because of this, introducing new authoring formats is possible without breaking changes.

Goals

  • Create an @astrojs/markdoc integration that adds .mdoc support to content collections.
  • Support Astro components and server-rendered UI components (React, Vue, Svelte, etc) within Markdoc files. Note this excludes client-rendered UI components (see non-goals).
  • Benchmark Markdoc performance against MDX at 1000+ documents. This addresses problem (1) from the previous section. Metrics to compare: SSG build time, SSG and SSR build memory usage, SSR response speed.
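As a rough idea of how the timing and memory metrics could be collected in Node, here is a small sketch. The `measure` helper and `Measurement` type are hypothetical names, not part of Astro, and heap deltas are only a coarse signal because garbage collection can run at any point. A real benchmark would wrap full SSG builds and SSR requests rather than a single callback.

```typescript
import { performance } from 'node:perf_hooks';

// Hypothetical result shape for one benchmark run.
type Measurement = { label: string; durationMs: number; heapDeltaBytes: number };

// Runs `task` once and records wall-clock time plus the change in used heap.
async function measure(label: string, task: () => Promise<void> | void): Promise<Measurement> {
  const heapBefore = process.memoryUsage().heapUsed;
  const start = performance.now();
  await task();
  const durationMs = performance.now() - start;
  const heapDeltaBytes = process.memoryUsage().heapUsed - heapBefore;
  return { label, durationMs, heapDeltaBytes };
}
```

Running the same task against an MDX project and a Markdoc project with an identical document count would give comparable numbers for each metric.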

Non-goals

  • ESM import and src/pages/ support for Markdoc files. See discussion for context.
  • Allowing the .md extension. This would mean overriding Astro's .md renderer, which is tightly coupled to remark and your markdown configuration options today. We agree using .md for Markdoc is a common use case, and deserves a separate proposal to make Astro's Markdown rendering flexible.
  • A solution for client-rendered UI components. Unlike MDX, Markdoc doesn't have a concept of directives, and our compiler doesn't have a clear way to dynamically render client-side components (see challenges). We will recommend users wrap their components in an Astro component to apply the client: directive.
  • Full parity with the rendered result of Markdown and MDX. Namely, the computed headings property (which can be tackled in future releases) and frontmatter manipulation via remark (since remark is incompatible with Markdoc) are out of scope.

Example implementation

Markdoc will be introduced as an integration. To standardize our process for adding new collection entry types, we may experiment with a (private) integration helper internally. This example shows an addContentEntryType hook that sets up the .mdoc extension and attaches logic for parsing the data and body properties:

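// @astrojs/markdoc/index.ts
import type { AstroIntegration } from 'astro';

// Default export, so `import markdoc from '@astrojs/markdoc'` works.
export default function markdoc(): AstroIntegration {
	return {
		name: '@astrojs/markdoc',
		hooks: {
			// `addContentEntryType` is the proposed (private) helper.
			'astro:config:setup'({ addContentEntryType }) {
				addContentEntryType({
					extensions: ['.mdoc'],
					parser: '@astrojs/markdoc/contentEntryParser',
				});
			},
		},
	};
}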

// @astrojs/markdoc/contentEntryParser.ts
import parseFrontmatter from 'gray-matter';
export default {
	getEntryInfo({ contents }) {
		const parsed = parseFrontmatter(contents);
		return {
			// The unparsed data object that can be passed to a Zod schema.
			data: parsed.data,
			// The body of the data file. This should be the raw file contents with metadata (i.e. frontmatter block) stripped
			body: parsed.content,
			// (Optional) The untouched frontmatter block with newlines and formatting preserved. Used for computing error line hints.
			rawData: parsed.matter,
		}
	}
}

// astro.config.mjs
import markdoc from '@astrojs/markdoc';

export default {
	integrations: [markdoc()],
}
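To show what the getEntryInfo contract returns without pulling in gray-matter, here is a dependency-free sketch. It is a simplified stand-in, not the integration's implementation: it assumes a leading `---` block of flat `key: value` lines and does not parse real YAML. The `EntryInfo` type name is hypothetical.

```typescript
// Hypothetical shape mirroring the `data`, `body`, and `rawData` fields above.
type EntryInfo = { data: Record<string, string>; body: string; rawData: string };

// Simplified stand-in for gray-matter: flat `key: value` pairs only.
function getEntryInfo({ contents }: { contents: string }): EntryInfo {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(contents);
  // No frontmatter block: the whole file is the body.
  if (!match) return { data: {}, body: contents, rawData: '' };

  const rawData = match[1];
  const data: Record<string, string> = {};
  for (const line of rawData.split(/\r?\n/)) {
    const separator = line.indexOf(':');
    if (separator === -1) continue; // skip blank or malformed lines
    data[line.slice(0, separator).trim()] = line.slice(separator + 1).trim();
  }
  // The body is everything after the closing `---` fence.
  return { data, body: contents.slice(match[0].length), rawData };
}

const info = getEntryInfo({ contents: '---\ntitle: Hello\n---\n# Heading\n' });
// info.data is { title: 'Hello' }, info.body is '# Heading\n', info.rawData is 'title: Hello'
```

The `data` object is what would be passed to a Zod schema, while `rawData` keeps the original formatting so that error messages can point at the right line.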

Example usage

Say you've authored a collection of blog posts using Markdoc. You can store these entries as a blog collection, identically to Markdown or MDX:

src/content/
	blog/
		# Could also use `.md`
		post-1.mdoc
		post-2.mdoc
		post-3.mdoc
...

Then, you can query entry frontmatter with the same getCollection() and getEntryBySlug() APIs:

---
import { getCollection, getEntryBySlug } from 'astro:content';

const blog = await getCollection('blog');
const firstEntry = await getEntryBySlug('blog', 'post-1');
---

Users should also be free to render Markdoc contents using a Content component. This will be exposed from the render() result and accept two props:

  • components?: Record<string, ComponentRenderer>: A mapping from Markdoc tags or elements to Astro components.
  • config?: import('@markdoc/markdoc').Config: An (optional) Markdoc config to be used during the transformation step.
---
import Title from '../components/Title.astro';
import Marquee from '../components/Marquee.astro';
import { getEntryBySlug } from 'astro:content';

const mdocEntry = await getEntryBySlug('blog', 'test');
const { Content } = await mdocEntry.render();
---

<html lang="en">
	<body>
		<Content
			config={{
				variables: { underlineTitle: true },
			}}
			components={{
				h1: Title,
				marquee: Marquee,
			}}
		/>
	</body>
</html>

Sharing config

This solution is flexible, but we expect users to reuse config and components across their project. For this, we will recommend creating a utility component to encapsulate that config. Here is one example that can render any blog collection entry with an {% aside /%} shortcode:

---
// src/components/BlogContent.astro
import Aside from './Aside.astro';
import type { CollectionEntry } from 'astro:content';

type Props = {
	entry: CollectionEntry<'blog'>;
};

const { entry } = Astro.props;
const { Content } = await entry.render();
---

<Content
	config={{
		tags: {
			aside: {
				render: 'Aside',
				attributes: {
					type: { type: String },
					title: { type: String },
				},
			},
		},
	}}
	components={{ Aside }}
/>

Now, you can pass any blog collection entry to render the result with this config:

---
import { getEntryBySlug } from 'astro:content';
import BlogContent from '../components/BlogContent.astro';

const mdocEntry = await getEntryBySlug('blog', 'test');
---

<h1>{mdocEntry.data.title}</h1>
<BlogContent entry={mdocEntry} />

See this example video for more.

Advanced use case: component prop mapping

Component renderers can also include a props() function to map Markdoc attributes and AST entries to component props. This is useful when:

  • computing props based on the Markdoc AST
  • mapping Markdoc's generated attributes to prop names

This example maps Markdoc's generated data-language attribute for code blocks to the lang prop used by Astro's Code component, and stringifies the contents to HTML for use with Shiki:

---
import { Code } from 'astro/components';
import Title from '../components/Title.astro';
import Markdoc from '@markdoc/markdoc';
...
---

...
<Content
	components={{
		h1: Title,
		pre: {
			component: Code,
			props({ attributes, getTreeNode }) {
				return {
					lang: attributes['data-language'],
					code: Markdoc.renderers.html(getTreeNode().children),
				};
			},
		},
	}}
/>
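One way to picture how a renderer with a props() function might be resolved is sketched below. All types here (`MdocNode`, `PropsContext`, `Component`, `ComponentRenderer`) and the `renderNode` function are hypothetical and only loosely modeled on the example above; components are reduced to plain functions returning strings so the snippet runs on its own.

```typescript
// Hypothetical shapes for illustration; not the integration's real types.
type MdocNode = {
  name: string;
  attributes: Record<string, unknown>;
  children: Array<MdocNode | string>;
};
type PropsContext = { attributes: Record<string, unknown>; getTreeNode: () => MdocNode };
type Component = (props: Record<string, unknown>) => string;
type ComponentRenderer =
  | Component
  | { component: Component; props: (context: PropsContext) => Record<string, unknown> };

// A bare component receives the node's attributes unchanged; a renderer
// object computes its props from the attributes and the tree node first.
function renderNode(node: MdocNode, renderer: ComponentRenderer): string {
  if (typeof renderer === 'function') return renderer(node.attributes);
  const context: PropsContext = { attributes: node.attributes, getTreeNode: () => node };
  return renderer.component(renderer.props(context));
}

const codeBlock: MdocNode = {
  name: 'pre',
  attributes: { 'data-language': 'js' },
  children: ['const a = 1;'],
};

// Stand-in for a code component that expects `lang` and `code` props.
const Code: Component = (props) =>
  `<pre data-lang="${String(props.lang)}">${String(props.code)}</pre>`;

const rendered = renderNode(codeBlock, {
  component: Code,
  props: ({ attributes, getTreeNode }) => ({
    lang: attributes['data-language'],
    code: getTreeNode()
      .children.filter((child): child is string => typeof child === 'string')
      .join(''),
  }),
});
```

Keeping the mapping in props() means the component itself stays unaware of Markdoc's attribute names.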

Status: Implemented