Referencing data from content collections #530

@bholmesdev

Summary

Introduce a standard to store data separately from your content (ex. JSON files), with a way to "reference" this data from existing content collections.

Background & Motivation

Content collections are restricted to supporting .md, .mdx, and .mdoc files. This is limiting for other forms of data you may need to store, namely raw data formats like JSON.

Taking a blog post as an example, there will likely be author information that's reused across multiple blog posts. To keep updates consistent when, say, changing an author's profile picture, it's best to store authors in a separate data entry, with an API to reference this data from any blog post by ID.
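
To make the motivation concrete, here is a minimal sketch in plain TypeScript. It is not a proposed API: the `authors` map, the `posts` array, and the `getAuthor` helper are hypothetical stand-ins for a data collection, a content collection, and a lookup by ID.

```typescript
// Hypothetical shapes for illustration only; not part of any finalized API.
interface Author {
  name: string;
  avatar: string;
}

interface BlogPost {
  title: string;
  author: string; // ID of an entry in `authors`
}

// Stand-in for a data collection: one entry per author, keyed by ID.
const authors = new Map<string, Author>([
  ["jane-doe", { name: "Jane Doe", avatar: "/avatars/jane-v1.png" }],
]);

// Stand-in for a content collection whose posts store only the author ID.
const posts: BlogPost[] = [
  { title: "First post", author: "jane-doe" },
  { title: "Second post", author: "jane-doe" },
];

// Look up the referenced author, failing loudly on an unknown ID.
function getAuthor(post: BlogPost): Author {
  const author = authors.get(post.author);
  if (author === undefined) {
    throw new Error(`Unknown author ID: ${post.author}`);
  }
  return author;
}

// Updating the author entry once is reflected in every post that references it.
authors.set("jane-doe", { name: "Jane Doe", avatar: "/avatars/jane-v2.png" });
const avatars = posts.map((post) => getAuthor(post).avatar);
```

Because each post holds only an ID, there is a single place to change the profile picture, which is the behavior the reference API is meant to standardize.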

The content collections API was built generically to support this future, choosing format-agnostic naming like data instead of frontmatter and body instead of rawContent. Because of this, expanding support to new data formats without API changes is a natural progression.
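
As a rough illustration of why this naming carries over, the sketch below uses a hypothetical `Entry` type and `fromJson` helper (neither is the actual implementation). A Markdown entry and a JSON entry can expose the same `data` property, while `body` only applies to formats that have content beyond their data.

```typescript
// Hypothetical entry shape; assumes `data` and `body` keep their current meaning.
interface Entry<T> {
  id: string;
  data: T; // frontmatter for Markdown, the parsed object for JSON
  body?: string; // raw content; absent for pure data formats
}

// Illustrative helper: build an entry from the raw text of a JSON file.
function fromJson<T>(id: string, raw: string): Entry<T> {
  return { id, data: JSON.parse(raw) as T };
}

const markdownEntry: Entry<{ title: string }> = {
  id: "hello-world",
  data: { title: "Hello world" },
  body: "# Hello world\n",
};

const jsonEntry = fromJson<{ name: string }>("jane-doe", '{"name": "Jane Doe"}');
```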

Use cases

We have a few use cases in mind for data collections and data references. We expect this list to grow through the RFC discussion and learning from our community!

  • Blog post meta info. Common cases include author bios, project contributors, and tags
  • i18n translations. Many content sites and translation libraries work from key / value pairs stored as JSON. For example, an i18n/ collection containing en.json, fr.json, etc.
  • Image asset metadata. You may want to reference reusable alt text or image widths and heights for standard assets. For example, an images/banner.json file containing the src as a string, alt text, and a preferred width
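
The i18n use case above might look something like the following sketch. The `i18n` object stands in for the parsed contents of `en.json` and `fr.json`, and the `translate` helper, including its fallback to English, is an assumption made for illustration rather than anything this RFC specifies.

```typescript
type Translations = Record<string, string>;

// Stand-ins for the parsed contents of i18n/en.json and i18n/fr.json.
const i18n: Record<string, Translations> = {
  en: { greeting: "Hello", farewell: "Goodbye" },
  fr: { greeting: "Bonjour" },
};

// Hypothetical lookup: try the requested locale, then the fallback locale,
// and finally return the key itself so missing strings are easy to spot.
function translate(locale: string, key: string, fallbackLocale = "en"): string {
  return i18n[locale]?.[key] ?? i18n[fallbackLocale]?.[key] ?? key;
}
```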

Goals

  • Introduce JSON collection support, configurable and queryable with similar APIs to content collections.
  • Determine where data collections are stored. We may introduce a new src/data/ directory distinct from src/content/, or simply allow data collections within src/content/.
  • Introduce an API to reference this data from existing content collections by ID. This is based on the strongest user need for data collections: referencing metadata (ex. pull in post authors from a blog post).
  • Consider both one-to-one and one-to-many relationships between content and data (ex. allow passing a list of author IDs in your frontmatter).
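
The last goal could be sketched as follows. This is plain TypeScript under stated assumptions, not a proposed API: `resolveOne` and `resolveAuthors` are hypothetical helpers, and the `authors` map stands in for a data collection keyed by ID.

```typescript
interface Author {
  name: string;
}

// Stand-in for an authors data collection keyed by ID.
const authors = new Map<string, Author>([
  ["jane-doe", { name: "Jane Doe" }],
  ["sam-lee", { name: "Sam Lee" }],
]);

// Resolve a single ID, failing loudly when the referenced entry is missing.
function resolveOne(id: string): Author {
  const author = authors.get(id);
  if (author === undefined) {
    throw new Error(`Unknown author ID: ${id}`);
  }
  return author;
}

// Accept either a single ID (one-to-one) or a list of IDs (one-to-many).
function resolveAuthors(ref: string | string[]): Author[] {
  const ids = Array.isArray(ref) ? ref : [ref];
  return ids.map(resolveOne);
}

const single = resolveAuthors("jane-doe");
const many = resolveAuthors(["jane-doe", "sam-lee"]);
```

Failing on an unknown ID is a design choice in this sketch; it surfaces broken references early instead of rendering a post with a missing author.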

Non-goals

  • User-facing APIs to introduce new data collection formats like YAML or TOML. We recognize the value of community plugins to introduce new formats, and we will experiment with a pluggable API internally. Still, a finalized user-facing API is considered out of scope.

Status

Implemented
