Referencing data from content collections #530

## Details 
- Accepted Date: 24/03/23
- Reference Issues/Discussions: https://github.com/withastro/roadmap/discussions/477, https://github.com/withastro/roadmap/discussions/525
- Author: @bholmesdev
- Core Champions: @bholmesdev, @tony-sull
- Implementation PR: 

## Summary

Introduce a standard to store data separately from your content (ex. JSON files), with a way to "reference" this data from existing content collections.

## Background & Motivation

Content collections are restricted to supporting `.md`, `.mdx`, and `.mdoc` files. This is limiting for other forms of data you may need to store, namely raw data formats like JSON.

Taking a blog post as the example, there will likely be author information that's reused across multiple blog posts. So that a change like a new author profile picture only needs to be made in one place, it's best to store authors in a separate data entry, with an API to reference this data from any blog post by ID.
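To make the motivation concrete, here is a minimal sketch of the "store once, reference by ID" idea in plain TypeScript. It does not use any Astro API; the names `Author`, `BlogPost`, `authorId`, `authors`, and `getAuthor` are hypothetical and exist only for illustration.

```typescript
// Hypothetical shapes for illustration only; not part of this proposal's API.
interface Author {
  name: string;
  avatar: string;
}

interface BlogPost {
  title: string;
  authorId: string; // points at a key in `authors` instead of copying author data
}

// Authors live in one place, keyed by ID (imagine one JSON file per key).
const authors: Record<string, Author> = {
  "jane-doe": { name: "Jane Doe", avatar: "/avatars/jane-v1.png" },
};

const posts: BlogPost[] = [
  { title: "First post", authorId: "jane-doe" },
  { title: "Second post", authorId: "jane-doe" },
];

// Look up the shared author entry for a post, failing loudly on a bad ID.
function getAuthor(post: BlogPost): Author {
  const author = authors[post.authorId];
  if (!author) throw new Error(`Unknown author ID: ${post.authorId}`);
  return author;
}

// A single update to the shared entry is visible from every post that references it.
authors["jane-doe"].avatar = "/avatars/jane-v2.png";
const avatars = posts.map((post) => getAuthor(post).avatar);
```

If each post carried its own copy of the author object, the same change would have to be repeated in every post.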

The content collections API was built generically to support this future, choosing format-agnostic naming like `data` instead of `frontmatter` and `body` instead of `rawContent`. Because of this, expanding support to new data formats without API changes is a natural progression.

### Use cases

We have a few use cases in mind for data collections and data references. We expect this list to grow through the RFC discussion and learning from our community!

- **Blog post meta info.** Common cases include author bios, project contributors, and tags.
- **i18n translations.** Many content sites and translation libraries work from key / value pairs stored as JSON. For example, an `i18n/` collection containing `en.json`, `fr.json`, etc.
- **Image asset metadata.** You may want to reference reusable `alt` text or image widths and heights for standard assets. For example, an `images/banner.json` file containing the `src` as a string, alt text, and a preferred `width`.
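As a rough illustration of the i18n and image metadata cases, the entries could be modeled like this. This is a sketch under assumptions: the `Translations` and `ImageMeta` types, the `translate` helper, and its fallback-to-`en` behavior are all hypothetical and not something this proposal specifies.

```typescript
// Hypothetical shape of one i18n entry, e.g. the contents of i18n/en.json.
type Translations = Record<string, string>;

// Hypothetical shape of one image entry, e.g. the contents of images/banner.json.
interface ImageMeta {
  src: string;
  alt: string;
  width: number;
}

// In-memory stand-ins for the JSON files, keyed by entry ID.
const i18n: Record<string, Translations> = {
  en: { greeting: "Hello" },
  fr: { greeting: "Bonjour" },
};

const images: Record<string, ImageMeta> = {
  banner: { src: "/assets/banner.png", alt: "Site banner", width: 1200 },
};

// Look up a translation key. Falling back to another locale, and finally to
// the key itself, is an assumption made for this example.
function translate(locale: string, key: string, fallbackLocale = "en"): string {
  return i18n[locale]?.[key] ?? i18n[fallbackLocale]?.[key] ?? key;
}

const frenchGreeting = translate("fr", "greeting");
const bannerAlt = images["banner"].alt;
```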

## Goals
- **Introduce JSON collection support,** configurable and queryable with similar APIs to content collections.
- **Determine where data collections are stored.** We may introduce a new `src/data/` directory distinct from `src/content/`, or simply allow data collections within `src/content/`.
- **Introduce an API to reference this data from existing content collections by ID.** This is based on the strongest user need for data collections: referencing metadata (ex. pull in post authors from a blog post).
- **Consider both one-to-one and one-to-many relationships** between content and data (ex. allow passing a list of author IDs in your frontmatter).
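The two relationship shapes named in the last goal can be sketched as follows. This is plain TypeScript rather than a proposed API: `PostFrontmatter`, `author`, `coAuthors`, `resolveOne`, and `resolveMany` are hypothetical names, and throwing on an unknown ID is an assumption about how invalid references might be handled.

```typescript
type Id = string;

interface AuthorEntry {
  name: string;
}

// Hypothetical frontmatter: one required reference and an optional list of references.
interface PostFrontmatter {
  title: string;
  author: Id; // one-to-one
  coAuthors?: Id[]; // one-to-many
}

// Stand-in for a data collection of authors, keyed by entry ID.
const authorCollection = new Map<Id, AuthorEntry>([
  ["jane-doe", { name: "Jane Doe" }],
  ["sam-lee", { name: "Sam Lee" }],
]);

// Resolve a single ID, rejecting references that point at nothing.
function resolveOne(id: Id): AuthorEntry {
  const entry = authorCollection.get(id);
  if (!entry) throw new Error(`Unknown author ID: ${id}`);
  return entry;
}

// Resolve a list of IDs, preserving the order given in the frontmatter.
function resolveMany(ids: Id[] = []): AuthorEntry[] {
  return ids.map(resolveOne);
}

const frontmatter: PostFrontmatter = {
  title: "Data collections",
  author: "jane-doe",
  coAuthors: ["sam-lee", "jane-doe"],
};

const primaryAuthor = resolveOne(frontmatter.author);
const coAuthorNames = resolveMany(frontmatter.coAuthors).map((a) => a.name);
```

Resolving through one shared function keeps the validation rule identical for both cardinalities.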

## Non-goals

- User-facing APIs to introduce new data collection formats like YAML or TOML. We recognize the value of community plugins to introduce new formats, and we will experiment with a pluggable API internally. Still, a finalized user-facing API is considered out of scope.
