brandonhimpfen/code-block-extractor

code-block-extractor

Extract fenced code blocks from Markdown.

This package provides a small, dependency-free way to find and extract fenced code blocks from Markdown content. It is useful for content pipelines, documentation tooling, static site workflows, and any system that needs to inspect or reuse code examples embedded in Markdown.

Why this project exists

Markdown is widely used across documentation, notes, blog posts, and developer tooling. In many workflows, code blocks are not just presentation. They are data.

You may need to:

  • collect all code examples from a document.
  • detect which languages are used.
  • run checks against embedded snippets.
  • transform code samples into structured output.

This package makes that straightforward.

Install

npm install code-block-extractor

Example

import { extractCodeBlocks } from "code-block-extractor";

const markdown = `
# Example

Here is JavaScript:

\`\`\`js
console.log("Hello, world!");
\`\`\`

And here is Python:

\`\`\`python
print("Hello, world!")
\`\`\`
`;

const blocks = extractCodeBlocks(markdown);

console.log(blocks);

Result:

[
  {
    index: 0,
    language: "js",
    code: "console.log(\"Hello, world!\");",
    fence: "```",
    raw: "```js\nconsole.log(\"Hello, world!\");\n```"
  },
  {
    index: 1,
    language: "python",
    code: "print(\"Hello, world!\")",
    fence: "```",
    raw: "```python\nprint(\"Hello, world!\")\n```"
  }
]
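
Because each result is a plain object, ordinary array methods are enough to answer questions such as which languages a document uses. The sketch below works on a hand-written array in the shape shown above instead of calling the package, so it runs on its own. `listLanguages` is a hypothetical helper name, not part of the package.

```javascript
// Sample data copied from the Result above (trimmed to the fields used here).
const blocks = [
  { index: 0, language: "js", code: 'console.log("Hello, world!");' },
  { index: 1, language: "python", code: 'print("Hello, world!")' },
];

// Hypothetical helper: distinct languages in order of first appearance.
// Blocks whose language is null are skipped.
function listLanguages(extracted) {
  const seen = new Set();
  for (const block of extracted) {
    if (block.language !== null) seen.add(block.language);
  }
  return [...seen];
}

console.log(listLanguages(blocks)); // [ 'js', 'python' ]
```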

API

extractCodeBlocks(markdown, options?)

Returns an array of extracted code blocks.

Parameters

  • markdown (string) - The Markdown content to inspect.
  • options (object, optional)
    • includeRaw (boolean, default true) - Include the raw fenced block in the result.
    • trim (boolean, default true) - Trim leading and trailing newlines from extracted code.
    • languages (string[], optional) - Only return code blocks matching the given languages.
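
As an illustration of how the options combine, the sketch below asks for JavaScript blocks only and leaves out the raw text. The extractor is passed in as an argument so the snippet runs without the package installed; in real use you would pass `extractCodeBlocks`. `extractJavaScript` and `stubExtract` are hypothetical names, and the stub is an assumption that imitates the documented options for fixed sample data only.

```javascript
// Hypothetical wrapper: pass `extractCodeBlocks` as `extract` in real use.
function extractJavaScript(extract, markdown) {
  return extract(markdown, {
    languages: ["js", "javascript"], // only blocks tagged with these languages
    includeRaw: false,               // leave `raw` out of each result
    trim: true,                      // the default, shown for completeness
  });
}

// Stand-in extractor so the sketch is self-contained. It ignores the
// Markdown argument and returns fixed blocks, applying only the
// `languages` and `includeRaw` options.
function stubExtract(_markdown, options = {}) {
  const all = [
    { index: 0, language: "js", code: "let a = 1;", fence: "~~~", raw: "~~~js\nlet a = 1;\n~~~" },
    { index: 1, language: "python", code: "a = 1", fence: "~~~", raw: "~~~python\na = 1\n~~~" },
  ];
  return all
    .filter((block) => !options.languages || options.languages.includes(block.language))
    .map((block) => {
      if (options.includeRaw !== false) return block;
      const { raw, ...withoutRaw } = block;
      return withoutRaw;
    });
}

const jsBlocks = extractJavaScript(stubExtract, "ignored by the stub");
console.log(jsBlocks.map((block) => block.language)); // [ 'js' ]
```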

Returns

An array of objects with:

  • index - Zero-based index of the extracted block.
  • language - Language taken from the fence's info string, normalized to lowercase, or null if the fence has none.
  • code - Extracted code.
  • fence - Fence marker used, such as ``` or ~~~.
  • raw - Raw fenced block, if includeRaw is enabled.

Design notes

This parser focuses on standard fenced Markdown code blocks using backticks or tildes. It is intentionally lightweight and does not attempt to fully parse Markdown documents or handle every edge case from every Markdown flavor.

The goal is a practical extractor that is easy to understand and easy to use.
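
To give a feel for what a lightweight, line-based approach can look like, here is a minimal sketch. It is not the package's implementation: `scanFencedBlocks` is a hypothetical function, and it makes simplifying assumptions (unclosed blocks are dropped, and only the first word of the info string is used as the language).

```javascript
// Illustrative sketch only, NOT the package's actual code.
// An opening fence: up to 3 spaces, then 3+ backticks or 3+ tildes,
// then an optional info string.
const openRe = /^ {0,3}(`{3,}|~{3,})(.*)$/;

function scanFencedBlocks(markdown) {
  const blocks = [];
  let open = null; // the block currently being collected, if any

  for (const line of markdown.split(/\r?\n/)) {
    if (open === null) {
      const match = openRe.exec(line);
      if (!match) continue;
      const fence = match[1];
      const info = match[2].trim();
      // A backtick fence may not have backticks in its info string.
      if (fence[0] === "`" && info.includes("`")) continue;
      const firstWord = info.split(/\s+/)[0];
      open = { fence, language: firstWord ? firstWord.toLowerCase() : null, body: [] };
      continue;
    }

    // A closing fence uses the same character, is at least as long as
    // the opening fence, and has nothing else on the line.
    const trimmed = line.trim();
    const isClose =
      trimmed.length >= open.fence.length &&
      [...trimmed].every((ch) => ch === open.fence[0]);

    if (isClose) {
      blocks.push({
        index: blocks.length,
        language: open.language,
        code: open.body.join("\n"),
        fence: open.fence,
      });
      open = null;
    } else {
      open.body.push(line);
    }
  }
  return blocks; // a block still open at the end of input is discarded
}

const sample = ["# Title", "~~~JS", "let x = 1;", "~~~", "text"].join("\n");
const found = scanFencedBlocks(sample);
console.log(found.length, found[0].language, found[0].code); // 1 js let x = 1;
```

Tracking a single open block keeps the scanner small, at the cost of ignoring nesting inside lists or block quotes, which matches the trade-off described above.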

Example use cases

  • documentation analysis.
  • snippet extraction for testing.
  • static site content pipelines.
  • language usage audits in Markdown files.
  • building code example galleries.
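
As one example of snippet testing, the sketch below checks that JavaScript snippets at least parse. It runs on hand-written data in the documented result shape, so it does not need the package; `findSyntaxErrors` is a hypothetical helper, not something the package provides.

```javascript
// Sample data in the documented result shape.
// The second snippet is broken on purpose.
const extracted = [
  { index: 0, language: "js", code: 'console.log("ok");' },
  { index: 1, language: "js", code: "const = 1;" },
  { index: 2, language: "python", code: 'print("ok")' },
];

// Hypothetical helper: parse (but never run) each JavaScript snippet.
// `new Function(code)` compiles the source as a function body and throws
// a SyntaxError on invalid input; the function is never called.
function findSyntaxErrors(blocks) {
  const failures = [];
  for (const block of blocks) {
    if (block.language !== "js" && block.language !== "javascript") continue;
    try {
      new Function(block.code);
    } catch (error) {
      if (!(error instanceof SyntaxError)) throw error;
      failures.push({ index: block.index, message: error.message });
    }
  }
  return failures;
}

console.log(findSyntaxErrors(extracted).map((failure) => failure.index)); // [ 1 ]
```

Because the code is compiled as a function body, snippets that use `import` or `export` statements would be reported as errors; a real check would likely need a proper parser for those.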

License

MIT
