html-to-markdown

High-performance HTML to Markdown conversion powered by Rust

Convert HTML to clean, readable Markdown at 150--280 MB/s. A single Rust core, exposed through native bindings for 12 language ecosystems, delivers identical output across every runtime.


Key Features

| Feature | Description |
| --- | --- |
| Blazing Fast | 150--280 MB/s throughput, 10--80x faster than pure Python alternatives |
| Polyglot | 12 native bindings -- Rust, Python, TypeScript, Ruby, PHP, Go, Java, C#, Elixir, R, C, WASM |
| Smart Conversion | Nested tables, code blocks, task lists, hOCR, and complex HTML structures |
| Metadata Extraction (v2.13.0) | Title, description, headers, links, images, Open Graph, JSON-LD, Microdata |
| Visitor Pattern (v2.23.0) | Custom callbacks for content filtering, URL rewriting, and domain-specific dialects |
| Secure by Default | Built-in HTML sanitization, powered by ammonia, strips malicious content |
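
To make the visitor idea concrete, here is a minimal sketch of callback-driven URL rewriting. It is built on Python's standard `html.parser` module and is not html-to-markdown's actual visitor API. The names `LinkVisitor`, `LinkRewritingConverter`, `to_markdown`, and `absolutize`, and the `https://example.com` host, are all hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser
from typing import Callable, List, Optional

# Hypothetical callback type (not the library's API): receives a link URL and
# returns the rewritten URL, or None to drop the link and keep only its text.
LinkVisitor = Callable[[str], Optional[str]]


class LinkRewritingConverter(HTMLParser):
    """Toy converter that understands only <a> tags and plain text."""

    def __init__(self, visit_link: LinkVisitor) -> None:
        super().__init__()
        self.visit_link = visit_link
        self.parts: List[str] = []
        self.pending_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Let the callback decide what happens to this link.
            self.pending_url = self.visit_link(href)
            if self.pending_url is not None:
                self.parts.append("[")

    def handle_endtag(self, tag):
        if tag == "a" and self.pending_url is not None:
            self.parts.append(f"]({self.pending_url})")
            self.pending_url = None

    def handle_data(self, data):
        self.parts.append(data)


def to_markdown(html: str, visit_link: LinkVisitor) -> str:
    converter = LinkRewritingConverter(visit_link)
    converter.feed(html)
    converter.close()  # flush any buffered trailing text
    return "".join(converter.parts)


def absolutize(url: str) -> Optional[str]:
    # Drop javascript: links; prefix site-relative paths with a placeholder host.
    if url.startswith("javascript:"):
        return None
    return "https://example.com" + url if url.startswith("/") else url
```

Calling `to_markdown('<a href="/docs">the docs</a>', absolutize)` produces a Markdown link with an absolute URL, while a `javascript:` link is reduced to its text. The real library's callbacks presumably cover far more node types; consult its visitor documentation for the actual interface.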

Quick Install

pip install html-to-markdown                        # Python
npm install @kreuzberg/html-to-markdown-node        # Node.js / TypeScript
cargo add html-to-markdown-rs                       # Rust
gem install html-to-markdown                        # Ruby
composer require kreuzberg-dev/html-to-markdown     # PHP
cargo install html-to-markdown-cli                  # CLI

Or via Homebrew:

brew install kreuzberg-dev/tap/html-to-markdown

Quick Example

Python:

from html_to_markdown import convert

html = "<h1>Hello</h1><p>This is <strong>fast</strong>!</p>"
markdown = convert(html)

TypeScript:

import { convert } from '@kreuzberg/html-to-markdown';

const markdown: string = convert('<h1>Hello World</h1>');
console.log(markdown); // # Hello World

Rust:

use html_to_markdown_rs::convert;

let html = "<h1>Hello</h1><p>This is <strong>fast</strong>!</p>";
let markdown = convert(html, None)?;
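
To check the throughput figures against your own input, a rough timing harness along these lines may help. It is a sketch, not the project's benchmark suite: `measure_throughput` and `passthrough` are hypothetical names, and it assumes 1 MB means 1,000,000 bytes of UTF-8 input, which the figures above do not specify.

```python
import time
from typing import Callable


def measure_throughput(convert_fn: Callable[[str], str], html: str, repeats: int = 20) -> float:
    """Return throughput in MB/s, assuming 1 MB = 1,000,000 bytes of UTF-8 input."""
    input_bytes = len(html.encode("utf-8"))
    convert_fn(html)  # warm-up call, excluded from timing
    start = time.perf_counter()
    for _ in range(repeats):
        convert_fn(html)
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return (input_bytes * repeats) / elapsed / 1_000_000


def passthrough(html: str) -> str:
    # Stand-in converter so the sketch runs without the library installed.
    return html


sample = "<h1>Hello</h1><p>This is <strong>fast</strong>!</p>" * 10_000
print(f"{measure_throughput(passthrough, sample):.1f} MB/s")
```

To time the library itself, pass the `convert` function imported in the Python example in place of `passthrough`. Results depend heavily on the hardware and on the shape of the input HTML, so they may not match the published range.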

Live Demo

Try html-to-markdown directly in your browser -- no installation required. The demo runs entirely client-side using the WebAssembly build.


Part of the Kreuzberg Ecosystem

html-to-markdown powers the HTML conversion pipeline in kreuzberg, a document intelligence library for extracting text and structured data from any document format. If you need to process PDFs, DOCX, images, or other document types, check out kreuzberg.


Explore the Docs