A Hyper MCP plugin that converts web pages to clean, readable Markdown using defuddle.md. It exposes both an MCP tool and MCP resource templates, so AI assistants can fetch and read any web page as Markdown.
- Tool: defuddle fetches any http:// or https:// URL and returns clean Markdown with YAML frontmatter
- Resource templates: https://{+url} and http://{+url} let clients read any web page as a Markdown resource
- Resource subscription notifications: fires notify_resource_updated on every fresh fetch so subscribed clients know that fresh content is available
- Caching: optional on-disk cache (identical to the context7-plugin cache) keyed by a hash of the URL
- Retry logic: automatic retries with back-off on 429 / 5xx responses
The plugin delegates HTML-to-Markdown conversion to the defuddle.md web service. For any URL you provide, the plugin:
- Validates that the scheme is http or https
- Checks the local cache (if enabled) for a previous result; if found, returns it immediately without firing a subscription notification
- Strips the scheme, appends the remainder to https://defuddle.md/ and sends a GET request to the resulting URL (e.g. https://example.com/page becomes https://defuddle.md/example.com/page)
- Caches the result for future requests
- Fires a notify_resource_updated notification with the URL as the resource URI, informing any subscribed clients that fresh content is available
- Returns the Markdown response (with YAML frontmatter containing title, source URL, word count, etc.)
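The validation and URL-rewriting steps above can be sketched with the standard library alone. This is a minimal illustration, not the plugin's actual code: the function names strip_scheme and build_api_url and the exact error strings are assumptions, and a real implementation would likely use a proper URL parser.

```rust
/// Base URL of the conversion service, as given in this README.
const DEFUDDLE_BASE: &str = "https://defuddle.md/";

/// Strip a lowercase `http://` or `https://` prefix and return the remainder.
/// Any other input yields an error (hypothetical messages, for illustration).
fn strip_scheme(url: &str) -> Result<&str, String> {
    if let Some(rest) = url.strip_prefix("https://") {
        return Ok(rest);
    }
    if let Some(rest) = url.strip_prefix("http://") {
        return Ok(rest);
    }
    match url.split_once("://") {
        // Something that looks like a scheme is present, but it is not allowed.
        Some((scheme, _)) if !scheme.is_empty() => {
            Err(format!("Unsupported scheme: {}", scheme))
        }
        // No recognizable scheme at all (empty string, bare hostname, ...).
        _ => Err(format!("Invalid URL: {}", url)),
    }
}

/// Build the defuddle.md request URL for a page URL.
fn build_api_url(url: &str) -> Result<String, String> {
    let rest = strip_scheme(url)?;
    if rest.is_empty() {
        return Err(format!("Invalid URL: {}", url));
    }
    Ok(format!("{}{}", DEFUDDLE_BASE, rest))
}
```

Calling build_api_url("https://example.com/page") in this sketch yields https://defuddle.md/example.com/page, and everything after the scheme (path, query string, fragment) is passed through unchanged.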
{
  "plugins": {
    "defuddle": {
      "url": "oci://ghcr.io/hyper-mcp-rs/defuddle-plugin:latest",
      "runtime_config": {
        "allowed_hosts": ["defuddle.md"]
      }
    }
  }
}
For nightly builds:
{
  "plugins": {
    "defuddle": {
      "url": "oci://ghcr.io/hyper-mcp-rs/defuddle-plugin:nightly",
      "runtime_config": {
        "allowed_hosts": ["defuddle.md"]
      }
    }
  }
}
To enable the on-disk cache, add /cache to allowed_paths, mapping it to a directory on the host:
{
  "plugins": {
    "defuddle": {
      "url": "oci://ghcr.io/hyper-mcp-rs/defuddle-plugin:latest",
      "runtime_config": {
        "allowed_hosts": ["defuddle.md"],
        "allowed_paths": ["/path/on/host/defuddle-cache:/cache"]
      }
    }
  }
}
If the /cache directory is not mounted, the plugin logs an info-level message and operates without caching:
Cache directory /cache is not mounted; caching is disabled
By default, cached responses expire after 1 day. Customize this with the CACHE_TTL environment variable (value in days):
{
  "plugins": {
    "defuddle": {
      "url": "oci://ghcr.io/hyper-mcp-rs/defuddle-plugin:latest",
      "runtime_config": {
        "allowed_hosts": ["defuddle.md"],
        "allowed_paths": ["/path/on/host/defuddle-cache:/cache"],
        "env_vars": {
          "CACHE_TTL": "7"
        }
      }
    }
  }
}
For example, to keep entries for 3 days:
{
  "plugins": {
    "defuddle": {
      "url": "oci://ghcr.io/hyper-mcp-rs/defuddle-plugin:latest",
      "runtime_config": {
        "allowed_hosts": ["defuddle.md"],
        "allowed_paths": ["/path/on/host/defuddle-cache:/cache"],
        "env_vars": {
          "CACHE_TTL": "3"
        }
      }
    }
  }
}
- Entries are stored as JSON files in /cache, named defuddle_{hex_hash}.json where the hash is derived from the URL string.
- Staleness is determined by comparing the file's last-modified time against the configured TTL.
- Only successful responses are cached; errors are never cached.
- The clear_cache tool can be used to manually invalidate all cached entries.
- Non-JSON files in the cache directory are left untouched by clear_cache.
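The naming and staleness rules above can be illustrated with a short sketch. It is not the plugin's cache module: the helper names are hypothetical, and DefaultHasher merely stands in for whatever hash the plugin really uses (its output is not guaranteed to be stable across Rust releases, so it would be a poor choice for a persistent cache).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Cache file path in the form `{tool_name}_{hex_hash}.json`.
/// The hash function here is an assumption made for illustration.
fn cache_path(cache_dir: &Path, tool_name: &str, url: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    url.hash(&mut hasher);
    cache_dir.join(format!("{}_{:x}.json", tool_name, hasher.finish()))
}

/// Parse a CACHE_TTL value (whole days), falling back to the 1-day default
/// when the variable is unset or not a number.
fn ttl_from_env(value: Option<&str>) -> Duration {
    let days = value.and_then(|v| v.trim().parse::<u64>().ok()).unwrap_or(1);
    Duration::from_secs(days.saturating_mul(24 * 60 * 60))
}

/// An entry is stale once its age reaches the TTL, so a zero TTL is always stale.
fn is_stale(modified: SystemTime, now: SystemTime, ttl: Duration) -> bool {
    match now.duration_since(modified) {
        Ok(age) => age >= ttl,
        // Modification time lies in the future: treat the entry as fresh.
        Err(_) => false,
    }
}
```

In practice the modified argument would come from the cache file's metadata (std::fs::metadata(path)?.modified()?) and the TTL from std::env::var("CACHE_TTL").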
Fetches a URL and returns the page content as Markdown.
Uses the defuddle.md service to extract the main content from the page, strip away clutter (sidebars, headers, footers, ads, etc.), and convert the result to clean Markdown with YAML frontmatter.
Input Schema:
{
"url": "string (required) — The URL to fetch. Must use the http:// or https:// scheme."
}
Example Input:
{
"url": "https://docs.rs/serde/latest/serde/"
}
Example Output:
---
title: "serde - Rust"
source: "https://docs.rs/serde/latest/serde/"
domain: "docs.rs"
word_count: 542
---
# Serde
Serde is a framework for **ser**ializing and **de**serializing Rust data structures
efficiently and generically.
...
Behavior:
- Returns a CallToolResult with a single text content block containing the Markdown
- If the URL scheme is not http or https, returns an error result
- If the URL has been fetched before and the cache entry is fresh, the cached result is returned and no subscription notification is fired
- On a fresh fetch from defuddle.md, fires notify_resource_updated with the URL as the resource URI (see Resource Subscription Notifications)
- Retries up to 3 times on 429 (rate limit) and 5xx (server error) responses, respecting the Retry-After header when present
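The retry rule in the last bullet can be sketched as a small policy function. The 1-second base delay and the doubling schedule are assumptions, since this README does not state the actual back-off values, and the names is_retryable and retry_delay are hypothetical. The sketch only understands Retry-After values given in whole seconds; an HTTP-date value falls back to the computed delay.

```rust
use std::time::Duration;

/// Maximum number of retries, per the behavior described above.
const MAX_RETRIES: u32 = 3;

/// 429 and 5xx responses are retryable; everything else is final.
fn is_retryable(status: u16) -> bool {
    status == 429 || (500..=599).contains(&status)
}

/// Delay before retry number `attempt` (0-based), or None once retries are exhausted.
/// A Retry-After value in whole seconds takes precedence; otherwise the delay
/// doubles from an assumed 1-second base (1s, 2s, 4s).
fn retry_delay(attempt: u32, retry_after: Option<&str>) -> Option<Duration> {
    if attempt >= MAX_RETRIES {
        return None;
    }
    let from_header = retry_after
        .and_then(|v| v.trim().parse::<u64>().ok())
        .map(Duration::from_secs);
    Some(from_header.unwrap_or_else(|| Duration::from_secs(1u64 << attempt)))
}
```

A fetch loop would check is_retryable on each response status and sleep for retry_delay(attempt, header) before trying again, giving up when it returns None.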
Clears the on-disk Markdown cache. Use this when cached results appear stale or outdated.
This tool takes no arguments.
Example Output (success):
Cache cleared successfully (12 entries removed)
Example Output (cache not mounted):
Cache is not enabled (directory not mounted)
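As a rough illustration of how clearing could work while leaving non-JSON files alone, the sketch below deletes only .json files in a directory and reports how many were removed. The function name clear_json_files is hypothetical and the plugin's real implementation may differ.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Remove every `.json` file directly inside `dir` and return how many were deleted.
/// Files with any other extension, and subdirectories, are left in place.
fn clear_json_files(dir: &Path) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_json = path.extension().map_or(false, |ext| ext == "json");
        if path.is_file() && is_json {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

The returned count corresponds to the number reported in a success message such as the one shown above.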
The plugin registers two RFC 6570 URI templates that allow MCP clients to read any web page as a Markdown resource.
Matches any HTTPS URL. The {+url} variable uses RFC 6570 reserved expansion, which allows reserved characters like /, ?, #, and & to pass through without percent-encoding.
| Property | Value |
|---|---|
| Name | defuddle-https |
| URI Template | https://{+url} |
| MIME Type | text/markdown |
| Description | Fetch any https URL and return its content as Markdown via defuddle.md |
Matches any HTTP URL. Identical behavior to the HTTPS template.
| Property | Value |
|---|---|
| Name | defuddle-http |
| URI Template | http://{+url} |
| MIME Type | text/markdown |
| Description | Fetch any http URL and return its content as Markdown via defuddle.md |
When an MCP client resolves a resource URI like https://en.wikipedia.org/wiki/Rust_(programming_language):
- The client matches it against the https://{+url} template
- The full URI is passed to the plugin's read_resource handler
- The plugin validates the scheme, checks the cache, and on a cache miss calls defuddle.md
- If the content was freshly fetched (not from cache), a notify_resource_updated notification is fired with the URI
- The result is returned as a TextResourceContents with mimeType: text/markdown
The read_resource implementation shares the same validation, fetching, and caching logic as the defuddle tool — the only difference is the return type (ReadResourceResult with TextResourceContents instead of CallToolResult). Both paths go through the same fetch_defuddle_markdown function, so subscription notifications are fired identically regardless of whether the content was requested via the tool or via a resource read.
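Because each template is a literal scheme prefix followed by a single reserved-expansion variable, matching a URI against it reduces to a prefix check. The sketch below shows that idea on the client side; the TEMPLATES table and match_template function are hypothetical and not part of the plugin or of any MCP SDK.

```rust
/// Resource templates registered by the plugin:
/// (template name, literal prefix that precedes `{+url}`).
const TEMPLATES: [(&str, &str); 2] = [
    ("defuddle-https", "https://"),
    ("defuddle-http", "http://"),
];

/// Match a resource URI against the templates, returning the template name and
/// the value captured by `{+url}`. Reserved characters such as `/`, `?`, `#`
/// and `&` are captured as-is, mirroring RFC 6570 reserved expansion.
fn match_template(uri: &str) -> Option<(&'static str, &str)> {
    TEMPLATES.iter().find_map(|(name, prefix)| {
        uri.strip_prefix(*prefix)
            .filter(|rest| !rest.is_empty())
            .map(|rest| (*name, rest))
    })
}
```

For the Wikipedia URI used above, this sketch returns the defuddle-https template with en.wikipedia.org/wiki/Rust_(programming_language) as the captured value.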
A client reading https://example.com as a resource receives:
{
  "contents": [
    {
      "uri": "https://example.com",
      "mimeType": "text/markdown",
      "text": "---\ntitle: \"Example Domain\"\nsource: \"https://example.com\"\nword_count: 16\n---\n\nThis domain is for use in documentation examples without needing permission. Avoid use in operations.\n\n[Learn more](https://iana.org/domains/example)\n"
    }
  ]
}
The plugin fires notify_resource_updated whenever it fetches fresh content from defuddle.md. This allows MCP clients that have subscribed to a resource URI to be informed that new content is available.
When notifications fire:
- Every time defuddle.md is called and returns a successful response, whether triggered by the defuddle tool or by read_resource
When notifications do NOT fire:
- Cache hits — if the result is served from the local cache, no notification is sent
- Errors — if the fetch fails (network error, non-2xx status), no notification is sent
- Validation failures — if the URL scheme is rejected, no notification is sent
Notification payload:
The notification carries a ResourceUpdatedNotificationParam with the original URL as the uri field:
{
"uri": "https://example.com/page"
}
Note: The plugin does not check whether the content has actually changed compared to a previous fetch. A notification is fired on every fresh (non-cached) successful response. Clients that need to detect actual content changes should compare the new content against their own prior copy.
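One way a client might do that comparison is to remember a hash of the last content it saw for each URI. The sketch below is a client-side illustration only; ChangeTracker and its record method are hypothetical names, and DefaultHasher is used purely for brevity.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Client-side tracker that remembers a hash of the last content seen per URI.
#[derive(Default)]
struct ChangeTracker {
    seen: HashMap<String, u64>,
}

impl ChangeTracker {
    /// Record `content` for `uri` and return true if it differs from the
    /// previously recorded copy, or if the URI has not been seen before.
    fn record(&mut self, uri: &str, content: &str) -> bool {
        let mut hasher = DefaultHasher::new();
        content.hash(&mut hasher);
        let digest = hasher.finish();
        self.seen.insert(uri.to_string(), digest) != Some(digest)
    }
}
```

After receiving a notification, the client would re-read the resource and call record with the returned Markdown, acting only when it returns true.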
The plugin uses the defuddle.md web service:
- Base URL: https://defuddle.md
- Usage: GET https://defuddle.md/{url_without_scheme}
- Response: text/markdown with YAML frontmatter
For example, fetching https://example.com/page results in a request to https://defuddle.md/example.com/page.
The defuddle.md service returns Markdown with YAML frontmatter containing metadata extracted from the page:
| Field | Type | Description |
|---|---|---|
| title | string | Page title |
| author | string | Author (when available) |
| published | string | Publication date (when available) |
| source | string | Original URL |
| domain | string | Domain name (when available) |
| description | string | Page description / summary (when available) |
| word_count | number | Word count of the extracted content |
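A consumer that wants these fields without pulling in a YAML parser could split the frontmatter off by hand. The sketch below assumes the frontmatter is a flat list of key: value lines delimited by --- lines, as in the examples in this README; the helper names are hypothetical, and a real YAML parser is the safer choice for anything beyond simple lookups.

```rust
/// Split a Markdown document into (frontmatter, body).
/// Returns None when the document does not start with a `---` delimited block.
fn split_frontmatter(doc: &str) -> Option<(&str, &str)> {
    let rest = doc.strip_prefix("---\n")?;
    let end = rest.find("\n---\n")?;
    // Skip the 5-byte closing delimiter ("\n---\n") to reach the body.
    Some((&rest[..end], &rest[end + 5..]))
}

/// Look up a top-level `key: value` line, trimming surrounding double quotes.
/// Splitting on the first colon keeps values such as URLs intact.
fn frontmatter_field<'a>(frontmatter: &'a str, key: &str) -> Option<&'a str> {
    frontmatter.lines().find_map(|line| {
        let (k, v) = line.split_once(':')?;
        (k.trim() == key).then(|| v.trim().trim_matches('"'))
    })
}
```

Values are returned as strings, so a numeric field such as word_count still needs to be parsed by the caller.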
Build the WASM plugin:
cargo build --release --target wasm32-wasip1
The compiled plugin will be available at target/wasm32-wasip1/release/plugin.wasm.
The plugin includes a comprehensive test suite with 132 tests across three test files. Because this is a WASM project (compiled for wasm32-wasip1), the tests must be run with an explicit native target:
# Run all tests
cargo test --target $(rustc -vV | grep host | cut -d' ' -f2)
# With output visible
cargo test --target $(rustc -vV | grep host | cut -d' ' -f2) -- --nocapture
Or run individual test suites:
# URL validation and scheme-stripping logic (64 tests, no network)
cargo test --test url_validation_tests --target $(rustc -vV | grep host | cut -d' ' -f2)
# Cache functionality (42 tests, no network)
cargo test --test cache_tests --target $(rustc -vV | grep host | cut -d' ' -f2)
# Live API integration tests (26 tests, requires network)
cargo test --test defuddle_api_tests --target $(rustc -vV | grep host | cut -d' ' -f2)
Or specify your target explicitly:
cargo test --target aarch64-apple-darwin # macOS ARM
cargo test --target x86_64-apple-darwin # macOS Intel
cargo test --target x86_64-unknown-linux-gnu # Linux
The URL validation tests verify:
- ✅ http:// and https:// URLs accepted (with paths, query strings, fragments, ports, userinfo, encoded characters, subdomains)
- ✅ Non-HTTP schemes rejected (ftp, file, ssh, data, javascript, mailto, ws, wss)
- ✅ Malformed input rejected (empty strings, bare hostnames, gibberish)
- ✅ Error messages contain useful context (rejected scheme name, "Invalid URL" for unparseable input)
- ✅ Scheme stripping preserves the rest of the URL exactly
- ✅ Case sensitivity (only lowercase http:// and https:// are stripped)
- ✅ API URL construction produces correct https://defuddle.md/{path} output
- ✅ Real-world URLs (GitHub, Wikipedia, YouTube, docs.rs, localhost, IP addresses, Unicode domains)
The cache tests verify:
- ✅ Hash determinism (same URL → same hash, different URLs → different hashes)
- ✅ http:// vs https:// produce different cache keys
- ✅ Cache key from DefuddleArguments matches plain String hash (as used in production)
- ✅ Cache path format: {tool_name}_{hex_hash}.json
- ✅ CallToolResult serialization round-trip (text, structured, error, Markdown with frontmatter)
- ✅ Cache put/get: basic hit, Markdown preservation, overwrite, multi-URL storage
- ✅ Cache misses: empty directory, different URL, different tool name
- ✅ TTL / staleness: fresh entries returned, stale entries rejected, zero-TTL always stale
- ✅ Cache clear: removes .json files only, leaves non-JSON files, supports put-after-clear
- ✅ Corrupted files: garbage data, empty files, wrong JSON shape, truncated JSON are all handled gracefully
The live API integration tests verify:
- ✅ defuddle.md returns text/markdown content type
- ✅ Response contains YAML frontmatter with title field
- ✅ Real-world pages work (Wikipedia, GitHub, rust-lang.org)
- ✅ Nonexistent domains handled gracefully
- ✅ API response wraps into CallToolResult and survives JSON round-trip (for caching)
- ✅ Full pipeline: validate → strip → build API URL → fetch
- ✅ Sequential requests return identical content (idempotency)
- ✅ Markdown output is clean (no raw HTML tags)
- ✅ RFC 6570 resource template patterns capture URL components correctly
See tests/README.md for detailed test documentation.
# Check formatting
cargo fmt -- --check
# Run clippy
cargo clippy -- -D warnings
The CI workflow runs on every push to main and on pull requests:
- clippy: lint checks with -D warnings
- fmt: formatting check
- build: cargo build --release --target wasm32-wasip1
defuddle-plugin/
├── src/
│ ├── lib.rs # Plugin entry points: call_tool, list_tools, list_resource_templates, read_resource
│ ├── cache.rs # On-disk cache (mirrors context7-plugin's cache module)
│ ├── types.rs # DefuddleArguments, ClearCacheArguments
│ └── pdk/ # Auto-generated PDK bindings (types, imports, exports)
├── tests/
│ ├── url_validation_tests.rs # URL validation + scheme stripping (64 tests)
│ ├── cache_tests.rs # Cache logic (42 tests)
│ ├── defuddle_api_tests.rs # Live API integration (26 tests)
│ └── README.md # Test documentation
├── Cargo.toml
├── Dockerfile
└── README.md
Apache License 2.0 — see LICENSE for details.