@cyanheads/wikipedia-mcp-server

Search Wikipedia articles, read summaries and full text, target sections, find nearby pages, and list language editions via MCP. STDIO or Streamable HTTP.

Tools

Six tools for working with Wikipedia across all language editions:

| Tool | Description |
|---|---|
| wikipedia_search | Full-text search across Wikipedia, returning ranked results with plain-text snippets and page IDs. |
| wikipedia_get_summary | Lead-section summary for any article: plain text, Wikidata QID, description, thumbnail URL, and page type. |
| wikipedia_get_article | Full article or a targeted section as clean plain text, with section markers preserved. |
| wikipedia_get_sections | Table of contents with section_index values for targeted section reads. |
| wikipedia_search_nearby | Geotagged Wikipedia articles within a radius of a WGS 84 coordinate, sorted by distance. |
| wikipedia_get_languages | All language editions available for an article, with titles and URLs. |

wikipedia_search

Search Wikipedia articles by full-text query.

  • Returns ranked results with plain-text snippets (HTML stripped), page IDs, and word counts
  • Use when the exact article title is unknown or to discover multiple articles on a topic
  • Supports all Wikipedia language editions via the language parameter
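
As a rough illustration of the request behind a full-text search, the sketch below builds a MediaWiki Action API search URL and strips markup from a raw snippet. The `list=search`, `srsearch`, and `srlimit` parameters are standard Action API parameters; the helper names, the `en` default, and the idea that the server builds its request exactly this way are assumptions.

```typescript
// Hypothetical helpers for illustration; not taken from the server's source.

/** Build an Action API full-text search URL for one language edition. */
function buildSearchUrl(query: string, language = "en", limit = 10): string {
  const url = new URL(`https://${language}.wikipedia.org/w/api.php`);
  url.searchParams.set("action", "query");
  url.searchParams.set("list", "search");
  url.searchParams.set("srsearch", query);
  url.searchParams.set("srlimit", String(limit));
  url.searchParams.set("format", "json");
  return url.toString();
}

/** Reduce a raw search snippet (which carries highlight markup) to plain text. */
function stripSnippetHtml(snippet: string): string {
  return snippet
    .replace(/<[^>]*>/g, "") // drop tags such as <span class="searchmatch">
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&") // decode &amp; last to avoid double-decoding
    .replace(/\s+/g, " ")
    .trim();
}
```

Passing the language as an argument mirrors the per-call language parameter described above.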

wikipedia_get_summary

Fetch the lead-section summary for a Wikipedia article.

  • Returns the 2–4 paragraph intro, Wikidata QID for cross-referencing, short description, and thumbnail URL
  • Surfaces page_type: "disambiguation" — a signal to follow up with wikipedia_search using a more specific query
  • Redirect pages are followed automatically
  • The right tool for about 90% of encyclopedic lookups
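
A caller might branch on page_type roughly as follows. This is a minimal sketch: only page_type and the Wikidata QID are named in this README, so the other field names, the `NextStep` type, and `decideNextStep` are hypothetical.

```typescript
// Hypothetical result shape; field names other than page_type and
// wikibase_item are assumptions made for this example.
interface SummaryResult {
  title: string;
  page_type: "article" | "disambiguation" | "redirect";
  extract: string;
  wikibase_item?: string;
}

type NextStep =
  | { action: "use_summary"; text: string; qid: string | undefined }
  | { action: "search_again"; tool: "wikipedia_search"; query: string };

/** Decide what to do with a summary: use it, or retry with a narrower search. */
function decideNextStep(summary: SummaryResult, context: string): NextStep {
  if (summary.page_type === "disambiguation") {
    // A disambiguation page is a signal to search with a more specific query.
    const query = `${summary.title} ${context}`.trim();
    return { action: "search_again", tool: "wikipedia_search", query };
  }
  return { action: "use_summary", text: summary.extract, qid: summary.wikibase_item };
}
```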

wikipedia_get_article

Fetch article content as clean plain text.

  • Without section_index: returns the full article (40–100 KB for major articles) with == Section == markers
  • With section_index (from wikipedia_get_sections): returns just that section (1–10 KB)
  • The section path (reads with section_index) strips wikitext to plain text via wtf_wikipedia
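
Because the full-article output keeps == Section == markers, a consumer can split it back into sections. The sketch below assumes each marker sits on its own line in wikitext style (two or more equals signs on each side); `TextSection` and `splitBySectionMarkers` are hypothetical names.

```typescript
interface TextSection {
  heading: string; // "" for the lead section before the first marker
  level: number;   // number of "=" characters in the marker; 1 for the lead
  body: string;
}

/** Split plain article text on "== Heading ==" style marker lines. */
function splitBySectionMarkers(text: string): TextSection[] {
  let current: TextSection = { heading: "", level: 1, body: "" };
  const sections: TextSection[] = [current];
  for (const line of text.split("\n")) {
    const match = /^(={2,6})\s*(.+?)\s*\1$/.exec(line.trim());
    if (match) {
      current = { heading: match[2] ?? "", level: (match[1] ?? "").length, body: "" };
      sections.push(current);
    } else {
      current.body += line + "\n";
    }
  }
  for (const section of sections) section.body = section.body.trim();
  return sections;
}
```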

wikipedia_get_sections

Fetch the table of contents for a Wikipedia article.

  • Returns section titles, heading levels, section numbering (e.g. "2.1"), and section_index values
  • section_index is the integer to pass to wikipedia_get_article for targeted reads
  • Call this before wikipedia_get_article when only a specific section is needed
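
The two-step flow (table of contents first, then a targeted read) might look like this on the caller's side. The entry shape mirrors the fields listed above, but the exact property names, the `title` argument, and `argsForSection` are assumptions.

```typescript
// Hypothetical table-of-contents entry; only section_index is a name
// confirmed by this README.
interface SectionEntry {
  title: string;
  level: number;
  number: string; // e.g. "2.1"
  section_index: number;
}

interface GetArticleArgs {
  title: string;
  language: string;
  section_index: number;
}

/** Find a section by heading and build arguments for a targeted read. */
function argsForSection(
  articleTitle: string,
  sections: SectionEntry[],
  wantedHeading: string,
  language = "en",
): GetArticleArgs | undefined {
  const wanted = wantedHeading.trim().toLowerCase();
  const match = sections.find((s) => s.title.trim().toLowerCase() === wanted);
  // Returning undefined lets the caller decide whether a full read is worth it.
  return match
    ? { title: articleTitle, language, section_index: match.section_index }
    : undefined;
}
```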

wikipedia_search_nearby

Find Wikipedia articles about places near a geographic coordinate.

  • Results sorted ascending by distance in meters
  • Only articles with geographic coordinates in their Wikidata record are returned
  • Radius capped at 10,000 meters; up to 50 results per call
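
The limits above can be sketched as a parameter builder for the Action API's geosearch module. `list=geosearch`, `gscoord`, `gsradius`, `gslimit`, and the `dist` result field are standard Action API names (the API itself accepts radii from 10 to 10,000 meters); the helper names and the clamping behaviour are assumptions about how a client could enforce the documented caps.

```typescript
const MAX_RADIUS_M = 10_000; // cap stated in this README
const MIN_RADIUS_M = 10;     // lower bound accepted by the geosearch module
const MAX_RESULTS = 50;      // per-call cap stated in this README

/** Build geosearch query parameters, clamping radius and result count. */
function buildNearbyParams(lat: number, lon: number, radiusM: number, limit: number): URLSearchParams {
  if (!(lat >= -90 && lat <= 90) || !(lon >= -180 && lon <= 180)) {
    throw new RangeError(`Invalid WGS 84 coordinate: ${lat}, ${lon}`);
  }
  const radius = Math.min(Math.max(Math.round(radiusM), MIN_RADIUS_M), MAX_RADIUS_M);
  const count = Math.min(Math.max(Math.round(limit), 1), MAX_RESULTS);
  return new URLSearchParams({
    action: "query",
    list: "geosearch",
    gscoord: `${lat}|${lon}`,
    gsradius: String(radius),
    gslimit: String(count),
    format: "json",
  });
}

/** Return a copy of the results sorted ascending by distance in meters. */
function sortByDistance<T extends { dist: number }>(pages: T[]): T[] {
  return [...pages].sort((a, b) => a.dist - b.dist);
}
```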

wikipedia_get_languages

List language editions available for a Wikipedia article.

  • Returns language codes, article titles in each language, and full URLs
  • Use for cross-language research or to discover a non-English title before switching editions
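
Switching editions could then be a lookup over the returned list. The `LanguageEdition` shape and `switchEdition` helper are hypothetical; the README only states that language codes, titles, and URLs are returned.

```typescript
// Hypothetical entry shape for one language edition of an article.
interface LanguageEdition {
  language: string;
  title: string;
  url: string;
}

/** Pick the title to use when repeating a call against another edition. */
function switchEdition(
  editions: LanguageEdition[],
  targetLanguage: string,
): { title: string; language: string } | undefined {
  const match = editions.find((e) => e.language === targetLanguage);
  return match ? { title: match.title, language: match.language } : undefined;
}
```

The returned pair can be passed as the title and language arguments of a follow-up call, for example to wikipedia_get_summary.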

Features

Built on @cyanheads/mcp-ts-core:

  • Declarative tool definitions — single file per tool, framework handles registration and validation
  • Unified error handling — handlers throw, framework catches, classifies, and formats
  • Pluggable auth: none, jwt, oauth
  • Swappable storage backends: in-memory, filesystem, Supabase, Cloudflare KV/R2/D1
  • Structured logging with optional OpenTelemetry tracing
  • STDIO and Streamable HTTP transports

Wikipedia-specific:

  • Dual API integration — MediaWiki REST API (/api/rest_v1/) for summaries, Action API (/w/api.php) for search, full text, sections, geo search, and language links
  • Retry and backoff on all requests; User-Agent header per Wikimedia API policy
  • Wikitext stripping pipeline via wtf_wikipedia — handles links, templates, refs, bold/italic; re-injects section headings for structure
  • Per-call language parameter on every tool — all Wikipedia language editions accessible in a single session
  • Language validation against ~250 known Wikipedia edition codes — catches structurally valid but nonexistent editions before they cause timeouts
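
To make the language validation and retry behaviour concrete, here is a minimal sketch. The edition set is a tiny illustrative subset, and the attempt count, delays, and function names are assumptions; the server's actual retry policy is not documented here.

```typescript
// Illustrative subset only; the README mentions roughly 250 known codes.
const KNOWN_EDITIONS = new Set(["en", "de", "fr", "es", "ja", "zh", "simple"]);

/** Resolve a language code to an edition origin, failing fast on unknown codes. */
function editionOrigin(language: string): string {
  const code = language.trim().toLowerCase();
  if (!KNOWN_EDITIONS.has(code)) {
    throw new Error(`Unknown Wikipedia edition "${language}"`);
  }
  return `https://${code}.wikipedia.org`;
}

/** Retry an async operation with exponential backoff (delays are assumed values). */
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 200): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Wait baseDelayMs, then 2x, then 4x, and so on between attempts.
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Validating the code before any request is sent means a mistyped edition fails immediately rather than after a network timeout.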

Agent-friendly output:

  • page_type field on summaries discriminates article / disambiguation / redirect — no string parsing needed
  • wikibase_item (Wikidata QID) on summaries enables direct cross-referencing with wikidata-mcp-server
  • section_index on table-of-contents entries links directly to the targeted-read parameter on wikipedia_get_article
  • Recovery hints on every error type — callers get actionable next steps (e.g., "use wikipedia_search to find the correct title")

Getting started

Add the following to your MCP client configuration file.

{
  "mcpServers": {
    "wikipedia-mcp-server": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/wikipedia-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with npx (no Bun required):

{
  "mcpServers": {
    "wikipedia-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/wikipedia-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with Docker:

{
  "mcpServers": {
    "wikipedia-mcp-server": {
      "type": "stdio",
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "-e", "MCP_TRANSPORT_TYPE=stdio",
        "ghcr.io/cyanheads/wikipedia-mcp-server:latest"
      ]
    }
  }
}

For Streamable HTTP, set the transport and start the server:

MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp
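
Once the HTTP server is running, a client opens a session by POSTing a JSON-RPC initialize message to the /mcp endpoint. The sketch below only builds that request; the protocol version string and client name are assumed values, and in practice an MCP client library would normally handle this step.

```typescript
interface HttpRequestSpec {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

/** Build (but do not send) an MCP initialize request for Streamable HTTP. */
function buildInitializeRequest(endpoint = "http://localhost:3010/mcp"): HttpRequestSpec {
  const message = {
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26", // assumed; the server may negotiate another version
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.1" }, // hypothetical client
    },
  };
  return {
    url: endpoint,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP clients accept both plain JSON and SSE responses.
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify(message),
    },
  };
}

// Usage (requires the server to be running):
//   const { url, init } = buildInitializeRequest();
//   const response = await fetch(url, init);
```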

Prerequisites

  • Bun v1.3.0 or higher (or Node.js v24+).
  • No API keys required — Wikipedia's API is public.

Installation

  1. Clone the repository:
git clone https://github.com/cyanheads/wikipedia-mcp-server.git
  2. Navigate into the directory:
cd wikipedia-mcp-server
  3. Install dependencies:
bun install
  4. Configure environment (optional):
cp .env.example .env
# edit .env if you want to customize WIKIPEDIA_USER_AGENT or logging

Configuration

| Variable | Description | Default |
|---|---|---|
| WIKIPEDIA_USER_AGENT | User-Agent header sent with every Wikimedia API request. Customize for your deployment. | wikipedia-mcp-server/0.1.9 (https://github.com/cyanheads/wikipedia-mcp-server) |
| WIKIPEDIA_BASE_URL | Base Wikipedia URL. Language selection is per-call, not a global language setting. | https://en.wikipedia.org |
| MCP_TRANSPORT_TYPE | Transport: stdio or http. | stdio |
| MCP_HTTP_PORT | Port for HTTP server. | 3010 |
| MCP_AUTH_MODE | Auth mode: none, jwt, or oauth. | none |
| MCP_LOG_LEVEL | Log level (RFC 5424). | info |
| LOGS_DIR | Directory for log files (Node.js only). | <project-root>/logs |
| OTEL_ENABLED | Enable OpenTelemetry instrumentation (spans, metrics, completion logs). | false |
See .env.example for the full list of optional overrides.
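
For readers who want to see how these variables and defaults fit together, here is a dependency-free sketch of environment parsing. The project itself validates configuration with Zod; this plain TypeScript version, including the `ServerConfig` shape and the `loadConfig` and `oneOf` helpers, is a hypothetical stand-in and not the project's code.

```typescript
interface ServerConfig {
  userAgent: string;
  baseUrl: string;
  transport: "stdio" | "http";
  httpPort: number;
  authMode: "none" | "jwt" | "oauth";
  logLevel: string;
  otelEnabled: boolean;
}

/** Return the value narrowed to one of the allowed literals, or throw. */
function oneOf<T extends string>(name: string, value: string, allowed: readonly T[]): T {
  const found = allowed.find((a) => a === value);
  if (found === undefined) {
    throw new Error(`${name} must be one of ${allowed.join(", ")}; got "${value}"`);
  }
  return found;
}

/** Parse an environment map, applying the defaults from the table above. */
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const httpPort = Number(env["MCP_HTTP_PORT"] ?? "3010");
  if (!Number.isInteger(httpPort) || httpPort < 1 || httpPort > 65535) {
    throw new Error(`MCP_HTTP_PORT must be a valid port; got "${env["MCP_HTTP_PORT"]}"`);
  }
  return {
    userAgent:
      env["WIKIPEDIA_USER_AGENT"] ??
      "wikipedia-mcp-server/0.1.9 (https://github.com/cyanheads/wikipedia-mcp-server)",
    baseUrl: env["WIKIPEDIA_BASE_URL"] ?? "https://en.wikipedia.org",
    transport: oneOf("MCP_TRANSPORT_TYPE", env["MCP_TRANSPORT_TYPE"] ?? "stdio", ["stdio", "http"] as const),
    httpPort,
    authMode: oneOf("MCP_AUTH_MODE", env["MCP_AUTH_MODE"] ?? "none", ["none", "jwt", "oauth"] as const),
    logLevel: env["MCP_LOG_LEVEL"] ?? "info",
    otelEnabled: (env["OTEL_ENABLED"] ?? "false").toLowerCase() === "true",
  };
}

// Usage in a Node.js or Bun process: const config = loadConfig(process.env);
```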

Running the server

Local development

  • Build and run:

    # One-time build
    bun run rebuild
    
    # Run the built server
    bun run start:stdio
    # or
    bun run start:http
  • Run checks and tests:

    bun run devcheck   # Lint, format, typecheck, security
    bun run test       # Vitest test suite
    bun run lint:mcp   # Validate MCP definitions against spec

Docker

docker build -t wikipedia-mcp-server .
docker run --rm -p 3010:3010 wikipedia-mcp-server

The Dockerfile defaults to HTTP transport, stateless session mode, and logs to /var/log/wikipedia-mcp-server. OpenTelemetry peer dependencies are installed by default — build with --build-arg OTEL_ENABLED=false to omit them.

Project structure

| Path | Purpose |
|---|---|
| src/index.ts | createApp() entry point; registers tools and initializes the Wikipedia service. |
| src/config | Server-specific environment variable parsing and validation with Zod. |
| src/mcp-server/tools | Tool definitions (*.tool.ts), one file per tool. |
| src/services/wikipedia | WikipediaService: REST API + Action API client with retry/backoff and language validation. |
| tests/ | Unit and integration tests mirroring src/. |

Development guide

See CLAUDE.md for development guidelines and architectural rules. The short version:

  • Handlers throw, framework catches — no try/catch in tool logic
  • Use ctx.log for request-scoped logging, ctx.state for tenant-scoped storage
  • Register new tools in src/mcp-server/tools/definitions/index.ts
  • Wrap external API calls: validate raw → normalize to domain type → return output schema; never fabricate missing fields
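
The "validate raw, normalize, never fabricate" rule and the "handlers throw" rule can be illustrated together. This sketch does not use the framework's API; the `Summary` type and `normalizeSummary` function are hypothetical, and the raw field names are assumed for the example.

```typescript
// Hypothetical domain type for a normalized summary.
interface Summary {
  title: string;
  extract: string;
  wikibase_item: string | undefined; // left undefined when the upstream omits it
}

/** Validate an untyped upstream payload and normalize it to the domain type. */
function normalizeSummary(raw: unknown): Summary {
  if (typeof raw !== "object" || raw === null) {
    // No try/catch here: the handler throws and the framework is expected
    // to catch, classify, and format the error.
    throw new Error("Unexpected summary payload: not an object");
  }
  const record = raw as Record<string, unknown>;
  const title = record["title"];
  const extract = record["extract"];
  const qid = record["wikibase_item"];
  if (typeof title !== "string" || typeof extract !== "string") {
    throw new Error("Unexpected summary payload: missing title or extract");
  }
  // Missing optional fields stay undefined rather than being filled with
  // placeholder values.
  return { title, extract, wikibase_item: typeof qid === "string" ? qid : undefined };
}
```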

Contributing

Issues and pull requests are welcome. Run checks and tests before submitting:

bun run devcheck
bun run test

License

Apache-2.0 — see LICENSE for details.
