Search and fetch Wikidata entities, execute SPARQL queries, and resolve external identifiers via MCP. STDIO or Streamable HTTP.
Public Hosted Server: https://wikidata.caseyjhand.com/mcp
7 tools for working with Wikidata's knowledge graph:
| Tool | Description |
|---|---|
| `wikidata_search_entities` | Search for items or properties by text query, returning QIDs/PIDs with labels, descriptions, and match metadata |
| `wikidata_get_entity` | Fetch a full entity by QID or PID with optional field and language filtering |
| `wikidata_get_labels` | Batch-resolve up to 50 QIDs or PIDs to human-readable labels and descriptions |
| `wikidata_get_statements` | Fetch property claims for an entity with qualifier detail and QID label resolution |
| `wikidata_get_sitelinks` | Fetch Wikipedia and Wikimedia project article URLs for a Wikidata item |
| `wikidata_sparql_query` | Execute a SPARQL SELECT query against the Wikidata Query Service |
| `wikidata_resolve_external_id` | Look up a Wikidata entity by an external identifier (DOI, PubMed ID, ORCID, OpenAlex ID, etc.) |

`wikidata_search_entities`: Search Wikidata for items or properties by text query.
- Searches labels, aliases, and descriptions
- `type="item"` for real-world concepts (people, places, works); `type="property"` for predicate P-IDs
- Language-aware results (BCP 47 language codes)
- Offset-based pagination, up to 50 results per call
- Returns match metadata indicating whether the hit was on a label or alias

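The offset-based pagination described above can be driven by a small loop. This is a minimal sketch, not the server's client API: `call_tool` is a hypothetical callable standing in for whatever MCP client you use, and the argument names `query`, `limit`, and `offset` and the `results` response key are assumptions. Only `type` and the 50-result cap come from the description above.

```python
from typing import Any, Callable, Iterator

PAGE_SIZE = 50  # documented per-call maximum


def iter_search_results(
    call_tool: Callable[[str, dict[str, Any]], dict[str, Any]],
    query: str,
    entity_type: str = "item",
    max_results: int = 200,
) -> Iterator[dict[str, Any]]:
    """Yield search hits page by page until a short page or max_results is reached."""
    offset = 0
    while offset < max_results:
        limit = min(PAGE_SIZE, max_results - offset)
        response = call_tool(
            "wikidata_search_entities",
            # Argument names other than "type" are assumed for illustration.
            {"query": query, "type": entity_type, "limit": limit, "offset": offset},
        )
        hits = response.get("results", [])  # assumed response key
        yield from hits
        if len(hits) < limit:  # short page: nothing further to fetch
            break
        offset += len(hits)
```

Stopping on a short page avoids one extra empty request at the end of the result set.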
`wikidata_get_entity`: Fetch a Wikidata entity by QID or PID with field selection.
- Q-IDs (e.g. `Q76`) fetch items and P-IDs (e.g. `P31`) fetch properties; endpoint routing is automatic
- `fields` parameter trims the response to `labels`, `descriptions`, `aliases`, `statements`, or `sitelinks`
- `languages` parameter filters multilingual maps to specific language codes
- Full entity payload always fetched from the API; field/language filtering is client-side

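Because the filtering happens after the full payload is fetched, it amounts to trimming a dictionary. The sketch below illustrates that idea under stated assumptions: the entity shape, the `trim_entity` helper, and the choice to always keep `id` and `type` are hypothetical and are not taken from the server's source.

```python
from typing import Any, Iterable, Optional

# Fields whose values are keyed by language code (multilingual maps).
MULTILINGUAL_FIELDS = {"labels", "descriptions", "aliases"}


def trim_entity(
    entity: dict[str, Any],
    fields: Optional[Iterable[str]] = None,
    languages: Optional[Iterable[str]] = None,
) -> dict[str, Any]:
    """Return a copy of `entity` limited to the requested fields and languages."""
    keep_fields = set(fields) if fields is not None else None
    keep_langs = set(languages) if languages is not None else None
    trimmed: dict[str, Any] = {}
    for key, value in entity.items():
        if key in ("id", "type"):  # assumed to be kept regardless of `fields`
            trimmed[key] = value
            continue
        if keep_fields is not None and key not in keep_fields:
            continue
        if keep_langs is not None and key in MULTILINGUAL_FIELDS:
            value = {lang: v for lang, v in value.items() if lang in keep_langs}
        trimmed[key] = value
    return trimmed


# Illustrative payload (hand-written, not fetched from the API).
entity = {
    "id": "Q76",
    "type": "item",
    "labels": {"en": "Barack Obama", "de": "Barack Obama"},
    "sitelinks": {"enwiki": {"title": "Barack Obama"}},
}
english_labels = trim_entity(entity, fields=["labels"], languages=["en"])
```

Sitelinks are keyed by site code rather than language code, so the sketch leaves them untouched by `languages`.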
`wikidata_get_labels`: Batch-resolve QIDs/PIDs to human-readable labels and descriptions.
- Up to 50 IDs per call, batched via the MediaWiki `wbgetentities` API
- Supports multiple language codes per request
- Reports `found` count and `notFound` IDs for partial-result handling
- Designed for the common agent pattern: run a SPARQL query, then humanize the QID results

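The "query, then humanize" pattern needs two small steps on the caller's side: pull entity IDs out of the SPARQL bindings, then split them into batches that respect the 50-ID limit. A minimal sketch follows; `ids_from_bindings` and `chunk_ids` are hypothetical helper names, and the only facts relied on are the standard `http://www.wikidata.org/entity/` URI prefix and the documented per-call cap.

```python
import re
from typing import Any, Iterator, Sequence

MAX_IDS_PER_CALL = 50  # documented per-call limit

# Wikidata entity URIs as they appear in SPARQL results.
ENTITY_URI = re.compile(r"^http://www\.wikidata\.org/entity/([QP]\d+)$")


def ids_from_bindings(bindings: Sequence[dict[str, dict[str, Any]]]) -> list[str]:
    """Collect QIDs/PIDs from every URI-typed cell in SPARQL JSON bindings."""
    ids: list[str] = []
    for row in bindings:
        for cell in row.values():
            if cell.get("type") != "uri":
                continue
            match = ENTITY_URI.match(cell["value"])
            if match:
                ids.append(match.group(1))
    return ids


def chunk_ids(ids: Sequence[str], size: int = MAX_IDS_PER_CALL) -> Iterator[list[str]]:
    """Deduplicate IDs (keeping first-seen order) and yield batches of at most `size`."""
    unique = list(dict.fromkeys(ids))
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each batch can then be passed to `wikidata_get_labels`; any IDs reported as `notFound` can be carried forward as raw IDs rather than dropped.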
`wikidata_get_statements`: Fetch property claims for a Wikidata entity with full qualifier and reference detail.
- `properties` parameter fetches only specific P-IDs; omit it to return all statements
- Value QIDs are resolved to human-readable labels by default via a batched label call
- Set `resolve_labels=false` for raw QIDs only (faster, smaller payload)
- Preferred-rank statements represent the most current values
- Designed for fact verification: "what does Wikidata say about this entity's {property}?"

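When verifying a fact, it usually helps to reduce a property's statements to the "best" ones. Wikidata ranks are `preferred`, `normal`, and `deprecated`; the sketch below keeps preferred-rank statements when any exist and otherwise falls back to normal-rank ones. The statement shape (a dict with a `rank` key) and the `best_statements` helper are assumptions for illustration, not the tool's documented output.

```python
from typing import Any


def best_statements(statements: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return preferred-rank statements if any exist, otherwise normal-rank ones.

    Deprecated statements are never returned.
    """
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    if preferred:
        return preferred
    # A missing rank is treated as "normal" in this sketch.
    return [s for s in statements if s.get("rank", "normal") == "normal"]


# Placeholder values, not real Wikidata data.
statements = [
    {"rank": "normal", "value": "older value"},
    {"rank": "preferred", "value": "current value"},
    {"rank": "deprecated", "value": "known-wrong value"},
]
current = best_statements(statements)
```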
`wikidata_get_sitelinks`: Fetch Wikipedia and Wikimedia project article URLs for a Wikidata item.
- Maps site codes (e.g., `enwiki`) to article titles and URLs
- `sites` parameter filters to specific site codes
- `wikis_only=true` returns only Wikipedia links (excludes Wiktionary, Wikiquote, Wikisource, etc.)
- Major items can have 300+ sitelinks across languages
- Only Q-IDs (items) have sitelinks; P-IDs are not supported

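The effect of `sites` and `wikis_only` can be pictured as a filter over a site-code map. The sketch below is a heuristic illustration only and may differ from how the server decides what counts as a Wikipedia link: it treats codes ending in `wiki` as Wikipedia editions, minus a hand-maintained set of other Wikimedia projects. The helper names and the sample map are hypothetical.

```python
from typing import Any, Iterable, Optional

# Site codes that end in "wiki" but are not Wikipedia editions.
NON_WIKIPEDIA_WIKIS = {
    "commonswiki", "specieswiki", "metawiki", "wikidatawiki",
    "mediawikiwiki", "sourceswiki", "outreachwiki", "incubatorwiki",
    "foundationwiki", "wikimaniawiki", "wikifunctionswiki",
}


def is_wikipedia_site(site: str) -> bool:
    """Heuristic: Wikipedia codes end in 'wiki' (e.g. enwiki, dewiki)."""
    return site.endswith("wiki") and site not in NON_WIKIPEDIA_WIKIS


def filter_sitelinks(
    sitelinks: dict[str, Any],
    sites: Optional[Iterable[str]] = None,
    wikis_only: bool = False,
) -> dict[str, Any]:
    """Keep only the requested site codes and, optionally, only Wikipedia editions."""
    wanted = set(sites) if sites is not None else None
    return {
        site: link
        for site, link in sitelinks.items()
        if (wanted is None or site in wanted)
        and (not wikis_only or is_wikipedia_site(site))
    }


# Illustrative sample map (hand-written, not fetched data).
sample = {
    "enwiki": "Douglas Adams",
    "dewiki": "Douglas Adams",
    "enwikiquote": "Douglas Adams",
    "commonswiki": "Douglas Adams",
}
wikipedia_only = filter_sitelinks(sample, wikis_only=True)
```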
`wikidata_sparql_query`: Execute a SPARQL SELECT query against the Wikidata Query Service (Blazegraph).
- Full graph power: multi-hop traversals, aggregations, subqueries, OPTIONAL, FILTER, UNION, BIND
- Standard Wikidata prefixes (`wd:`, `wdt:`, `p:`, `ps:`, `pq:`, `wikibase:`, `bd:`) are auto-injected
- `wikibase:label` SERVICE auto-injected when `language` is set and the query uses `?<var>Label` variables
- Results in SPARQL 1.1 JSON format: each binding is `{ type, value, "xml:lang"? }`
- Hard server timeout is 60s; client-side `timeout` parameter (1–55s) applies earlier
- Rate-limited at 60 requests/min and 5 concurrent requests per IP

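To make the points above concrete, here is a sketch of preparing arguments for a query and flattening the returned bindings into plain rows. The query omits `PREFIX` lines and the label SERVICE because both are described as auto-injected. The `query` argument name, the helper functions, and the hand-written sample bindings are assumptions; `language`, `timeout`, and the binding shape come from the list above.

```python
from typing import Any, Optional, Sequence

# What is Q42 an instance of? Prefixes and the label SERVICE are left to the server.
QUERY = """
SELECT ?class ?classLabel WHERE {
  wd:Q42 wdt:P31 ?class .
}
"""


def clamp_timeout(seconds: int) -> int:
    """Keep the client-side timeout inside the documented 1-55 second range."""
    return max(1, min(55, seconds))


def build_arguments(query: str, language: str = "en", timeout: int = 30) -> dict[str, Any]:
    """Assemble tool arguments; the "query" key is an assumed parameter name."""
    return {"query": query.strip(), "language": language, "timeout": clamp_timeout(timeout)}


def flatten_bindings(
    bindings: Sequence[dict[str, dict[str, Any]]],
    variables: Optional[Sequence[str]] = None,
) -> list[dict[str, Optional[str]]]:
    """Reduce each { type, value, "xml:lang"? } cell to its value.

    When `variables` is given, unbound variables (e.g. from OPTIONAL) become None.
    """
    rows = []
    for binding in bindings:
        names = variables if variables is not None else list(binding)
        rows.append({n: binding[n]["value"] if n in binding else None for n in names})
    return rows


# Hand-written sample in SPARQL 1.1 JSON binding shape.
sample_bindings = [
    {
        "class": {"type": "uri", "value": "http://www.wikidata.org/entity/Q5"},
        "classLabel": {"type": "literal", "value": "human", "xml:lang": "en"},
    }
]
rows = flatten_bindings(sample_bindings)
```

Flattening discards the `type` and `xml:lang` metadata, so keep the raw bindings when the distinction between URIs and literals matters.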
`wikidata_resolve_external_id`: Look up a Wikidata entity by an external identifier.
- Common use cases: CrossRef DOI → QID (P356), PubMed PMID → QID (P698), ORCID → author QID (P496), OpenAlex ID → entity QID (P10283), IMDb ID (P345)
- Automatic value normalization: DOIs uppercased, PMID prefixes stripped, ORCID hyphens normalized
- Returns `match=null` when not found
- Returns `multipleMatches` when a Wikidata data integrity issue causes more than one entity to claim the same external ID
- Designed for cross-server joins with pubmed-mcp-server, crossref-mcp-server, and openalex-mcp-server

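The normalization rules listed above might look roughly like the following. This is a sketch of the general idea rather than the server's actual rules: the function names are hypothetical, and stripping `doi:` / `https://doi.org/` and `https://orcid.org/` prefixes is an added assumption beyond what the list states. The property IDs are the ones given above.

```python
import re

# External-ID properties named in the list above.
PROPERTY_FOR = {"doi": "P356", "pmid": "P698", "orcid": "P496",
                "openalex": "P10283", "imdb": "P345"}


def normalize_doi(value: str) -> str:
    """Drop a leading doi: or doi.org URL prefix (assumed) and uppercase the DOI."""
    stripped = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:\s*)", "", value.strip(),
                      flags=re.IGNORECASE)
    return stripped.upper()


def normalize_pmid(value: str) -> str:
    """Strip a leading 'PMID' / 'PMID:' prefix, leaving the numeric identifier."""
    return re.sub(r"^pmid:?\s*", "", value.strip(), flags=re.IGNORECASE)


def normalize_orcid(value: str) -> str:
    """Reformat the 16 ORCID characters as four hyphen-separated groups."""
    stripped = re.sub(r"^https?://orcid\.org/", "", value.strip(), flags=re.IGNORECASE)
    chars = re.sub(r"[^0-9Xx]", "", stripped).upper()
    if len(chars) != 16:
        raise ValueError(f"not a 16-character ORCID iD: {value!r}")
    return "-".join(chars[i:i + 4] for i in range(0, 16, 4))
```

For example, `normalize_orcid("0000000218250097")` yields the hyphenated form `0000-0002-1825-0097`, and `normalize_doi("doi:10.1000/xyz123")` yields `10.1000/XYZ123`.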
| Type | Name | Description |
|---|---|---|
| Resource | `wikidata://entity/{id}` | Compact markdown summary of a Wikidata entity: labels, English description, instance-of, Wikipedia link, image, and statement count |

All resource data is also reachable via tools.
Built on @cyanheads/mcp-ts-core:
- Declarative tool definitions — single file per tool, framework handles registration and validation
- Unified error handling across all tools
- Pluggable auth (`none`, `jwt`, `oauth`)
- Swappable storage backends: `in-memory`, `filesystem`, `Supabase`, `Cloudflare KV/R2/D1`
- Structured logging with optional OpenTelemetry tracing
- Runs locally (stdio/HTTP) from the same codebase
Wikidata-specific:
- Wikidata REST API v1 for entity and statement fetches — no SPARQL overhead for lookup operations
- MediaWiki `wbgetentities` API for efficient batch label resolution
- Wikidata Query Service (Blazegraph) for SPARQL with auto-injected prefix headers and label SERVICE
- Configurable `User-Agent` per Wikimedia policy
- Separate timeout configuration for REST and SPARQL endpoints
Agent-friendly output:
- `wikidata_get_labels` is designed to follow SPARQL result sets: run the query, then humanize in one call
- `wikidata_resolve_external_id` handles DOI/PMID/ORCID normalization transparently, with `multipleMatches` for data integrity edge cases
- `wikidata_get_statements` resolves QID values to labels in the same call, with a `resolve_labels=false` escape hatch for raw payloads
- All tools echo input parameters in the response for traceability

Add the following to your MCP client configuration file.
```json
{
  "mcpServers": {
    "wikidata-mcp-server": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/wikidata-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}
```

Or with npx (no Bun required):
```json
{
  "mcpServers": {
    "wikidata-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/wikidata-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}
```

Or with Docker:
```json
{
  "mcpServers": {
    "wikidata-mcp-server": {
      "type": "stdio",
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "MCP_TRANSPORT_TYPE=stdio", "ghcr.io/cyanheads/wikidata-mcp-server:latest"]
    }
  }
}
```

For Streamable HTTP, set the transport and start the server:
```sh
MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp
```

To run from source you need Bun v1.3.0 or higher (or Node.js ≥ 24.0.0). Then:
- Clone the repository: `git clone https://github.com/cyanheads/wikidata-mcp-server.git`
- Navigate into the directory: `cd wikidata-mcp-server`
- Install dependencies: `bun install`

All configuration is validated at startup via Zod schemas. Key environment variables:

| Variable | Description | Default |
|---|---|---|
| `MCP_TRANSPORT_TYPE` | Transport: `stdio` or `http` | `stdio` |
| `MCP_HTTP_PORT` | HTTP server port | `3010` |
| `MCP_HTTP_ENDPOINT_PATH` | HTTP endpoint path where the MCP server is mounted | `/mcp` |
| `MCP_PUBLIC_URL` | Public origin override for TLS-terminating reverse-proxy deployments | none |
| `MCP_AUTH_MODE` | Authentication: `none`, `jwt`, or `oauth` | `none` |
| `MCP_LOG_LEVEL` | Log level (`debug`, `info`, `notice`, `warning`, `error`) | `info` |
| `MCP_GC_PRESSURE_INTERVAL_MS` | Opt-in Bun-only forced-GC interval (ms). Try `60000` if heap growth is observed under HTTP load. | `0` (disabled) |
| `LOGS_DIR` | Directory for log files (Node.js only) | `<project-root>/logs` |
| `STORAGE_PROVIDER_TYPE` | Storage backend: `in-memory`, `filesystem`, `supabase`, `cloudflare-kv/r2/d1` | `in-memory` |
| `WIKIDATA_USER_AGENT` | User-Agent string for Wikimedia requests (policy requires a descriptive value) | `wikidata-mcp-server/0.1 (https://github.com/cyanheads/wikidata-mcp-server)` |
| `WIKIDATA_SPARQL_TIMEOUT_MS` | Max wait for a SPARQL response in ms | `55000` |
| `WIKIDATA_REST_TIMEOUT_MS` | Max wait for REST API responses in ms | `10000` |
| `OTEL_ENABLED` | Enable OpenTelemetry | `false` |

- Build and run the production version:

  ```sh
  # One-time build
  bun run rebuild

  # Run the built server
  bun run start:http
  # or
  bun run start:stdio
  ```

- Run checks and tests:

  ```sh
  bun run devcheck  # Lints, formats, type-checks, and more
  bun run test      # Runs the test suite
  ```

| Directory | Purpose |
|---|---|
| `src/mcp-server/tools` | Tool definitions (`*.tool.ts`). Seven tools for entity lookup, statements, sitelinks, SPARQL, and external ID resolution. |
| `src/mcp-server/resources` | Resource definitions. Entity summary resource. |
| `src/services/wikidata` | Wikidata service layer: REST API client, SPARQL client, statement normalization, types. |
| `src/config` | Server-specific environment variable parsing and validation with Zod. |
| `tests/` | Unit and integration tests, mirroring the `src/` structure. |

See CLAUDE.md for development guidelines and architectural rules. The short version:
- Handlers throw, framework catches: no `try/catch` in tool logic
- Use `ctx.log` for logging, `ctx.state` for storage
- Register new tools and resources in the `createApp()` arrays

Issues and pull requests are welcome. Run checks and tests before submitting:
```sh
bun run devcheck
bun run test
```

Apache-2.0; see LICENSE for details.