socrata-mcp-server

v0.1.8 pre-1.0

Search and query government open-data portals (Socrata SODA API) via MCP. STDIO or Streamable HTTP.

socrata.caseyjhand.com/mcp
claude mcp add --transport http socrata-mcp-server https://socrata.caseyjhand.com/mcp
codex mcp add socrata-mcp-server --url https://socrata.caseyjhand.com/mcp
{
  "mcpServers": {
    "socrata-mcp-server": {
      "url": "https://socrata.caseyjhand.com/mcp"
    }
  }
}
gemini mcp add --transport http socrata-mcp-server https://socrata.caseyjhand.com/mcp
{
  "mcpServers": {
    "socrata-mcp-server": {
      "command": "bunx",
      "args": [
        "mcp-remote",
        "https://socrata.caseyjhand.com/mcp"
      ]
    }
  }
}
{
  "mcpServers": {
    "socrata-mcp-server": {
      "type": "http",
      "url": "https://socrata.caseyjhand.com/mcp"
    }
  }
}
curl -X POST https://socrata.caseyjhand.com/mcp \
  -H "Content-Type: application/json" \
  -H "MCP-Protocol-Version: 2025-11-25" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-11-25","capabilities":{},"clientInfo":{"name":"curl","version":"1.0.0"}}}'

Tools

6

socrata_find_datasets

open-world

Search for datasets across all Socrata-powered government open-data portals, or scope to one portal with the domain parameter. Returns dataset IDs, names, abbreviated column lists, domains, and update timestamps. Use socrata_get_dataset to fetch the full typed column schema before writing queries — columnNames here are preview-only and lack type information.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_find_datasets",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "query": {
      "description": "Full-text search across dataset names and descriptions. Omit to browse without filtering.",
      "type": "string"
    },
    "domain": {
      "description": "Scope search to a single portal (e.g. data.seattle.gov, data.cityofnewyork.us). Omit to search all portals.",
      "type": "string"
    },
    "categories": {
      "description": "Filter by domain categories (e.g. [\"Public Safety\", \"Transportation\"]).",
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "tags": {
      "description": "Filter by tags (e.g. [\"covid19\", \"permits\"]).",
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "only": {
      "description": "Filter by asset type. Omit to include all types. Usually \"datasets\" is what you want.",
      "type": "string",
      "enum": [
        "datasets",
        "maps",
        "files",
        "calendars",
        "stories"
      ]
    },
    "order": {
      "description": "Sort order. Defaults to relevance. Use updated_at to surface recently-refreshed datasets.",
      "type": "string",
      "enum": [
        "relevance",
        "page_views_total",
        "created_at",
        "updated_at"
      ]
    },
    "limit": {
      "default": 10,
      "description": "Number of results to return (1–100). Default 10.",
      "type": "integer",
      "minimum": 1,
      "maximum": 100
    },
    "offset": {
      "default": 0,
      "description": "Pagination offset. Default 0.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    }
  },
  "required": [
    "limit",
    "offset"
  ],
  "additionalProperties": false
}
view source ↗

socrata_get_dataset

Fetch full metadata and column schema for a Socrata dataset by ID. Returns field names, data types, descriptions, row count, and licensing. Always call this before writing a socrata_query_dataset — the column types determine correct WHERE clause syntax: Number columns accept bare literals (year=2023) while Text columns require single-quoted strings (year='2023').

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_get_dataset",
    "arguments": {
      "dataset_id": "<dataset_id>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "domain": {
      "description": "Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN env var or data.seattle.gov.",
      "type": "string"
    },
    "dataset_id": {
      "type": "string",
      "description": "Four-by-four dataset ID matching pattern like kzjm-xkqj. Obtain from socrata_find_datasets."
    }
  },
  "required": [
    "dataset_id"
  ],
  "additionalProperties": false
}
view source ↗

socrata_query_dataset

Execute a SoQL query against any dataset on any Socrata portal. Use the search parameter for quick full-text lookup, or combine select/where/group/having/order for full analytical control. Returns rows plus the assembled SoQL string so you can learn the pattern. All SODA 2.1 row values are strings even for numeric columns — check dataType from socrata_get_dataset to determine correct WHERE quoting: Number columns use bare literals (year=2023), Text columns use single-quoted strings (year='2023'). To enumerate distinct values, use select="col, count(*) as n" with group="col" and order="n DESC". When CANVAS_PROVIDER_TYPE=duckdb and rows fill the limit, results spill to a DataCanvas table for SQL-based analysis.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_query_dataset",
    "arguments": {
      "dataset_id": "<dataset_id>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "domain": {
      "description": "Portal domain (e.g. data.seattle.gov). Defaults to SOCRATA_DEFAULT_DOMAIN or data.seattle.gov.",
      "type": "string"
    },
    "dataset_id": {
      "type": "string",
      "description": "Four-by-four dataset ID (e.g. kzjm-xkqj). Obtain from socrata_find_datasets."
    },
    "search": {
      "description": "Full-text search across all text columns ($q). For field-specific filtering, use where instead.",
      "type": "string"
    },
    "select": {
      "description": "SoQL SELECT clause — column names, aliases, aggregates: \"state, sum(deaths) as total_deaths\". Omit for all columns.",
      "type": "string"
    },
    "where": {
      "description": "SoQL WHERE clause. Check column dataType from socrata_get_dataset first — Number columns: year=2023, Text columns: year='2023'. Operators: =, !=, >, <, LIKE, IN(...), BETWEEN, IS NULL, starts_with(), contains(), AND, OR, NOT.",
      "type": "string"
    },
    "group": {
      "description": "SoQL GROUP BY clause. Requires an aggregate function in select.",
      "type": "string"
    },
    "having": {
      "description": "SoQL HAVING clause. Filters on aggregated results, e.g. count > 100.",
      "type": "string"
    },
    "order": {
      "description": "SoQL ORDER BY clause, e.g. \"total_deaths DESC\" or \"date ASC\".",
      "type": "string"
    },
    "limit": {
      "default": 100,
      "description": "Max rows to return (1–5000). Default 100. Use with offset for pagination.",
      "type": "integer",
      "minimum": 1,
      "maximum": 5000
    },
    "offset": {
      "default": 0,
      "description": "Row offset for pagination. Default 0.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    },
    "canvas_id": {
      "description": "Optional 10-char DataCanvas token from a prior call. Omit on first call when CANVAS_PROVIDER_TYPE=duckdb to mint a fresh canvas. Large result sets spill here automatically.",
      "type": "string"
    }
  },
  "required": [
    "dataset_id",
    "limit",
    "offset"
  ],
  "additionalProperties": false
}
view source ↗

socrata_list_portals

List known Socrata-powered government open-data portals with their domain, organization name, and dataset count. Backed by the Discovery API domains catalog. Filtering is client-side substring match on the query parameter. Use this first when you do not know which portal to target, then pass the domain to socrata_find_datasets.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_list_portals",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "query": {
      "description": "Keyword to filter portal names or organization names (case-insensitive substring match). Omit to list all portals.",
      "type": "string"
    },
    "limit": {
      "default": 50,
      "description": "Max portals to return (1–200). Default 50.",
      "type": "integer",
      "minimum": 1,
      "maximum": 200
    },
    "offset": {
      "default": 0,
      "description": "Pagination offset. Default 0.",
      "type": "integer",
      "minimum": 0,
      "maximum": 9007199254740991
    }
  },
  "required": [
    "limit",
    "offset"
  ],
  "additionalProperties": false
}
view source ↗

socrata_dataframe_describe

List registered tables in a DataCanvas session — schema, row count, column names, and registration time. Shows what datasets are available for SQL queries via socrata_dataframe_query. Only meaningful when CANVAS_PROVIDER_TYPE=duckdb is set. Use after socrata_query_dataset spills a large result set to canvas.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_dataframe_describe",
    "arguments": {}
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "canvas_id": {
      "description": "Canvas ID returned from socrata_query_dataset. Omit to list all tables visible in the current session.",
      "type": "string"
    }
  },
  "additionalProperties": false
}
view source ↗

socrata_dataframe_query

Run SELECT-only SQL against a DataCanvas table populated by socrata_query_dataset. DuckDB infers types from spilled data, so numeric columns that SODA returned as strings become queryable with numeric comparisons (year > 2020, amount < 500). Only works when CANVAS_PROVIDER_TYPE=duckdb is set. Use socrata_dataframe_describe to see registered tables and their schemas.

read
invocation
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "socrata_dataframe_query",
    "arguments": {
      "canvas_id": "<canvas_id>",
      "sql": "<sql>"
    }
  }
}
schema
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "type": "object",
  "properties": {
    "canvas_id": {
      "type": "string",
      "description": "Canvas ID returned from socrata_query_dataset or socrata_dataframe_describe."
    },
    "sql": {
      "type": "string",
      "description": "SELECT-only SQL to run against registered canvas tables. DDL, DML, and file-reading functions are rejected. Use table names from socrata_dataframe_describe."
    },
    "limit": {
      "default": 1000,
      "description": "Max rows to return (1–10000). Default 1000.",
      "type": "integer",
      "minimum": 1,
      "maximum": 10000
    }
  },
  "required": [
    "canvas_id",
    "sql",
    "limit"
  ],
  "additionalProperties": false
}
view source ↗

Resources

2

Fetch full metadata and column schema for a Socrata dataset addressable by stable URI. Same payload as socrata_get_dataset. URI format: socrata://datasets/{domain}/{datasetId} (e.g. socrata://datasets/data.seattle.gov/kzjm-xkqj).

uri socrata://datasets/{domain}/{datasetId} mime application/json

List of known Socrata-powered government open-data portals with organization name and dataset count. Backed by the Discovery API domains catalog. Paginated — default 50 per page. Use the domain values with socrata_find_datasets to search a specific portal.

uri socrata://portals mime application/json

Prompts

1

Structured workflow for investigating a civic data question. Guides: discover relevant datasets on the right portal, inspect schemas, query for baseline data, group/aggregate for trends, and synthesize findings with data-freshness caveats.

  • topicrequired — Civic data topic or question to investigate (e.g. "traffic collisions in 2023", "food inspection failures", "311 service requests by neighborhood").
  • portal — Target portal domain if known (e.g. data.seattle.gov). Omit to use socrata_list_portals to find the right portal first.
  • geography — Geographic scope to focus on (e.g. "Seattle", "King County", "New York City"). Used to scope WHERE clauses.