Merge API Documentation

Authentication: OAuth or personal access token. See Magic Link for the runtime auth flow, or Application credentials to bring your own OAuth app.

Sample use cases

List the 10 most recent failed jobs in the prod workspace and surface the error.
Run the nightly_etl job and notify #data-eng when it finishes.
Show me clusters with idle time >2 hours so we can shut them down.

Available Tools

list_clusters

List all clusters in the Databricks workspace. Returns cluster IDs, names, states, and configurations. Use this to find cluster IDs for other operations.

get_cluster

Get detailed information about a specific Databricks cluster including state, configuration, and resource allocation. Use list_clusters first to find the cluster ID.

create_cluster

Create a new Databricks cluster. Requires cluster name, Spark version, and node type. Specify num_workers for fixed size or autoscale_min/max_workers for autoscaling.
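
The two sizing modes described above are alternatives, so a caller may want to check that only one is set before invoking create_cluster. The following is a minimal sketch of such a check, not the tool's actual schema: the argument names cluster_name, spark_version, and node_type_id are assumptions, and autoscale_min_workers and autoscale_max_workers are inferred from the "autoscale_min/max_workers" shorthand.

```python
from typing import Any, Dict, Optional


def build_create_cluster_args(
    cluster_name: str,
    spark_version: str,
    node_type_id: str,
    num_workers: Optional[int] = None,
    autoscale_min_workers: Optional[int] = None,
    autoscale_max_workers: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a create_cluster argument dict with exactly one sizing mode.

    All argument names are assumptions made for illustration.
    """
    fixed = num_workers is not None
    autoscale = (
        autoscale_min_workers is not None or autoscale_max_workers is not None
    )
    # Exactly one of the two modes must be chosen.
    if fixed == autoscale:
        raise ValueError("set either num_workers or the autoscale bounds")

    args: Dict[str, Any] = {
        "cluster_name": cluster_name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
    }
    if fixed:
        args["num_workers"] = num_workers
        return args

    # Autoscaling needs both bounds, in a sensible order.
    if autoscale_min_workers is None or autoscale_max_workers is None:
        raise ValueError("autoscaling requires both min and max workers")
    if autoscale_min_workers > autoscale_max_workers:
        raise ValueError("autoscale_min_workers exceeds autoscale_max_workers")
    args["autoscale_min_workers"] = autoscale_min_workers
    args["autoscale_max_workers"] = autoscale_max_workers
    return args
```

Rejecting the ambiguous combinations on the client side gives a clearer error than whatever the service would return for a request that mixes both modes.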

start_cluster

Start a terminated Databricks cluster. The cluster must be in TERMINATED state. Use list_clusters to find clusters and their states.

terminate_cluster

Terminate a running Databricks cluster. This stops the cluster but preserves its configuration for restarting. Use list_clusters to find cluster IDs.

list_jobs

List jobs in the Databricks workspace with optional name filter and pagination. Returns job IDs, names, and settings. To fetch the next page, pass the page_token from the response.
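
Token-based pagination of this kind is usually consumed with a loop that keeps requesting pages until no token comes back. The sketch below illustrates that pattern under several assumptions: call_tool is a hypothetical callable that invokes a tool by name with an argument dict, the filter argument is called name, and the response exposes its items under jobs and its continuation token under next_page_token.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical signature: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_jobs(
    call_tool: ToolCaller, name: Optional[str] = None
) -> Iterator[Dict[str, Any]]:
    """Yield every job by following page tokens until none is returned."""
    page_token: Optional[str] = None
    while True:
        args: Dict[str, Any] = {}
        if name is not None:
            args["name"] = name  # assumed name of the filter argument
        if page_token:
            args["page_token"] = page_token
        response = call_tool("list_jobs", args)
        # "jobs" and "next_page_token" are assumed response keys.
        yield from response.get("jobs", [])
        page_token = response.get("next_page_token")
        if not page_token:
            break
```

Because iter_jobs is a generator, a caller that only needs the first few matches can stop iterating early and avoid fetching the remaining pages.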

get_job

Get detailed information about a specific Databricks job including tasks, schedule, and configuration. Use list_jobs first to find the job ID.

create_job

Create a new Databricks job with one or more tasks. Each task needs a task_key and type (notebook_task, spark_python_task, sql_task, etc). Supports scheduling with cron expressions.
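
To make the task and schedule structure more concrete, here is a rough sketch of an argument dict for a single-task notebook job. Only task_key and notebook_task come from the description above. The remaining field names (name, tasks, notebook_path, schedule, quartz_cron_expression, timezone_id) mirror the Databricks Jobs REST API and are an assumption about what this tool accepts, as is the use of Quartz cron syntax.

```python
from typing import Any, Dict


def build_scheduled_notebook_job(
    job_name: str,
    notebook_path: str,
    cron_expression: str,
    timezone_id: str = "UTC",
) -> Dict[str, Any]:
    """Build create_job arguments for one notebook task on a cron schedule.

    Field names other than task_key and notebook_task are assumptions
    modeled on the Databricks Jobs REST API.
    """
    return {
        "name": job_name,
        "tasks": [
            {
                # Each task needs a task_key and a task type.
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
            }
        ],
        "schedule": {
            "quartz_cron_expression": cron_expression,
            "timezone_id": timezone_id,
        },
    }


# Example: run a hypothetical notebook every day at 02:00 UTC.
# Quartz fields are: second minute hour day-of-month month day-of-week.
nightly_args = build_scheduled_notebook_job(
    "nightly_etl", "/Shared/etl/nightly", "0 0 2 * * ?"
)
```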

delete_job

Permanently delete a Databricks job. This also cancels any active runs. Use list_jobs to find the job ID.

run_job_now

Trigger an immediate run of a Databricks job. Optionally pass notebook_params or python_named_params to override defaults. Use list_jobs to find the job ID.

list_job_runs

List job runs in the Databricks workspace. Filter by job_id, active_only, or completed_only. Supports offset/limit pagination. Returns run IDs, states, and timing info.
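
Offset/limit pagination differs from the token style in that the caller tracks its own position. A minimal sketch of collecting completed runs this way follows. The call_tool callable and the runs response key are assumptions. The argument names job_id, completed_only, offset, and limit are taken from the description above, though their exact spelling in the tool schema is not confirmed here.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def collect_completed_runs(
    call_tool: ToolCaller, job_id: int, page_size: int = 25, max_runs: int = 100
) -> List[Dict[str, Any]]:
    """Collect up to max_runs completed runs using offset/limit paging."""
    runs: List[Dict[str, Any]] = []
    offset = 0
    while len(runs) < max_runs:
        # Never ask for more than is still needed.
        limit = min(page_size, max_runs - len(runs))
        response = call_tool(
            "list_job_runs",
            {
                "job_id": job_id,
                "completed_only": True,
                "offset": offset,
                "limit": limit,
            },
        )
        page = response.get("runs", [])  # "runs" is an assumed response key
        runs.extend(page)
        # A short page means there is nothing left to fetch.
        if len(page) < limit:
            break
        offset += len(page)
    return runs
```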

get_job_run

Get detailed information about a specific job run including state, timing, and task details. Use list_job_runs to find the run ID.

cancel_job_run

Cancel an active job run. The run must be in PENDING or RUNNING state. Use list_job_runs with active_only=true to find cancellable runs.

get_job_run_output

Get the output of a completed job run including notebook results, SQL output, logs, and error traces. Use list_job_runs to find the run ID.

execute_sql_statement

Execute a SQL statement on a Databricks SQL warehouse. Returns results synchronously within wait_timeout (default 10s) or a statement_id for async polling via get_sql_statement.

get_sql_statement

Get the status and results of a SQL statement execution. Use this to poll for results of async statements started with execute_sql_statement.
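
The synchronous-then-asynchronous behaviour of execute_sql_statement and get_sql_statement suggests a submit-and-poll loop. The sketch below shows one way that could look. It assumes a hypothetical call_tool callable, argument names warehouse_id and statement, and a response field called state. PENDING and RUNNING are the in-progress states named in this document, and any other value is treated as final.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: call_tool(tool_name, arguments) -> response dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

IN_PROGRESS_STATES = ("PENDING", "RUNNING")


def run_sql(
    call_tool: ToolCaller,
    warehouse_id: str,
    statement: str,
    poll_interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Submit a statement and poll until it leaves PENDING/RUNNING."""
    response = call_tool(
        "execute_sql_statement",
        {"warehouse_id": warehouse_id, "statement": statement},
    )
    statement_id = response.get("statement_id")
    polls = 0
    # "state" is an assumed response field name.
    while response.get("state") in IN_PROGRESS_STATES:
        if polls >= max_polls:
            raise TimeoutError(f"statement {statement_id} did not finish")
        sleep(poll_interval)
        response = call_tool("get_sql_statement", {"statement_id": statement_id})
        polls += 1
    return response
```

If the first response is already final (the statement finished within wait_timeout), the loop body never runs and no polling call is made. The sleep parameter is injectable only so the loop can be exercised without real delays.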

cancel_sql_statement

Cancel a running SQL statement execution. Use get_sql_statement first to verify the statement is still in PENDING or RUNNING state.

list_sql_warehouses

List all SQL warehouses in the Databricks workspace. Returns warehouse IDs, names, sizes, and states. Use this to find warehouse IDs for SQL execution.

get_sql_warehouse

Get detailed information about a specific SQL warehouse including state, size, cluster count, and active sessions. Use list_sql_warehouses to find the warehouse ID.

create_sql_warehouse

Create a new Databricks SQL warehouse. Requires a name and cluster_size (T-shirt sizing from 2X-Small to 4X-Large). Optionally configure autoscaling and auto-stop.

start_sql_warehouse

Start a stopped SQL warehouse. The warehouse must be in STOPPED state. Use list_sql_warehouses to find warehouses and their states.

stop_sql_warehouse

Stop a running SQL warehouse. This deallocates compute resources. Use list_sql_warehouses to find warehouses and their states.

list_workspace

List objects in a Databricks workspace directory. Returns notebooks, directories, files, repos, and libraries at the given path. Use '/' for the root directory.

get_workspace_object_status

Get metadata about a workspace object including type, language (for notebooks), and timestamps. Use list_workspace to find valid paths.

delete_workspace_object

Delete a workspace object (notebook, file, or directory). For non-empty directories, set recursive=true. Use list_workspace to find valid paths.

ask_genie

Ask a natural language question about your data using Databricks Genie. Provide either space_id or space_name to identify the Genie room.

create_vector_search_endpoint

Create a vector search endpoint to host vector search indexes. An endpoint must exist before creating indexes on it.

create_vector_search_index

Create a vector search index on an endpoint. Use DELTA_SYNC to auto-sync from a Delta table, or DIRECT_ACCESS for manual vector upserts.

delete_vector_search_endpoint

Delete a vector search endpoint. All indexes on the endpoint must be deleted first.

delete_vector_search_index

Delete a vector search index. This permanently removes the index and its data. Use query_vector_index to verify the index before deleting.

execute_sql_statement_readonly

Execute a read-only SQL statement on a Databricks SQL warehouse (SELECT, WITH, SHOW, DESCRIBE, EXPLAIN, VALUES, LIST).
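
A caller choosing between this tool and execute_sql_statement may want a quick local check of whether a statement starts with one of the listed keywords. The following is only a rough client-side pre-check under that assumption. It does not reproduce whatever validation the tool itself performs, it ignores leading comments and parentheses, and it errs on the side of answering False.

```python
import re

# Leading keywords listed in the tool description.
READ_ONLY_KEYWORDS = frozenset(
    {"SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN", "VALUES", "LIST"}
)

_LEADING_WORD = re.compile(r"\s*([A-Za-z]+)\b")


def looks_read_only(statement: str) -> bool:
    """Return True if the statement's first word is a listed keyword."""
    match = _LEADING_WORD.match(statement)
    if match is None:
        return False
    return match.group(1).upper() in READ_ONLY_KEYWORDS
```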

export_notebook

Export a notebook’s content from the workspace. Returns base64-encoded content in the specified format (SOURCE, HTML, JUPYTER, DBC).
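
Since the content comes back base64-encoded, it has to be decoded before use. A minimal sketch of that step is shown here, assuming the encoded payload sits under a response key named content (an assumption, not a confirmed field name). The helper returns raw bytes and leaves text decoding to the caller, because not every export format is necessarily text.

```python
import base64
from typing import Any, Dict


def decode_exported_notebook(response: Dict[str, Any]) -> bytes:
    """Decode the base64 payload of an export_notebook response.

    The "content" key is an assumed response field name.
    """
    return base64.b64decode(response["content"])


# Example with a locally constructed response in SOURCE format.
sample_response = {"content": base64.b64encode(b"print('hello')\n").decode("ascii")}
source_text = decode_exported_notebook(sample_response).decode("utf-8")
```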

get_cluster_events

Get events for a cluster to diagnose issues. Returns creation, termination, resizing, errors, and driver events. Filter by time range or event type.

get_current_user

Get the current authenticated Databricks user and their workspace home directory. Returns user ID, username, display name, and home path (e.g.

get_genie_message

Get the status and results of a Genie message. Use this to poll for results when ask_genie returns status EXECUTING_QUERY.

get_table_info

Get detailed information about a table including column definitions, types, and properties. Provide the full three-level name (catalog.schema.table).
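
Because the tool expects the full three-level name, it can help to validate the name before making the call. The sketch below splits a name into its catalog, schema, and table parts. It is a simplification that assumes plain identifiers, so quoted identifiers that themselves contain dots are not handled.

```python
from typing import Tuple


def split_full_table_name(full_name: str) -> Tuple[str, str, str]:
    """Split 'catalog.schema.table' into its three parts.

    Assumes plain identifiers without embedded dots.
    """
    parts = full_name.split(".")
    if len(parts) != 3 or any(not part.strip() for part in parts):
        raise ValueError(
            f"expected catalog.schema.table, got {full_name!r}"
        )
    catalog, schema, table = parts
    return catalog, schema, table
```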

import_notebook

Create or overwrite a notebook in the workspace. Content must be base64-encoded. For SOURCE format, specify the language (PYTHON, SCALA, SQL, R).
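
Preparing the base64 content is the step most likely to go wrong when calling import_notebook, so a small helper may be useful. This is a sketch under assumptions: the argument names path, content, format, and language are guesses at the tool schema, while the SOURCE format and the four language values come from the description above.

```python
import base64
from typing import Any, Dict

SOURCE_LANGUAGES = frozenset({"PYTHON", "SCALA", "SQL", "R"})


def build_import_notebook_args(
    path: str, source: str, language: str = "PYTHON"
) -> Dict[str, Any]:
    """Build import_notebook arguments for a SOURCE-format notebook.

    Argument names are assumptions made for illustration.
    """
    language = language.upper()
    if language not in SOURCE_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    # Encode text to bytes first, then base64, then back to an ASCII string.
    encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")
    return {
        "path": path,
        "content": encoded,
        "format": "SOURCE",
        "language": language,
    }
```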

list_catalogs

List all catalogs in Unity Catalog. Returns catalog names, owners, types, and descriptions.

list_genie_spaces

List available Genie spaces (rooms). Returns space IDs, names, and descriptions. Use this to find a space before asking questions with ask_genie.

list_schemas

List schemas within a Unity Catalog catalog. Returns schema names, owners, and descriptions. Use list_catalogs first to find valid catalog names.

list_serving_endpoints

List all model serving endpoints in the workspace. Returns endpoint names, states, served models, and configuration.

list_tables

List tables within a Unity Catalog schema. Returns table names, types, formats, and owners.

query_serving_endpoint

Query a model serving endpoint for predictions or chat completions. Automatically detects Foundation Model API endpoints (chat, completions, embedding…

query_vector_index

Query a vector search index using text or a vector. Returns the most similar documents with scores. Supports filtering and column selection.

rename_workspace_file

Rename or move a workspace file or notebook to a new path. Returns destination_url for the new location.

repair_run

Re-run failed tasks in a completed job run without re-running succeeded ones. Set rerun_all_failed_tasks=true to retry all failures, or specify indivi…

run_notebook

Submit a one-time notebook run. Requires either warehouse_id (SQL warehouse) or existing_cluster_id (cluster) for compute.