databricks-bdd-tools

A Claude Code plugin for BDD testing with Python Behave and Databricks. Generates Gherkin feature files, step definitions, and test suites for pipelines, Unity Catalog, Apps, and jobs.

Why BDD for data pipelines?

Data teams rarely build test suites. The cost of setting up a test harness, learning a framework, and maintaining tests alongside evolving schemas has been hard to justify against shipping the next feature.

BDD with Gherkin changes the equation:

Readable by everyone — compliance officers, regulators, and engineers can all read Given/When/Then scenarios
Agent-friendly — AI agents produce better test coverage from structured Gherkin than from freeform requirements
Test-production parity — BDD tests call real Unity Catalog functions via the Statement Execution API, the same functions your pipeline uses

For more context, see "The test suite nobody had to write: agentic BDD for data pipelines".

Skills

Skill	Description
`bdd-scaffold`	Set up a complete Behave project with Databricks SDK wiring, ephemeral test schemas, and Makefile targets
`bdd-features`	Generate Gherkin feature files from requirements, code, or user stories
`bdd-steps`	Implement Python step definitions that call Databricks SDK / Statement Execution API
`bdd-run`	Execute Behave with tag filtering, parallel execution, JUnit/JSON reporting, and failure diagnosis

Installation

Add to your Claude Code settings (.claude/settings.json):

{
  "plugins": [
    {
      "source": {
        "source": "github",
        "repo": "dgokeeffe/databricks-bdd-tools",
        "ref": "main"
      }
    }
  ]
}

Quick start

Scaffold — Ask Claude Code to "set up BDD for my Databricks project"
Write features — "write Gherkin tests for my pricing compliance rules"
Generate steps — "implement step definitions for the new feature file"
Run — "run BDD smoke tests"

Architecture

Gherkin Feature Files (.feature)
  |
  v
Behave Step Definitions (.py)
  |
  v
call_rule() / Databricks SDK
  |
  v
Statement Execution API
  |
  v
Unity Catalog SQL Functions
  |
  v
Same functions used by production pipeline

The key insight: SQL functions in Unity Catalog are the single source of truth. BDD tests call them via the Statement Execution API, and the production pipeline (Lakeflow Spark Declarative Pipelines) calls the same functions in materialized views. Because tests and pipeline share one definition of each rule, the two cannot drift apart.
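
To make the `call_rule()` layer in the diagram more concrete, here is a minimal sketch of what such a helper could look like. It is not the plugin's actual implementation: the signature, the `:p0`-style parameter names, and the `build_rule_statement` helper are assumptions made for illustration. Only `WorkspaceClient.statement_execution.execute_statement`, `StatementParameterListItem`, and `StatementState` come from `databricks-sdk`.

```python
import re
from typing import Any, Sequence

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_rule_statement(catalog: str, schema: str, function: str,
                         args: Sequence[Any]) -> tuple[str, list[dict]]:
    """Build a parameterized SELECT that invokes a Unity Catalog SQL function."""
    for part in (catalog, schema, function):
        # Identifiers cannot be bound as parameters, so validate them instead.
        if not _IDENT.match(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    placeholders = ", ".join(f":p{i}" for i in range(len(args)))
    sql = f"SELECT {catalog}.{schema}.{function}({placeholders}) AS result"
    # Values are sent as strings; no explicit parameter type is set here.
    params = [
        {"name": f"p{i}", "value": None if v is None else str(v)}
        for i, v in enumerate(args)
    ]
    return sql, params


def call_rule(warehouse_id: str, catalog: str, schema: str,
              function: str, *args: Any):
    """Hypothetical helper: run the function on a SQL warehouse, return the scalar."""
    # Imported lazily so the statement builder works without the SDK installed.
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.sql import (
        StatementParameterListItem,
        StatementState,
    )

    sql, params = build_rule_statement(catalog, schema, function, args)
    response = WorkspaceClient().statement_execution.execute_statement(
        warehouse_id=warehouse_id,
        statement=sql,
        parameters=[StatementParameterListItem(**p) for p in params],
        wait_timeout="30s",
    )
    if response.status.state != StatementState.SUCCEEDED:
        raise RuntimeError(f"statement ended in state {response.status.state}")
    # data_array holds rows of string-encoded values; one row, one column here.
    return response.result.data_array[0][0]
```

A step definition could then call something like `call_rule(warehouse_id, "main", "pricing", "is_compliant", 10.5, "AU")`, where the catalog, schema, and function names are placeholders. Splitting statement construction from execution keeps the SQL-building logic checkable without a workspace.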

Test suite

The test-suite/ directory contains a working proof-of-concept with:

Feature files for Unity Catalog schema operations and SQL data operations
Step definitions using databricks-sdk and Statement Execution API
environment.py with ephemeral schema lifecycle management
behave.ini configuration
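
As a rough idea of what "ephemeral schema lifecycle management" in `environment.py` can involve, the sketch below creates a uniquely named schema before the run and drops it afterwards. It is an illustration under assumptions, not the file shipped in `test-suite/`: the `bdd_test` prefix, the `catalog` userdata key, and the `main` default are all hypothetical. `before_all` and `after_all` are standard Behave hooks, and `schemas.create` / `schemas.delete` are `databricks-sdk` calls.

```python
import uuid


def ephemeral_schema_name(prefix: str = "bdd_test") -> str:
    """Return a schema name that is unique per test run."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def before_all(context):
    """Behave hook: create a throwaway schema before any feature runs."""
    # Imported lazily so the naming helper can be used without the SDK.
    from databricks.sdk import WorkspaceClient

    context.workspace = WorkspaceClient()
    # "catalog" is an assumed userdata key, e.g. `behave -D catalog=dev`.
    context.catalog = context.config.userdata.get("catalog", "main")
    context.schema = ephemeral_schema_name()
    context.workspace.schemas.create(
        name=context.schema, catalog_name=context.catalog
    )


def after_all(context):
    """Behave hook: drop the schema and everything created inside it."""
    context.workspace.schemas.delete(
        full_name=f"{context.catalog}.{context.schema}", force=True
    )
```

Using a random suffix means concurrent runs (for example, two CI jobs) do not collide on the same schema, and `force=True` removes the schema even if scenarios left tables behind.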

Run with:

cd test-suite
uv run behave --tags="@smoke" --format=pretty

Prerequisites

Python 3.10+
uv for package management
databricks-sdk and behave (installed via uv add --group test behave databricks-sdk)
Authenticated Databricks CLI profile or environment variables
A SQL warehouse (auto-discovered if not specified)
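
The README does not spell out how the warehouse is auto-discovered, so the following is only one plausible approach: prefer an explicitly configured ID, otherwise pick a running warehouse, otherwise fall back to the first one listed. The function names and the selection order are assumptions; `WorkspaceClient().warehouses.list()` is the `databricks-sdk` call used to enumerate warehouses, and reading `DATABRICKS_WAREHOUSE_ID` here is a choice made for this sketch.

```python
import os
from typing import Iterable, Optional


def pick_warehouse_id(warehouses: Iterable,
                      preferred_id: Optional[str] = None) -> str:
    """Choose a SQL warehouse: explicit ID first, then a running one, then any."""
    if preferred_id:
        return preferred_id
    candidates = list(warehouses)
    if not candidates:
        raise RuntimeError("no SQL warehouse is visible to this identity")
    for wh in candidates:
        # The SDK reports state as an enum; compare on its string value.
        state = getattr(wh.state, "value", wh.state)
        if state == "RUNNING":
            return wh.id
    # Nothing is running: fall back to the first warehouse listed.
    return candidates[0].id


def discover_warehouse_id() -> str:
    """Hypothetical entry point combining the environment and the workspace."""
    # Imported lazily so pick_warehouse_id works without the SDK installed.
    from databricks.sdk import WorkspaceClient

    return pick_warehouse_id(
        WorkspaceClient().warehouses.list(),
        preferred_id=os.environ.get("DATABRICKS_WAREHOUSE_ID"),
    )
```

Keeping the selection logic in a pure function makes it easy to check against stand-in objects without touching a workspace.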

License

Apache 2.0
