A Claude Code plugin for BDD testing with Python Behave and Databricks. Generates Gherkin feature files, step definitions, and test suites for pipelines, Unity Catalog, Apps, and jobs.
Data teams rarely build test suites. The cost of setting up a test harness, learning a framework, and maintaining tests alongside evolving schemas has rarely seemed justifiable against shipping the next feature.
BDD with Gherkin changes the equation:
- Readable by everyone — compliance officers, regulators, and engineers can all read Given/When/Then scenarios
- Agent-friendly — AI agents produce better test coverage from structured Gherkin than from freeform requirements
- Test-production parity — BDD tests call real Unity Catalog functions via the Statement Execution API, the same functions your pipeline uses
For more context, see "The test suite nobody had to write: agentic BDD for data pipelines".
| Skill | Description |
|---|---|
| `bdd-scaffold` | Set up a complete Behave project with Databricks SDK wiring, ephemeral test schemas, and Makefile targets |
| `bdd-features` | Generate Gherkin feature files from requirements, code, or user stories |
| `bdd-steps` | Implement Python step definitions that call Databricks SDK / Statement Execution API |
| `bdd-run` | Execute Behave with tag filtering, parallel execution, JUnit/JSON reporting, and failure diagnosis |
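
As a rough illustration of what a `bdd-run` invocation might boil down to, the sketch below assembles a Behave command line with tag filtering plus JUnit and JSON reports. The helper names (`behave_command`, `run_behave`) and the `reports` directory are hypothetical and not part of the plugin; only standard Behave flags are used, and parallel execution is not covered.

```python
from __future__ import annotations

import subprocess


def behave_command(tags: str = "@smoke", report_dir: str = "reports") -> list[str]:
    """Build a Behave command with tag filtering and JUnit/JSON reports.

    The function name and report layout are illustrative assumptions.
    """
    return [
        "uv", "run", "behave",
        f"--tags={tags}",                         # run only scenarios matching the tag expression
        "--junit",                                # emit JUnit XML for CI systems
        f"--junit-directory={report_dir}/junit",
        "--format=json",                          # machine-readable results...
        f"--outfile={report_dir}/results.json",   # ...written to this file
        "--format=pretty",                        # human-readable output on stdout
    ]


def run_behave(tags: str = "@smoke", cwd: str = "test-suite") -> int:
    """Run Behave in `cwd` and return its exit code (non-zero on failures)."""
    return subprocess.run(behave_command(tags), cwd=cwd).returncode
```

Calling `run_behave("@smoke")` from the repository root would run the smoke scenarios in `test-suite/`, assuming `uv` and the test dependencies are installed.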
Add to your Claude Code settings (.claude/settings.json):
    {
      "plugins": [
        {
          "source": {
            "source": "github",
            "repo": "dgokeeffe/databricks-bdd-tools",
            "ref": "main"
          }
        }
      ]
    }

- Scaffold: ask Claude Code to "set up BDD for my Databricks project"
- Write features: "write Gherkin tests for my pricing compliance rules"
- Generate steps: "implement step definitions for the new feature file"
- Run: "run BDD smoke tests"
Gherkin Feature Files (.feature)
|
v
Behave Step Definitions (.py)
|
v
call_rule() / Databricks SDK
|
v
Statement Execution API
|
v
Unity Catalog SQL Functions
|
v
Same functions used by production pipeline
The key insight: SQL functions in Unity Catalog are the single source of truth. BDD tests call them via the Statement Execution API. The production pipeline (Lakeflow Spark Declarative Pipelines) calls the same functions in materialized views. No drift.
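
To make the flow in the diagram concrete, here is a minimal sketch of what a `call_rule()` helper could look like. The signature, the `build_rule_statement` helper, and the `(value, sql_type)` argument convention are assumptions made for illustration; the project's actual helper may differ. It relies on the Databricks SDK's `statement_execution.execute_statement` call with named parameter markers.

```python
from __future__ import annotations

from typing import Any, Sequence


def build_rule_statement(function_name: str, arg_count: int) -> str:
    """Build 'SELECT fn(:p0, :p1, ...) AS result' using named parameter markers.

    `function_name` is interpolated directly, so it should come from trusted
    step code, never from untrusted input.
    """
    markers = ", ".join(f":p{i}" for i in range(arg_count))
    return f"SELECT {function_name}({markers}) AS result"


def call_rule(client: Any, warehouse_id: str, function_name: str,
              args: Sequence[tuple[str, str]]) -> str | None:
    """Call a Unity Catalog SQL function and return its scalar result as a string.

    `client` is expected to be a databricks.sdk.WorkspaceClient.
    `args` holds (value, sql_type) pairs, e.g. [("19.99", "DECIMAL(10,2)")].
    This signature is a hypothetical sketch, not the plugin's confirmed API.
    """
    # Imported lazily so build_rule_statement stays usable without the SDK.
    from databricks.sdk.service.sql import StatementParameterListItem, StatementState

    response = client.statement_execution.execute_statement(
        statement=build_rule_statement(function_name, len(args)),
        warehouse_id=warehouse_id,
        parameters=[
            StatementParameterListItem(name=f"p{i}", value=value, type=sql_type)
            for i, (value, sql_type) in enumerate(args)
        ],
        wait_timeout="30s",  # statements still running after this are treated as failures here
    )
    if response.status.state != StatementState.SUCCEEDED:
        raise RuntimeError(f"{function_name} did not succeed: {response.status}")
    rows = response.result.data_array or []
    return rows[0][0] if rows else None
```

Because the test and the pipeline both resolve the same fully qualified function name, a step definition only needs to compare the returned value with the expectation stated in the scenario.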
The test-suite/ directory contains a working proof-of-concept with:
- Feature files for Unity Catalog schema operations and SQL data operations
- Step definitions using `databricks-sdk` and Statement Execution API
- `environment.py` with ephemeral schema lifecycle management
- `behave.ini` configuration
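
The ephemeral schema lifecycle mentioned above could be wired into Behave's `before_all` and `after_all` hooks roughly as follows. This is a sketch under assumptions, not the repository's actual `environment.py`: the `bdd_test` prefix, the `catalog` userdata key, and the `main` default are hypothetical, and `force=True` on schema deletion assumes a reasonably recent `databricks-sdk`.

```python
from __future__ import annotations

import uuid


def ephemeral_schema_name(prefix: str = "bdd_test") -> str:
    """Return a unique schema name so concurrent test runs do not collide."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


def before_all(context):
    """Behave hook: create a throwaway schema before any feature runs."""
    # Lazy import keeps the helper above usable without the SDK installed.
    from databricks.sdk import WorkspaceClient

    context.workspace = WorkspaceClient()  # uses the CLI profile or environment variables
    # Hypothetical userdata key; pass with `behave -D catalog=my_catalog`.
    context.catalog = context.config.userdata.get("catalog", "main")
    context.schema = ephemeral_schema_name()
    context.workspace.schemas.create(name=context.schema, catalog_name=context.catalog)


def after_all(context):
    """Behave hook: drop the schema, even when scenarios failed."""
    schema = getattr(context, "schema", None)  # unset if before_all failed early
    if schema:
        context.workspace.schemas.delete(
            full_name=f"{context.catalog}.{schema}",
            force=True,  # also removes objects created inside the schema
        )
```

Generating the name per run, rather than reusing a fixed test schema, is what lets several developers or CI jobs share one catalog without interfering with each other.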
Run with:
    cd test-suite
    uv run behave --tags="@smoke" --format=pretty

Requirements:

- Python 3.10+
- uv for package management
- `databricks-sdk` and `behave` (installed via `uv add --group test behave databricks-sdk`)
- Authenticated Databricks CLI profile or environment variables
- A SQL warehouse (auto-discovered if not specified)
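
One way warehouse auto-discovery could work is sketched below: use an explicitly configured ID if present, otherwise pick a running warehouse from those visible to the authenticated identity. The `resolve_warehouse_id` helper and the `DATABRICKS_WAREHOUSE_ID` variable name are assumptions for illustration; the project may use a different setting or selection rule.

```python
from __future__ import annotations

import os
from typing import Any, Mapping


def resolve_warehouse_id(client: Any, env: Mapping[str, str] = os.environ) -> str:
    """Return a SQL warehouse ID, preferring explicit configuration.

    `client` is expected to be a databricks.sdk.WorkspaceClient.
    The environment variable name is a hypothetical convention.
    """
    explicit = env.get("DATABRICKS_WAREHOUSE_ID")
    if explicit:
        return explicit

    warehouses = list(client.warehouses.list())
    if not warehouses:
        raise RuntimeError("No SQL warehouse is visible to the current identity")

    # Prefer a running warehouse to avoid waiting for a cold start.
    running = [w for w in warehouses if getattr(w.state, "value", None) == "RUNNING"]
    return (running or warehouses)[0].id
```

With a `WorkspaceClient` built from the CLI profile, `resolve_warehouse_id(WorkspaceClient())` would yield an ID suitable for Statement Execution API calls.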
License: Apache 2.0