Feat: Add percentage-based sampling for databricks #2597

Merged: mivds merged 8 commits into main from DTL-1697/databricks-sampling on Feb 25, 2026

@mivds (Contributor) commented Feb 24, 2026

Summary

  • Add TABLESAMPLE (pct PERCENT) support to DatabricksSqlDialect for percentage-based sampling
  • Follow the same pattern as the Postgres TABLESAMPLE BERNOULLI implementation
  • Add unit tests for valid sampling (10%, 25%, 100%) and error cases (ABSOLUTE_LIMIT, out-of-range percentages)
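For readers who have not seen the diff, the behaviour summarized above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `SamplerType` enum, the `databricks_sample_clause` function, and the accepted range of 0 to 100 inclusive are assumptions made for the example. Only the `TABLESAMPLE (pct PERCENT)` syntax and the rejection of `ABSOLUTE_LIMIT` and out-of-range percentages come from the description.

```python
from enum import Enum


class SamplerType(Enum):
    # Hypothetical enum for this sketch; the PR text only names ABSOLUTE_LIMIT.
    PERCENTAGE = "percentage"
    ABSOLUTE_LIMIT = "absolute_limit"


def databricks_sample_clause(sampler_type: SamplerType, value: float) -> str:
    """Build a Databricks TABLESAMPLE clause for percentage-based sampling.

    Hypothetical helper, not the actual DatabricksSqlDialect method.
    """
    # Only percentage sampling is handled here; other sampler types are rejected.
    if sampler_type is not SamplerType.PERCENTAGE:
        raise ValueError(f"Unsupported sampler type: {sampler_type.name}")
    # Assumed valid range: 0 to 100 inclusive.
    if not 0 <= value <= 100:
        raise ValueError(f"Sample percentage must be between 0 and 100, got {value}")
    # The 'g' format drops a trailing '.0', so 10.0 renders as '10'.
    return f"TABLESAMPLE ({value:g} PERCENT)"


# Usage: append the clause to a table reference (table name is a placeholder).
clause = databricks_sample_clause(SamplerType.PERCENTAGE, 25)
query = f"SELECT * FROM my_table {clause}"
```

The Postgres counterpart mentioned above would use `TABLESAMPLE BERNOULLI (pct)` in place of the `(pct PERCENT)` form.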

Test plan

  • Unit tests pass: uv run pytest soda-databricks/tests/unit/test_databricks_dialect.py
  • Integration test against Databricks with TEST_DATASOURCE=databricks

🤖 Generated with Claude Code

@mivds changed the base branch from main to DTL-1696/postgres-sampling on February 24, 2026 12:18
@mivds self-assigned this on Feb 24, 2026
@mivds marked this pull request as ready for review on February 24, 2026 12:24
@mivds requested review from Niels-b and m1n0 on February 24, 2026 12:24
@mivds changed the title from "Add percentage-based sampling for Databricks" to "Feat: Add percentage-based sampling for Databricks" on Feb 24, 2026
@mivds changed the title from "Feat: Add percentage-based sampling for Databricks" to "Feat: Add percentage-based sampling for databricks" on Feb 24, 2026

@Niels-b (Contributor) left a comment

PR itself looks good. Left one "big picture" comment. No need to change it now.

@mivds force-pushed the DTL-1696/postgres-sampling branch from 83d66e0 to 2a9da32 on February 25, 2026 13:09
Add TABLESAMPLE (pct PERCENT) support to DatabricksSqlDialect, following
the same pattern as the Postgres BERNOULLI implementation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mivds force-pushed the DTL-1697/databricks-sampling branch from f92eea6 to 1bbfb60 on February 25, 2026 13:23
Base automatically changed from DTL-1696/postgres-sampling to main February 25, 2026 14:09
@mivds merged commit 8e7ee09 into main on Feb 25, 2026
41 checks passed
@mivds deleted the DTL-1697/databricks-sampling branch on February 25, 2026 14:29

3 participants