Skip to content

Feat!: Separate SQL dialect from data source#2600

Merged
mivds merged 2 commits intomainfrom
DTL-1696/split-dialect-datasource
Feb 25, 2026
Merged

Feat!: Separate SQL dialect from data source#2600
mivds merged 2 commits intomainfrom
DTL-1696/split-dialect-datasource

Conversation

@mivds
Copy link
Contributor

@mivds mivds commented Feb 25, 2026

Description

This PR separates SqlDialect from DataSourceImpl to facilitate testing of the SQL generation without needing access to an actual database.

References to the data source within the SqlDialect have been replaced:

  • sqlglot dialect is now defined at class scope and injected via a class construction argument in __init_subclass__
  • Athena storage location getter is optionally passed at dialect initialization time. If not passed table creation is disabled (will throw an exception when called).

Checklist

  • I added a test to verify the new functionality.
  • I verified this PR does not break soda-extensions.

Warning

This PR breaks extensions due to decoupling the dialect from data source. Suggest releasing this as a minor version bump.

https://github.com/sodadata/soda-extensions/pull/279 updates soda-extensions to work with these changes

@mivds mivds requested review from Niels-b and m1n0 February 25, 2026 11:05
@mivds mivds self-assigned this Feb 25, 2026
@sonarqubecloud
Copy link

Copy link
Contributor

@Niels-b Niels-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks for doing these changes!



class SqlServerSqlDialect(SqlDialect):
class SqlServerSqlDialect(SqlDialect, sqlglot_dialect="tsql"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the name tsql? Transact-sql?

NIT: write it out fully

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tsql is the name for dialect SQL Server uses. Not a name I chose, this is what Microsoft named it and how sqlglot also refers to it. This sqlglot_dialect field ends up being passed to sqlglot in sql_utils.apply_sampling_to_sql. We basically had a bug in passing our own naming to sqlglot, but it hasn't been a problem so far as the only database where we used it is snowflake. The names happen to match there, but they don't for other variants.

Proof:

>>> import sqlglot
>>> sqlglot.parse_one("SELECT 1 FROM my_table", read="tsql")
Select(
  expressions=[
    Literal(this=1, is_string=False)],
  from_=From(
    this=Table(
      this=Identifier(this=my_table, quoted=False))))
>>> sqlglot.parse_one("SELECT 1 FROM my_table", read="sqlserver")
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    sqlglot.parse_one("SELECT 1 FROM my_table", read="sqlserver")
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mvds/soda/repos/soda-core/.venv/lib/python3.13/site-packages/sqlglot/__init__.py", line 135, in parse_one
    dialect = Dialect.get_or_raise(read or dialect)
  File "/home/mvds/soda/repos/soda-core/.venv/lib/python3.13/site-packages/sqlglot/dialects/dialect.py", line 1007, in get_or_raise
    suggest_closest_match_and_fail("dialect", dialect_name, all_dialects)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mvds/soda/repos/soda-core/.venv/lib/python3.13/site-packages/sqlglot/helper.py", line 59, in suggest_closest_match_and_fail
    raise ValueError(f"Unknown {kind} '{word}'.{similar}")
ValueError: Unknown dialect 'sqlserver'.

@mivds mivds merged commit 11cc2e9 into main Feb 25, 2026
41 checks passed
@mivds mivds deleted the DTL-1696/split-dialect-datasource branch February 25, 2026 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants