Fix: small fixes for Synapse data source by Niels-b · Pull Request #2608 · sodadata/soda-core

Niels-b · 2026-02-26T15:20:27Z

Description

Note: this builds on the Fabric PR. I've split them up so changes are more clear.

synapse_data_source.py

Added connection parameter to SynapseDataSourceImpl.init (needed for create_copy_with_different_connection)
Added build_create_table_sql override with WITH (HEAP) (Synapse columnstore indexes don't support varchar(MAX))
Removed encode_string_for_sql call in _build_insert_into_values_row_sql (was corrupting newlines)

soda-synapse/.../synapse_data_source_test_helper.py

Added drop_schema_if_exists override (Synapse doesn't support IF EXISTS / CASCADE on DROP SCHEMA)
What problem are you solving?
- Fixing issues surfaced by Synapse development
Any expected impact on downstream packages/services?
- Yes, these issues were surfaced while developing extensions
Reference issue # if available.

Checklist

I added a test to verify the new functionality.
I verified this PR does not break soda-extensions.
- Soda extensions PR: https://github.com/sodadata/soda-extensions/pull/283

for more information, see https://pre-commit.ci

sonarqubecloud · 2026-02-26T15:21:33Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
42.3% Duplication on New Code

See analysis details on SonarQube Cloud

mivds · 2026-02-26T16:31:13Z

soda-synapse/src/soda_synapse/common/data_sources/synapse_data_source.py

+        # Synapse uses clustered columnstore indexes by default, which don't support varchar(MAX).
+        # Use HEAP to create a heap table instead.
+        create_table_sql = create_table_sql + "\nWITH (HEAP)"


Does this have any negative impact with regard to query performance? I guess this is also used to create DWH tables where we store failed rows. Since that's potentially a lot of data, we should ensure query performance is good.

Some input from the friendly neighborhood AI:

When WITH (HEAP) is a good idea: * Staging tables / transient loads (ELT pipelines) * Some small tables / lookup-ish scenarios (though clustered index can also be a good fit) When it’s usually a bad idea * Large fact tables / “main warehouse tables” where you benefit from columnstore compression and scan performance (the default CCI is usually preferred).

For our use case I'm not sure it's a good idea from the query perspective. But then I have no insight on the VARCHAR(MAX) support or alternatives. So, your call

I was doubting this as well. There is a significant trade-off here. After your review on this I checked with the PM: we decided to do it with HEAP now, so we can still support VARCHAR(MAX). The alternative is to use VARCHAR(8000), which is a relatively low limit.

This can still change. We'll include so we can test it, but won't enable it for customers. So we're not locking ourselves in either method at the moment. We'll verify this with CEs as well.

Thank you for pointing this out again 🙌 !

mivds · 2026-02-26T16:31:54Z

soda-synapse/src/soda_synapse/common/data_sources/synapse_data_source.py


    def _build_insert_into_values_row_sql(self, values: VALUES_ROW) -> str:
        values_sql: str = "SELECT " + ", ".join([self.literal(value) for value in values.values])
-        values_sql = self.encode_string_for_sql(values_sql)


You mentioned issues with encoding \n in the PR description, but do we risk introducing different issues by removing this? (Also applies to the Fabric PR)

It only applies to testing code and DWH IIRC.
The issue is tested specifically in DWH tests, where I've seen some issues with the strings containing \n or special quoting or...
Removing it actually solves it.

Niels-b and others added 3 commits February 26, 2026 10:27

Fix: Small updates to Fabric for DWH compatibility

8176c36

[pre-commit.ci] auto fixes from pre-commit.com hooks

7511896

for more information, see https://pre-commit.ci

Fix: small fixes for Synapse data source

9b2b0f2

Niels-b marked this pull request as ready for review February 26, 2026 16:04

Niels-b requested review from m1n0, mivds and paulteehan February 26, 2026 16:05

mivds reviewed Feb 26, 2026

View reviewed changes

mivds approved these changes Feb 26, 2026

View reviewed changes

Base automatically changed from r-472-fabric to main February 27, 2026 08:40

Niels-b merged commit 85602e4 into main Feb 27, 2026
41 checks passed

Niels-b deleted the r-471-synapse branch February 27, 2026 09:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix: small fixes for Synapse data source#2608

Fix: small fixes for Synapse data source#2608
Niels-b merged 3 commits intomainfrom
r-471-synapse

Niels-b commented Feb 26, 2026 •

edited

Loading

Uh oh!

sonarqubecloud bot commented Feb 26, 2026

Uh oh!

mivds Feb 26, 2026 •

edited

Loading

Uh oh!

Niels-b Feb 27, 2026 •

edited

Loading

Uh oh!

mivds Feb 26, 2026

Uh oh!

Niels-b Feb 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Niels-b commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Uh oh!

sonarqubecloud bot commented Feb 26, 2026

Quality Gate passed

Uh oh!

mivds Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Niels-b Feb 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mivds Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Niels-b Feb 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Niels-b commented Feb 26, 2026 •

edited

Loading

mivds Feb 26, 2026 •

edited

Loading

Niels-b Feb 27, 2026 •

edited

Loading