feat: Expand structured data types and type casting to reference check by m1n0 · Pull Request #2602 · sodadata/soda-core

m1n0 · 2026-02-25T16:16:36Z

Description

Expand the structured data types and type casting feature to the reference check. This was postponed until after the simpler check types to allow extra testing and a cautious release.

Checklist

I added a test to verify the new functionality.
I verified this PR does not break soda-extensions.

m1n0 · 2026-02-25T16:22:37Z

soda-core/src/soda_core/contracts/impl/check_types/freshness_check.py


    def _get_id_properties(self) -> dict[str, any]:
        id_properties: dict[str, str] = super()._get_id_properties()
-        id_properties["column"] = str(self.column_expression)


Adding the column to the identity is handled in super.
The column name is not necessary: if the user provides no expression, column_expression already equals the column name anyway.
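The point above can be sketched as follows. This is a minimal illustration only: `BaseCheck`, the `"type"` key, and the constructor arguments are hypothetical stand-ins, not the actual soda-core classes, and the sketch assumes the parent stores the column expression under the `"column"` key.

```python
from typing import Any, Optional


class BaseCheck:
    """Hypothetical stand-in for the parent check class."""

    def __init__(self, column_name: str, column_expression: Optional[str] = None):
        self.column_name = column_name
        # Assumption: without a user-provided expression, fall back to the column name.
        self.column_expression = column_expression or column_name

    def _get_id_properties(self) -> dict[str, Any]:
        # Assumption: the parent already records the column expression in the identity.
        return {"column": str(self.column_expression)}


class FreshnessCheck(BaseCheck):
    def _get_id_properties(self) -> dict[str, Any]:
        id_properties = super()._get_id_properties()
        # No need to set id_properties["column"] again here.
        id_properties["type"] = "freshness"  # hypothetical extra property
        return id_properties
```

With no expression, the identity carries the plain column name; with an expression, it carries the expression, so the subclass line removed in the diff would only have repeated what the parent does.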

m1n0 · 2026-02-25T16:24:01Z

soda-core/src/soda_core/contracts/impl/check_types/invalidity_check.py

+        referencing_column_expression: COLUMN | SqlExpressionStr = self.metric_impl.column_expression
+
+        if isinstance(referencing_column_expression, SqlExpressionStr):
+            # find and replace the column name in the column expression with the aliased version


Tested on the example in the comment and on JSON extraction (the latter is in the tests). This may not be bulletproof, but it should cover standard use cases.

mivds

Some tricky stuff in there with the referencing. Not fully confident reviewing that, but it seems to make sense 🙂

mivds · 2026-02-25T20:54:37Z

soda-core/src/soda_core/contracts/impl/check_types/invalidity_check.py

+
+        full_referencing_column_expression = referencing_column_expression

-        referencing_column_name: str = self.metric_impl.column_impl.column_yaml.name
+        if isinstance(referencing_column_expression, COLUMN):
+            full_referencing_column_expression = referencing_column_expression.IN(self.referencing_alias)


If I understand correctly, can't this be simplified to an else clause?

Suggested change

        full_referencing_column_expression = referencing_column_expression

        referencing_column_name: str = self.metric_impl.column_impl.column_yaml.name
        if isinstance(referencing_column_expression, COLUMN):
            full_referencing_column_expression = referencing_column_expression.IN(self.referencing_alias)
        else:
            referencing_column_expression = referencing_column_expression.IN(self.referencing_alias)

Yes, it can, and the type cast gives you a hint about what the else covers, but honestly I prefer the explicit isinstance, as it is 100% clear at first glance.
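To make the trade-off in this thread concrete, here is a small sketch comparing the two styles. `COLUMN` and `SqlExpressionStr` below are simplified, hypothetical stand-ins for the soda-core types, and `_alias_in_string`, `qualify_explicit`, and `qualify_with_else` are illustrative names that do not appear in the PR.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class COLUMN:
    """Hypothetical stand-in for the AST column node."""
    name: str
    table_alias: Optional[str] = None

    def IN(self, table_alias: str) -> "COLUMN":
        return COLUMN(self.name, table_alias)


@dataclass(frozen=True)
class SqlExpressionStr:
    """Hypothetical stand-in for a raw SQL expression string."""
    expression_str: str


Expr = Union[COLUMN, SqlExpressionStr]


def _alias_in_string(expr: SqlExpressionStr, column_name: str, alias: str) -> SqlExpressionStr:
    aliased = f'"{alias}".{column_name}'
    pattern = r"\b" + re.escape(column_name) + r"\b"
    # A function replacement avoids backslash processing in the replacement text.
    return SqlExpressionStr(re.sub(pattern, lambda _m: aliased, expr.expression_str))


def qualify_explicit(expr: Expr, column_name: str, alias: str) -> Expr:
    # Explicit style: each supported type is named; anything else passes through.
    full = expr
    if isinstance(expr, SqlExpressionStr):
        full = _alias_in_string(expr, column_name, alias)
    if isinstance(expr, COLUMN):
        full = expr.IN(alias)
    return full


def qualify_with_else(expr: Expr, column_name: str, alias: str) -> Expr:
    # Compact style: the else branch silently assumes "not a string" means COLUMN.
    if isinstance(expr, SqlExpressionStr):
        return _alias_in_string(expr, column_name, alias)
    else:
        return expr.IN(alias)
```

For the two types in the union the functions agree. They differ only if a third type ever reaches them: the explicit version returns it unchanged, while the else version fails on the missing `IN` method.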

mivds · 2026-02-25T20:56:20Z

soda-core/src/soda_core/contracts/impl/check_types/invalidity_check.py

+            aliased_column_name = f'"{self.referencing_alias}".{column_name}'
+            pattern = r"\b" + re.escape(column_name) + r"\b"
+            referencing_column_expression = SqlExpressionStr(
+                re.sub(pattern, aliased_column_name, referencing_column_expression.expression_str, count=1)


Why the count=1 here? Can there never be more than one occurrence? Not entirely clear on what might be inside this referencing_column_expression, so feel free to ignore

I am also not sure what's best. This was suggested by Claude and I am somewhat hesitant about the reasoning. Can there be more than one occurrence of the column in the expression? Certainly. I am not sure what people will attempt to use this for, but type casting and structured data extraction should normally be covered by using the column only once. I kind of like the "replace only once" limitation, as we don't want to encourage expressions that are too complex; we have dedicated check types for that.

We had a discussion and decided to allow multi-occurrence replacement. I updated the PR, thanks for flagging.
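For readers following the count=1 question, the difference can be sketched with plain `re.sub`. The helper name `alias_column` and the sample expressions are made up for illustration; only the word-boundary pattern and the quoted alias format are taken from the diff above.

```python
import re


def alias_column(expression: str, column_name: str, alias: str, count: int = 0) -> str:
    """Prefix whole-word occurrences of column_name with a quoted table alias.

    count=0 replaces every occurrence (re.sub's default); count=1 only the first.
    """
    aliased = f'"{alias}".{column_name}'
    pattern = r"\b" + re.escape(column_name) + r"\b"
    # A function replacement avoids backslash processing in the replacement text.
    return re.sub(pattern, lambda _m: aliased, expression, count=count)


expr = "COALESCE(payload->>'code', UPPER(payload->>'fallback'))"

once = alias_column(expr, "payload", "C", count=1)
# 'COALESCE("C".payload->>\'code\', UPPER(payload->>\'fallback\'))'

everywhere = alias_column(expr, "payload", "C")
# 'COALESCE("C".payload->>\'code\', UPPER("C".payload->>\'fallback\'))'

# The word boundary leaves longer identifiers such as payload_id alone...
partial = alias_column("payload_id + payload", "payload", "C")
# 'payload_id + "C".payload'

# ...but it cannot tell a column from a string literal with the same text.
literal = alias_column("payload->>'payload'", "payload", "C")
# '"C".payload->>\'"C".payload\''
```

With count=1 the second reference stays unqualified, which may be ambiguous once the referenced table is joined in. Replacing every occurrence avoids that, at the cost of also rewriting matches inside string literals, which fits the earlier remark that the approach is not bulletproof.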

sonarqubecloud · 2026-02-26T09:24:52Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

feat: Expand structured data types and type casting to reference check

d33ebad

style

a70e58d

m1n0 requested a review from mivds February 25, 2026 16:26

more robust replacement strategy

7757c19

mivds approved these changes Feb 25, 2026

allow multi replacement

2098f5e

m1n0 merged commit f65c079 into main Feb 26, 2026
41 checks passed

m1n0 deleted the dtl-1657-support-structured-data-and-type-casting branch February 26, 2026 10:20

m1n0 merged 4 commits into main from dtl-1657-support-structured-data-and-type-casting
