Skip to content

feat: Expand structured data types and type casting to all applicable check types#2599

Merged
m1n0 merged 3 commits intomainfrom
dtl-1657-support-structured-data-and-type-casting
Feb 25, 2026
Merged

feat: Expand structured data types and type casting to all applicable check types#2599
m1n0 merged 3 commits intomainfrom
dtl-1657-support-structured-data-and-type-casting

Conversation

@m1n0
Copy link
Contributor

@m1n0 m1n0 commented Feb 24, 2026

Description

Expands the feature to a few more check types. Adds tests.

Black was not present in the list of uv dev packages, so I also added that and updated the lock file.

Accompanying PR for extensions -
Core will need to be bumped in ext after this one is released.

✅ Manual CI run on postgres for these two matching PRs here https://github.com/sodadata/soda-extensions/actions/runs/22398826472

Checklist

  • I added a test to verify the new functionality.
  • I verified this PR does not break soda-extensions.

@m1n0 m1n0 requested a review from Niels-b February 24, 2026 19:33
@sonarqubecloud
Copy link

"polars",
"pandas",
"pyarrow",
"black>=26.1.0",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this!

Comment on lines +42 to +50
variables:
id_col_expr:
default: '"id"::varchar'
age_str_col_expr:
default: '"age_str"::integer'
json_col_expr:
default: "json_col::json->>'unique_id'"
born_date_col_expr:
default: '"born_date_str"::DATE'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not expecting any impact on this. Just checking an edge case.
Do we have a test where we test the same "variable" in multiple parts of the contract?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure, we use one mechanism to resolve variables for everything so that should not be a concern 🤔

@m1n0 m1n0 merged commit 5ca68fc into main Feb 25, 2026
41 checks passed
@m1n0 m1n0 deleted the dtl-1657-support-structured-data-and-type-casting branch February 25, 2026 13:51
Copy link
Contributor

@mivds mivds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too late for review, but oh well. Something to think about for refactor maybe 🙂

MaxTimestampMetricImpl(
contract_impl=contract_impl,
check_impl=self,
column=self.column,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems odd. I had expected a column_expression line to replace it, but I only see removal

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, on looking below I see MaxTimestampMetricImpl uses check_impl.column_expression, so it will work... just seems a bit weird design. Why does the metric have a dependency on the check? This seems like another case where the concepts are entangled with a two-way binding between metric and check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes it is, but I did not introduce it here, checkImp objects are passed into most(or all?) metrics 👎 Fortunately they are not being saved and used further, just data being extracted...I will still have a small follow up on this as I think I should keep the self.column in the metric identity as before, but yeah, no cleanup of the entanglement for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants