Skip to content

feat: report missing count on duplicate and validity checks#2588

Merged
m1n0 merged 1 commit intomainfrom
dtl-1653-cbre-add-null-count-for-duplicate-and-validity-in-the
Feb 24, 2026
Merged

feat: report missing count on duplicate and validity checks#2588
m1n0 merged 1 commit intomainfrom
dtl-1653-cbre-add-null-count-for-duplicate-and-validity-in-the

Conversation

@m1n0
Copy link
Contributor

@m1n0 m1n0 commented Feb 17, 2026

Description

Please provide a description of your changes:

Adds missing count to validity check, reports it in diagnostics for missing and duplicate count.

Checklist

  • I added a test to verify the new functionality.
  • I verified this PR does not break soda-extensions.

@sonarqubecloud
Copy link

@m1n0
Copy link
Contributor Author

m1n0 commented Feb 17, 2026

@Niels-b another one changing the diagnostics, this time adds a new key. soda/tests/only_postgres/test_all_check_types.py passes fine locally

@m1n0 m1n0 requested review from Niels-b and mivds February 17, 2026 21:17
Copy link
Contributor

@Niels-b Niels-b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding new keys will not break the DWH. This just won't be included in the DWH diagnostics. Can always be changed later.
LGTM!

Copy link
Contributor

@mivds mivds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just want to check off next steps:

  • Is the intent to add support for these properties on BE?
  • Do we need to support this in DWH? Would make a conscious decision here and create a ticket if yes, otherwise we'll forget about it until a customer complains about it

"failedRowsPercent": check_result.diagnostic_metric_values.get("invalid_percent"),
"datasetRowsTested": check_result.diagnostic_metric_values.get("dataset_rows_tested"),
"checkRowsTested": check_result.diagnostic_metric_values.get("check_rows_tested"),
"missingCount": check_result.diagnostic_metric_values.get("missing_count"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not supported (yet) by BE from what I see. I think it will get quietly ignored rather than raise errors about unknown fields, but not 100% sure.

Same comment applies for duplicate check below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BE ignores unknown fields, this PR was meant to be a draft, there is a ticket for BE to do follow up and will be first tested and then released next week

@m1n0
Copy link
Contributor Author

m1n0 commented Feb 19, 2026

Yes indeed this was meant to be a draft PR. All context and planning in https://linear.app/sodadata/issue/DTL-1653/cbre-add-null-count-for-duplicate-and-validity-in-the-diagnostic

@m1n0 m1n0 marked this pull request as draft February 19, 2026 08:26
@m1n0 m1n0 marked this pull request as ready for review February 24, 2026 11:19
@m1n0
Copy link
Contributor Author

m1n0 commented Feb 24, 2026

merging after BE and FE now support this. Everything is working as expected

@m1n0 m1n0 merged commit 9c9eda7 into main Feb 24, 2026
41 checks passed
@m1n0 m1n0 deleted the dtl-1653-cbre-add-null-count-for-duplicate-and-validity-in-the branch February 24, 2026 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants