Skip to content

opt: refactor and test the JoinMultiplicity library#50133

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
DrewKimball:refactor_multiplicity
Jun 16, 2020
Merged

opt: refactor and test the JoinMultiplicity library#50133
craig[bot] merged 1 commit intocockroachdb:masterfrom
DrewKimball:refactor_multiplicity

Conversation

@DrewKimball
Copy link
Copy Markdown
Collaborator

Previously, the logic in the JoinMultiplicity library was complicated
and untested. Multiple bugs have been both found and introduced
during refactoring of the original JoinFiltersMatchAllLeftRows logic.

This patch refactors the code to make it easier to understand and
introduces unit testing for the library.
This patch also makes the following changes to the behavior of
multiplicity_builder:

  1. Fixed an issue that caused filtersMatchAllLeftRows to return false
    positives.
  2. Foreign key columns only invalidate a foreign key if they were
    nullable in the base table. Since we derive the foreign key
    from a left not-null column in the first place, we know that
    columns from this table haven't been null-extended, so a not-null
    column will stay not-null. This is useful when foreign key
    columns are not in the output set.
  3. All columns from a table (including non-output columns) are
    added to the UnfilteredCols field. This allows foreign key
    relations to be validated even when the foreign key columns are
    not in the output set.

Fixes #50059

Release note: None

@DrewKimball DrewKimball requested a review from a team as a code owner June 12, 2020 05:42
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@DrewKimball DrewKimball force-pushed the refactor_multiplicity branch from 6cc1a4c to 5c2ceee Compare June 12, 2020 07:00
Copy link
Copy Markdown
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:, I've looked this one over and my feedback is incorporated.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Very nice!

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/memo/multiplicity_builder.go, line 507 at r1 (raw file):

// getTableIDsFromCols returns all unique TableIDs from the given ColSet.
func getTableIDsFromCols(md *opt.Metadata, cols opt.ColSet) (tables []opt.TableID) {

[nit] add "Columns that don't come from a base table are ignored".


pkg/sql/opt/memo/multiplicity_builder_test.go, line 31 at r1 (raw file):

	// CREATE TABLE fk_tab (
	// 		r1 INT NOT NULL REFERENCES xy(x),
	//		r2 INT REFERENCES xy(x),

[nit] indentation


pkg/sql/opt/memo/multiplicity_builder_test.go, line 35 at r1 (raw file):

	// );
	// CREATE TABLE abc (
	//		a INT, b INT, c INT,

[nit] inconsistent indendation


pkg/sql/opt/memo/multiplicity_builder_test.go, line 58 at r1 (raw file):

	oneNullMultiColFKScan, oneNullMultiColFKCols := ob.oneNullMultiColFKScan()

	testCases := []struct {

[nit] These tests would have been candidates for using exprgen which allows you to specify exact opt trees in regular tests, e.g. https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/memo/testdata/logprops/join#L726

I am not suggesting rewriting these tests at this point, just mentioning it so you're aware it exists.


pkg/sql/opt/memo/multiplicity_builder_test.go, line 291 at r1 (raw file):

var createTableStatements = []string{
	"CREATE TABLE xy (x INT PRIMARY KEY, y INT);",

[nit] you could define all these in a single string (and use backquotes) and put it inside or above the test (and pass it to makeOpBuilder/createTables). Then it's all in one place. You can use parser.SplitFirstStatement to split off one statement at a time.

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/memo/multiplicity_builder.go, line 298 at r1 (raw file):

// 5. All left columns come from a single table, and all right columns come from
//    a single table.
func verifyFilters(left, right RelExpr, filters FiltersExpr) bool {

[nit] The name of this helper function should probably be a bit more descriptive -- verifyFilters is a bit too generic

Copy link
Copy Markdown
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)


pkg/sql/opt/memo/multiplicity_builder.go, line 298 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

[nit] The name of this helper function should probably be a bit more descriptive -- verifyFilters is a bit too generic

Done.


pkg/sql/opt/memo/multiplicity_builder.go, line 507 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] add "Columns that don't come from a base table are ignored".

Done.


pkg/sql/opt/memo/multiplicity_builder_test.go, line 31 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] indentation

Done.


pkg/sql/opt/memo/multiplicity_builder_test.go, line 35 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] inconsistent indendation

Done.


pkg/sql/opt/memo/multiplicity_builder_test.go, line 58 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] These tests would have been candidates for using exprgen which allows you to specify exact opt trees in regular tests, e.g. https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/opt/memo/testdata/logprops/join#L726

I am not suggesting rewriting these tests at this point, just mentioning it so you're aware it exists.

Ah, wish I'd though of that.


pkg/sql/opt/memo/multiplicity_builder_test.go, line 291 at r1 (raw file):

Previously, RaduBerinde wrote…

[nit] you could define all these in a single string (and use backquotes) and put it inside or above the test (and pass it to makeOpBuilder/createTables). Then it's all in one place. You can use parser.SplitFirstStatement to split off one statement at a time.

Done.

Previously, the logic in the JoinMultiplicity library was complicated
and untested. Multiple bugs have been both found and introduced
during refactoring of the original JoinFiltersMatchAllLeftRows logic.

This patch refactors the code to make it easier to understand and
introduces unit testing for the library.
This patch also makes the following changes to the behavior of
multiplicity_builder:
1. Fixed an issue that caused filtersMatchAllLeftRows to return false
   positives.
2. Foreign key columns only invalidate a foreign key if they were
   nullable in the base table. Since we derive the foreign key
   from a left not-null column in the first place, we know that
   columns from this table haven't been null-extended, so a not-null
   column will stay not-null. This is useful when foreign key
   columns are not in the output set.
3. All columns from a table (including non-output columns) are
   added to the UnfilteredCols field. This allows foreign key
   relations to be validated even when the foreign key columns are
   not in the output set.

Fixes cockroachdb#50059

Release note: None
@DrewKimball DrewKimball force-pushed the refactor_multiplicity branch from 5c2ceee to 5564a19 Compare June 15, 2020 19:06
Copy link
Copy Markdown
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @RaduBerinde)


pkg/sql/opt/memo/multiplicity_builder_test.go, line 58 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Ah, wish I'd though of that.

I knew about it, but my main concern with using our existing frameworks was not lack of control, but that it tends to be verbose. Here, each extra case is < 10 lines; with our existing frameworks they're probably 50-100 lines each. For something as tricky as FK multiplicities, I think we need tests that zero right in on exactly what we're trying to test (singe multiplicity property), and no more. Maybe we needed a new prop name=multiplicity command where you can specify the property(ies) you want to output, and all else is elided.

@DrewKimball
Copy link
Copy Markdown
Collaborator Author

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 16, 2020

Build succeeded

@craig craig bot merged commit 96feb13 into cockroachdb:master Jun 16, 2020
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andy-kimball, @DrewKimball, and @RaduBerinde)


pkg/sql/opt/memo/multiplicity_builder_test.go, line 58 at r1 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

I knew about it, but my main concern with using our existing frameworks was not lack of control, but that tend to be verbose. Here, each extra case is < 10 lines; with our existing frameworks they're probably 50-100 lines each. For something as tricky as FK multiplicities, I think we need tests that zero right in on exactly what we're trying to test (singe multiplicity property), and no more. Maybe we needed a new prop name=multiplicity command where you can specify the property(ies) you want to output, and all else is elided.

As a newcomer to the codebase, I agree. I'm not always sure the original intent of test cases when there's a lot of "set up" for a very specific case. I'm a fan of commenting on these types of tests to make it clear in prose what the intent of the test is. I think this helps prevent future changes from changing the test in such a way that it no longer covers the originally intended case.

@DrewKimball DrewKimball deleted the refactor_multiplicity branch June 18, 2020 04:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opt: filtersMatchAllLeftRows returns false positives

6 participants