Skip to content

opt: inline UDFs as subqueries#92955

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
mgartner:inline-udf-1
Jan 4, 2023
Merged

opt: inline UDFs as subqueries#92955
craig[bot] merged 1 commit intocockroachdb:masterfrom
mgartner:inline-udf-1

Conversation

@mgartner
Copy link
Copy Markdown
Contributor

@mgartner mgartner commented Dec 2, 2022

UDFs are now inlined as subqueries by a normalization rule when possible, speeding up their evaluation.

Epic: CRDB-20370

Release note (performance improvement): Some types of user-defined functions are now inlined in query plans as relation expressions, which speeds up their evaluation. UDFs must be non-volatile and have a single statement in the function body to be inlined.

@mgartner mgartner requested a review from a team as a code owner December 2, 2022 20:50
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice!

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)


pkg/sql/opt/norm/inline_funcs.go line 440 at r1 (raw file):

	// argForParam returns the argument that can be substituted for the given
	// column, if the column is a parameter of the UDF. It returns ok=false if
	// the column is not a UDF parameter.

Nice, your work to disambiguate parameters and arguments is paying off here.


pkg/sql/opt/norm/inline_funcs.go line 481 at r1 (raw file):

		&memo.SubqueryPrivate{
			OriginalExpr: nil,
			Ordering:     ordering,

[nit] Is it necessary to propagate an ordering here, since the input is guaranteed to return at most one row? Any required ordering should already be preserved by the limit, right?

[also nit] why explicitly set OriginalExpr here?


pkg/sql/opt/norm/testdata/rules/inline line 1149 at r1 (raw file):

 │         └── "?column?":13 = 100 [outer=(13), constraints=(/13: [/100 - /100]; tight), fd=()-->(13)]
 └── projections
      └── (i:2 + 20) + 10 [as=add:20, outer=(2), immutable]

Why isn't this folded into i + 30? Just don't have a rule for it I guess?

Copy link
Copy Markdown
Collaborator

@rharding6373 rharding6373 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @mgartner)

Copy link
Copy Markdown
Contributor Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball)


pkg/sql/opt/norm/inline_funcs.go line 440 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Nice, your work to disambiguate parameters and arguments is paying off here.

🎉


pkg/sql/opt/norm/inline_funcs.go line 481 at r1 (raw file):
You're right, the Limit maintains the ordering. This might change when we support set-returning UDFs, so I've left some TODOs for now so we don't forget.

[also nit] why explicitly set OriginalExpr here?

Good catch. That was just leftover; I've removed it.


pkg/sql/opt/norm/testdata/rules/inline line 1149 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Why isn't this folded into i + 30? Just don't have a rule for it I guess?

Ya, there's not a rule for it:

defaultdb> EXPLAIN (OPT) SELECT i+10+20 FROM generate_series(1, 5) g(i);
                  info
-----------------------------------------
  project
   ├── project-set
   │    ├── values
   │    │    └── ()
   │    └── zip
   │         └── generate_series(1, 5)
   └── projections
        └── (generate_series + 10) + 20
(8 rows)

add(10, add(i, 200)) = 100 would be normalized (in a project or filter) though because we have NormalizeCmpPlusConstwhich would fire twice.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 12, 2022

Build failed (retrying...):

@yuzefovich
Copy link
Copy Markdown
Member

I think this PR is introducing a flake in TestLogic_udf/volatility.

bors r-

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 12, 2022

Canceled.

@mgartner
Copy link
Copy Markdown
Contributor Author

@yuzefovich where do you see that?

@yuzefovich
Copy link
Copy Markdown
Member

The bors batch failed and then Bazel Essential CI on this PR is red for the same reason on fakedist-disk config, and I haven't seen this flake anywhere else, so I'm assuming it's this PR that is exposing the flake (or introducing it). It could be a coincidence though.

UDFs are now inlined as subqueries by a normalization rule when
possible, speeding up their evaluation.

Release note (performance improvement): Some types of user-defined
functions are now inlined in query plans as relation expressions, which
speeds up their evaluation. UDFs must be non-volatile and have a single
statement in the function body to be inlined.
@mgartner
Copy link
Copy Markdown
Contributor Author

mgartner commented Jan 4, 2023

Fixed the flake.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 4, 2023

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants