Emulating GETDATE() on Azure SQL Database

GETDATE() always uses the UTC time zone on Azure SQL Database which can be a compatibility issue for applications that assume that GETDATE() is using a different time zone. Developers may wish to replace GETDATE() with code that continues to use their expected time zone to avoid UTC time reforms. For example, I’ve seen application code with over 50,000 references to the GETDATE function. Replacing all of those calls with GETUTCDATE() would be a herculean effort.

FAST 1

I don’t know how good Erik’s SEO is, but here’s a simple replacement if you don’t care about performance and just need something that gives you the previous behavior of GETDATE():

CREATE OR ALTER FUNCTION dbo.[GetDateNew]()
RETURNS DATETIME
WITH SCHEMABINDING
AS
BEGIN
    RETURN SYSDATETIMEOFFSET() AT TIME ZONE N'Central Standard Time';
END;
GO

Keep reading if you’d like to do better than that.

AT TIME ZONE At Risk

AT TIME ZONE was a great addition to SQL Server 2016 that allowed for the retirement of many well-meaning but poorly implemented attempts to do time zone conversions in T-SQL. However, it was not without its own problems:

  • The CPU cost per execution was surprisingly high, but this was addressed in SQL Server 2022 and in Azure SQL Database.
  • AT TIME ZONE makes scalar UDFs ineligible for inlining.
  • AT TIME ZONE usage results in an “unknown” cardinality estimate.

You can see the cardinality estimate issue in action by populating a simple table with about 6 million rows.

DROP TABLE IF EXISTS dbo.Level100;

CREATE TABLE dbo.Level100 (
    TenantID INT NOT NULL,
    InsertTime DATETIME NOT NULL
);

INSERT INTO dbo.Level100 (TenantID, InsertTime)
SELECT q.RN % 4, DATEADD(MINUTE, -1 * RN, GETDATE())
FROM
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2
) q;

CREATE INDEX TenantID_InsertTime ON dbo.Level100 (TenantID) INCLUDE (InsertTime);

CREATE INDEX InsertTime_TenantID ON dbo.Level100 (InsertTime) INCLUDE (TenantID);

The first query uses GETDATE():

SELECT COUNT(*)
FROM dbo.Level100 l
WHERE l.TenantID = 2
AND l.InsertTime > DATEADD(MINUTE, -10, GETDATE())
OPTION (MAXDOP 1);

The filter against TenantID is expected to return 25% of the rows in the table and the filter against InsertTime is expected to return less than 0.001% of the rows. As expected, the query optimizer chooses to filter against the InsertTime_TenantID index and the query finishes instantly:

a66 good

The second query uses AT TIME ZONE:

SELECT COUNT(*)
FROM dbo.Level100 l
WHERE l.TenantID = 2
AND l.InsertTime > DATEADD(MINUTE, -10, CAST(SYSDATETIMEOFFSET() AT TIME ZONE N'Central Standard Time' AS DATETIME))
OPTION (MAXDOP 1);

Once again, the filter against TenantID is expected to return 25% of the rows in the table. However, the presence of AT TIME ZONE results in an unknown inequality estimate against InsertTime which is 30%. The query optimizer makes a different choice and goes with the TenantID_InsertTime index because 25% < 30%:

a66 bad

Meme Preparation

It seems that we need to return to the bad old days before we had AT TIME ZONE available in SQL Server. One advantage that we have is that the function only needs to do the time zone conversion for the current moment in time. Central time is six hours behind UTC at the time of publication, so we could construct a function like this:

CREATE OR ALTER FUNCTION dbo.[GetDateCT](@PassInGetDateUTC DATETIME)
RETURNS DATETIME
WITH SCHEMABINDING
AS
BEGIN
    RETURN DATEADD(HOUR, -6, @PassInGetDateUTC);
END;
GO

The input parameter is there to make the function eligible for inlining because GETDATE() and other similar functions prevent inlining. Returning to the same query as before:

SELECT COUNT(*)
FROM dbo.Level100 l
WHERE l.TenantID = 2
AND l.InsertTime > DATEADD(MINUTE, -10, dbo.[GetDateCT](GETUTCDATE()))
OPTION (MAXDOP 1);

We can see that the new function results in a reasonable cardinality estimate and the query optimizer makes the correct index choice:

a66 also good

Of course, this function definition will need to be updated a few times per year to deal with daylight savings time changes. It isn’t too difficult to design a process to automatically perform these updates, but honestly I expect that the hassle of doing something will exceed the tolerance of most companies. However, this option is available to you if you need to preserve the usually superior cardinality estimation behavior enjoyed with GETDATE().

How to Convert Time Zones in SQL Server?

a66 maymay

Parsing and Arsing

Any solution you pick, other than a naked reference to SYSDATETIMEOFFSET() along with AT TIME ZONE, may result in parsing issues in your code. For example, some parts of certain queries do not allow for subqueries, so a subquery replacement against a table-valued function will not work as a direct replacement. Even the simplest scalar UDF reference can result in code that fails to parse, such as the following:

CREATE TABLE #t (
    StupidColumn DATETIME dbo.[GetDateNew]()
);

Fortunately, most of these issues are likely rare in practice and should be straightforward to address.

Keep in mind that you also need to fix GETDATE() references in areas such as default column values, constraints, computed columns, default column tables for table types, and so on. I would personally use a naked reference to SYSDATETIMEOFFSET() along with AT TIME ZONE for these areas as I see no benefit in using a scalar UDF.

Summary of Issues for Various Solutions

I was going to put some introduction text here, but whatever, you can figure out the point of this section on your own.

Inline table-valued function approach

  • Most T-SQL compatibility issues
  • Very verbose
  • Cardinality issues with AT TIME ZONE

Naked SYSDATETIMEOFFSET() AT TIME ZONE N’Central Standard Time’ replacement

  • Very verbose
  • Cardinality issues with AT TIME ZONE

Simple scalar UDF with AT TIME ZONE:

  • UDF is not eligible for inlining
  • UDF will return different values as the query executes which probably isn’t what you want

Non-static scalar UDF:

  • Function definition must occasionally be refreshed
  • Somewhat verbose
  • Some types of queries, such as those with CTEs, will not allow any UDF to be inlined

Final Thoughts

Messy no-win situations like this highlight the importance of picking the right cloud platform for your application. Azure SQL Managed Instance and Azure VMs avoid this problem entirely. Thanks for reading!

A Quirk When Rewriting Scalar UDFs In SQL Server

Back To Business


I think I have probably spent 500 hours of my life rewriting T-SQL Scalar UDFs to avoid all the performance problems associated with them.

The obvious choice is the Inline Table Valued Function, which has fewer performance issues baked in. For the kids out there: they don’t spill trauma.

But getting the rewrite right can be tricky, especially when it’s possible for the function to return NULL values.

I’m going to walk through a simple example, and show you how to get the results you want, without adding abusing your developers.

What is not covered in this post are all the performance issues caused by UDFs. If you want to get into that, click the training link at the bottom of this post.

The Problem


Here’s the function we need to rewrite. It returns a simple bit value if a particular user was active after a certain date:

CREATE OR ALTER FUNCTION
    dbo.rewrite
(
    @UserId int,
    @LastAccessDate datetime
)
RETURNS bit
AS
BEGIN
    DECLARE
        @b bit = 0,
        @d datetime = GETDATE(); /*NOFROID4U*/
    
    SELECT
        @b = 
            CASE
                WHEN u.Id IS NOT NULL
                THEN 1
                ELSE 0
            END
    FROM dbo.Users AS u
    WHERE u.Id = @UserId
    AND   u.LastAccessDate > @LastAccessDate;

    RETURN
        @b;
END;
GO

Since I’m using SQL Server 2022 in compatibility level 160, I’m declaring a useless datetime parameter and using GETDATE() to set it to a value to avoid scalar UDF inlining.

We can call it about like so (again, this query is too trivial to suffer any performance issues), and get some reasonable-looking results back.

SELECT TOP (10)
    p.OwnerUserId,
    p.LastActivityDate,
    thing0 = 
        dbo.rewrite
        (
            p.OwnerUserId, 
            GETDATE()
        )
FROM dbo.Posts AS p
WHERE p.Score = 1;
SQL Server Query Results
who i smoke

Writeable Media


Rewriting this function looks straightforward. All we need to do is Robocop a few parts and pieces and badabing badaboom we’re done.

Note that to really complete this, we’d also need to add a convert to bit to avoid SQL Server implicitly converting the output of the case expression to a (potentially) different datatype, but we’ll fix that in the final rewrite.

CREATE OR ALTER FUNCTION
    dbo.the_rewrite
(
    @UserId int,
    @LastAccessDate datetime
)
RETURNS table
WITH SCHEMABINDING
AS
RETURN    
    SELECT
        b = 
            CASE
                WHEN u.Id IS NOT NULL
                THEN 1
                ELSE 0
            END
    FROM dbo.Users AS u
    WHERE u.Id = @UserId
    AND   u.LastAccessDate > @LastAccessDate;
GO

Of course, this alters how we need to reference the function in the calling query. Inline table valued functions are totally different types of objects from scalar UDFs.

SELECT TOP (10)
    p.OwnerUserId,
    p.LastActivityDate,
    thing1 = 
        (
            SELECT 
                t.b 
            FROM dbo.the_rewrite
            (
                p.OwnerUserId, 
                GETDATE()
            ) AS t
        )
FROM dbo.Posts AS p
WHERE p.Score = 1;

But the results are disappointing! Where we once had perfectly formed zeroes, now we have a bunch of NULLs that severely harsh our mellow.

SQL Server Query Results
torment

This can obviously cause problems for whomever or whatever is ingesting the result set.

  • Expectations: 1 or 0
  • Reality: NULL

Shame, that.

Changing The Query


Many developers will attempt something like this first, to replace NULLs in the calling query:

SELECT TOP (10)
    p.OwnerUserId,
    p.LastActivityDate,
    thing1 = 
        (
            SELECT 
                ISNULL
                (
                    t.b, 
                    0
                ) 
            FROM dbo.the_rewrite
            (
                p.OwnerUserId, 
                GETDATE()
            ) AS t
        )
FROM dbo.Posts AS p
WHERE p.Score = 1;

But this will still produce NULL realities where we have zeroed expectations. We could take a step way back and do something like this:

SELECT TOP (10)
    p.OwnerUserId,
    p.LastActivityDate,
    thing1 = 
        ISNULL
        (
            (
                SELECT 
                    t.b
                FROM dbo.the_rewrite
                (
                    p.OwnerUserId, 
                    GETDATE()
                ) AS t
            ),
            0
        )
FROM dbo.Posts AS p
WHERE p.Score = 1;
GO

But this is an ugly and annoying thing to remember. Imagine having to explain this to someone reading or trying to incorporate our beautiful new function into a query.

We should fix this inside the function.

Fixer Upper


I’m not going to pretend this is the only way to do this. You can likely figure out half a million ways to pet this cat. It’s just easy.

CREATE OR ALTER FUNCTION
    dbo.the_inner_rewrite
(
    @UserId int,
    @LastAccessDate datetime
)
RETURNS table
WITH SCHEMABINDING
AS
RETURN    
    
    SELECT
        b = 
            CONVERT
            (
                bit,
                MAX(x.b)
            )
    FROM
    (
        SELECT
            b = 
                CASE
                    WHEN u.Id IS NOT NULL
                    THEN 1
                    ELSE 0
                END
        FROM dbo.Users AS u
        WHERE u.Id = @UserId
        AND   u.LastAccessDate > @LastAccessDate
        
        UNION ALL
        
        SELECT
            b = 0
    ) AS x;
GO

We have:

  • Our original query, which may return 1 or 0 based on existence
  • A union all to a zero literal so that a result is guaranteed to be produced
  • An outer max to get the higher value between the two inner selects

And this will produce expected results, with the final output converted to a bit.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Tricky Situations When Rewriting Functions In SQL Server

Walk In Line


Far and away, some of the trickiest situations I run into when helping clients is rewriting scalar functions that have WHILE loops in them.

This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.

Sure, lots of IF/THEN/ELSE stuff can be tough too, though that’s often easier to  manage with CASE expressions in stacked CTEs or derived tables.

I ran across a really interesting function recently that I had to rewrite that had a couple WHILE loops in it, and I’ve simplified the example here to show my approach to fixing it.

Table Face


The original intent of the function was to do some string manipulation and return a cleaned version of it.

There were several loops that looked for “illegal” characters, add in formatting characters (like dashes), etc.

The problem the function caused wasn’t it running for a long time (we’ll talk more about that tomorrow), it was that the function was called in really critical code paths that Function Repercussions© were messing with:

  • Row by row execution
  • Inhibited parallelism

These are not the kinds of functions that are Froid Friendly© either. If they were, I could largely leave them alone. Maybe.

Depends on bugs.

Bad Way


The bad way of doing this is like so. If you write functions like this, feel bad. Let it burn a little.

Ten years ago, I’d understand. These days, there’s a billion blog posts about why this is bad.

CREATE OR ALTER FUNCTION
    dbo.CountLetters_Bad
(
    @String varchar(20)
)
RETURNS bigint
AS 
BEGIN

DECLARE
    @CountLetters bigint = 0,
    @Counter int = 0;
    
    WHILE 
        LEN(@String) >= @Counter
    BEGIN
        IF PATINDEX
           (
               '%[^0-9]%', 
               SUBSTRING
               (
                   @String, 
                   LEN(@String) - @Counter, 
                   1
               )
           ) > 0
        BEGIN
            SET @CountLetters += 1;
            SET @Counter += 1;
        END;        
        ELSE        
        BEGIN
            SET @Counter += 1;
        END;
    END;

    RETURN @CountLetters;

END;
GO 


SELECT 
    CountLetters = 
        dbo.CountLetters_Bad('1A1A1A1A1A');

Better Way


This is a better way to write this specific function. It doesn’t come with all the baggage that the other function has.

But the thing is, if you just test them with the example calls at the end, you wouldn’t nearly be able to tell the difference.

CREATE OR ALTER FUNCTION 
   dbo.CountLetters
(   
    @String AS varchar(20)  
)  
RETURNS table
AS
RETURN

WITH
    t AS 
(
    SELECT TOP(LEN(@String))
        *,
        s = 
            SUBSTRING
            (
                @String, 
                n.Number +1, 
                1
            )
    FROM dbo.Numbers AS n
)
    SELECT
        NumLetters = 
            COUNT_BIG(*)
    FROM t
    WHERE PATINDEX('%[^0-9]%', t.s) > 0;
GO

SELECT
    cl.*
FROM dbo.CountLetters('1A1A1A1A1A') AS cl;

Pop Quiz Tomorrow


This is a problem I run into a lot: developers don’t really test SQL code in ways that are realistic to how it’ll be used.

  • Look, this scalar UDF runs fine for a single value
  • Look, this view runs fine on its own
  • Look, this table variable is great when I pass a test value to it

But this is hardly the methodology you should be using, because:

  • You’re gonna stick UDFs all over huge queries
  • You’re gonna join that view to 75,000 other views
  • You’re gonna let users pass real values to table variables that match lots of data

In tomorrow’s post, I’m gonna show you an example of how to better test code that calls functions, and what to look for.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Be Careful Where You Call Inline Table Valued Functions In SQL Server Queries

Keep’em Comin!


While helping someone tame a bunch of rather unfortunate scalar valued functions, we eventually hit a point where they were able to do some of the rewrites themselves. During testing, they ran into a situation where performance got worse when they made the switch over, and it wasn’t because an Eager Index Spool popped up.

I was able to come up with a demo that shows a reasonable enough performance difference in a couple queries using the same technique as I gave them to fix things.

So uh, here goes that.

Muppet


The query they were tuning had a couple OUTER APPLYs already written into it, and so they added the function on as another.

SELECT
    u.DisplayName,
    b.Name,
    nbi.AvgPostsPerDay
FROM dbo.Users AS u
OUTER APPLY
(
    SELECT 
        CreationDate = 
            MIN(p.CreationDate)
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON  v.PostId = p.Id
        AND v.VoteTypeId = 1
    WHERE p.OwnerUserId = u.Id
) AS p
OUTER APPLY
(
    SELECT TOP (1)
        b.Name
    FROM dbo.Badges AS b
    WHERE b.UserId = u.Id
    ORDER BY b.Date DESC
) AS b
OUTER APPLY dbo.no_bueno_inline(u.Id, p.CreationDate) AS nbi
WHERE u.Reputation >= 100000
ORDER BY u.Reputation DESC;

Since they didn’t want to lose rows to the function, they couldn’t use CROSS APPLY. Good enough.

Moutarde


But what they really wanted was to move the function up into the select list, like this:

SELECT
    u.DisplayName,
    b.Name,
    AvgPostsPerDay = 
    (
        SELECT 
            nbi.AvgPostsPerDay 
        FROM dbo.no_bueno_inline
        (
            u.Id, 
            p.CreationDate
        ) AS nbi
    )
FROM dbo.Users AS u
OUTER APPLY
(
    SELECT 
        CreationDate = 
            MIN(p.CreationDate)
    FROM dbo.Posts AS p
    JOIN dbo.Votes AS v
        ON  v.PostId = p.Id
        AND v.VoteTypeId = 1
    WHERE p.OwnerUserId = u.Id
) AS p
OUTER APPLY
(
    SELECT TOP (1)
        b.Name
    FROM dbo.Badges AS b
    WHERE b.UserId = u.Id
    ORDER BY b.Date DESC
) AS b
WHERE u.Reputation >= 100000
ORDER BY u.Reputation DESC;

That way you don’t lose any rows like you could with CROSS APPLY, and the optimizer is more likely to holler at the function later on in the query plan, since the values from it are only being projected — that’s fancy for selected.

Mapperoo


The full query plan is a bit much to untangle quickly in this post, but the timing difference is noticeable enough for my purposes:

SQL Server Query Plan
tootin

So if you ever end up rewriting a scalar valued function as an inline table valued function, make sure you test calling it in the same way. Moving query syntax around may produce logically equivalent results, but won’t always produce equivalent performance.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

UDF Inlining And String Aggregations In SQL Server 2019

Quatro Enemigos


This post is really four different posts. Maybe five. The main points are:

Let’s start at the top, because tops are top.

Strung Out


If you’re like me and you got excited by the induction of STRING_AGG into the T-SQL Lexicon because of all the code odd-balling it would replace, you were likely also promptly disappointed for a few reasons.

  • Needing to convert the element to be AGGed to a MAX to avoid errors
    • STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation
  • The WITHIN GROUP ordering is clunky compared to an outer ORDER BY (but hey, logical query processing…)
  • No support for DISTINCT in the function, and an outer DISTINCT tries to make the wrong thing DISTINCT (see above)

And of course, it’s a breaking limitation for UDF inlining.

The UDF does not reference the STRING_AGG function

Let’s look at all of that stuff at once, in one function.

CREATE OR ALTER FUNCTION 
    dbo.IsStringAgg (@UserId int)
RETURNS 
    nvarchar(max)
WITH 
    RETURNS NULL ON NULL INPUT, 
    SCHEMABINDING
AS
BEGIN
RETURN
(
    SELECT
        STRING_AGG
        (
            CONVERT
            (
                nvarchar(MAX),
                b2.Name
            ), 
            N', '
        )
        WITHIN GROUP 
        (
            ORDER BY 
                b2.Name
        )
    FROM
    (
        SELECT DISTINCT 
            b.Name
        FROM dbo.Badges AS b
        WHERE b.UserId = @UserId
    ) AS b2
);
END;
GO

Not exactly  a thing of beauty, is it? Let’s hold onto that for one second, though.

XML > JSON


Okay, so let’s get back to that UDF documentation.

The UDF does not reference XML methods

That sort of reads like NOXML4U, which is my friend Jeremiah’s license plate. In reality though, it means all the .method thingies, like value, node, query, etc.

So if you were to try to follow my prior advice on string concatenation, the function couldn’t be inlined.

SELECT
   x = 
   (
       SELECT 
           [text()] = 
               b.Name
       FROM dbo.Badges AS b
       WHERE b.Id = 100564
       FOR XML 
           PATH(''),
           TYPE
   ).value
     (
         './text()[1]', 
         'nvarchar(max)'
     );

Having the .value breaks inlining. So there’s that. I believe this is where Reddit users post something like le sigh with some poorly drawn stick figure.

Del The Funktion


We can write the function semi-correctly like so:

CREATE OR ALTER FUNCTION 
    dbo.NotStringAgg (@UserId int)
RETURNS 
    nvarchar(max)
WITH 
    RETURNS NULL ON NULL INPUT, 
    SCHEMABINDING
AS
BEGIN
RETURN
    STUFF
    (
        (
            SELECT
                N', ' + 
                b.Name
            FROM dbo.Badges AS b
            WHERE b.UserId = @UserId
            GROUP BY b.Name
            ORDER BY b.Name
            FOR XML PATH (N'')
        ), 
        1,
        2,
        N''
    );
END;
GO

Alright, now let’s BRING THAT CODE BACK (air horn)

Sea Vessel Soliloquy


If we compare the execution plans for these two functions, the XML one gets inlined, and the STRING_AGG one does not.

SELECT TOP (5000)
    u.DisplayName,
    ThanksForAllYourHardWork = 
        dbo.NotStringAgg(u.Id)
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;
GO 

SELECT TOP (5000)
    u.DisplayName,
    ThanksForAllYourHardWork = 
        dbo.IsStringAgg(u.Id)
FROM dbo.Users AS u
ORDER BY u.Reputation DESC;
GO

Here’s the plan for the inlined function:

SQL Server Query Plan
blaze it

Here’s the plan for the non-inlined function:

SQL Server Query Plan
thin ice

Stay Thirsty


The inlined function finishes about twice as fast, though one may pause for a moment to consider whether the 400ms difference is an earth shattering kaboom in this case.

Of course, the real kicker is when scalar UDFs are invoked as part of larger queries where parallelism, etc. is important.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

The Academics Behind FROID

I am a heading


If you’re curious about all the stuff behind FROID, I’d highly suggest checking out this video by Andy Pavlo (b|t). It’s a great talk based on the academic material written about it, and made quite approachable and understandable even if you don’t understand all those fancy symbols that math people tell me are really numbers.

 

If you like this video, I’d highly suggest following Andy’s classes on YouTube. While they’re pretty database agnostic, they’re still awesome to watch to get a better understanding of how and why databases work, and do what they do. And there are some hot beats, to boot.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Rewriting Scalar UDFs For SQL Server 2019 Inlining

You May Find Yourself


SQL Server 2019 has been a bit of a roller coaster ride. In particular, UDF inlining started as I think the most interesting addition to the product. Big brain stuff, for sure.

It has been nerfed quite a bit, with seemingly more and more restrictions added to every cumulative update. Hopefully some of these can be lifted at the feature matures, but I understand how difficult all this is.

People program absolute bloodbaths into functions.

Today, I want to look at one restriction that has a fairly simple workaround: Calling GETDATE().

Emptied


Let’s use a simple function that figures out if fewer than 90 days exist between two dates.

CREATE OR ALTER FUNCTION dbo.sneaky
(
    @StartDate datetime,
    @EndDate datetime
)
RETURNS bit
WITH
    SCHEMABINDING,
    RETURNS NULL ON NULL INPUT
AS
BEGIN
RETURN 
    CASE 
        WHEN DATEDIFF
             (
                 DAY,
                 @StartDate,
                 ISNULL(@EndDate, GETDATE())
             ) < 90
       THEN 1
       ELSE 0 
    END;
END;
GO

This function can’t be inlined, because we call GETDATE() inside the function body. We can witness all the baddities that scalar UDFs cause as usual.

SELECT
    u.DisplayName,
    sneaky = 
        dbo.sneaky(u.CreationDate, u.LastAccessDate)
FROM dbo.Users AS u
WHERE EXISTS
      (
          SELECT
              1/0
          FROM dbo.Posts AS p
          JOIN dbo.Votes AS v
              ON v.PostId = p.Id
          WHERE p.OwnerUserId = u.Id
      );

The query won’t be allowed to use parallelism, the function will execute once per row it needs to process, etc. etc. etc.

SQL Server Query Plan
quarters

Cool Street


The better option is to use a third function argument that you can pass GETDATE() to.

CREATE OR ALTER FUNCTION dbo.sneakier
(
    @StartDate datetime,
    @EndDate datetime,
    @FallBack datetime
)
RETURNS bit
WITH
    SCHEMABINDING,
    RETURNS NULL ON NULL INPUT
AS
BEGIN
RETURN 
    CASE 
        WHEN DATEDIFF
             (
                 DAY,
                  @StartDate,
                  ISNULL(@EndDate, @FallBack)
             ) < 90
       THEN 1
       ELSE 0 
    END;
END;
GO

You could also change the calling query to protect from NULLs and remove the check from the function, like so:

dbo.sneaky(u.CreationDate, ISNULL(u.LastAccessDate, GETDATE()))

Calling the new function like so results in a much faster query execution time:

SELECT
    u.DisplayName,
    sneaky = 
        dbo.sneakier(u.CreationDate, u.LastAccessDate, GETDATE())
FROM dbo.Users AS u
WHERE EXISTS
      (
          SELECT
              1/0
          FROM dbo.Posts AS p
          JOIN dbo.Votes AS v
              ON v.PostId = p.Id
          WHERE p.OwnerUserId = u.Id
      );
GO
SQL Server Query Plan
on my head?

The plan is allowed to go parallel, and rather than the function being hidden in a Compute Scalar operator, it’s represented in the query plan by Constant Scan operators that produce the necessary rows.

SQL Server Query Plan
too many horses

Pokemon Drift


These types of rewrites will probably become more common as people move to newer versions of SQL Server, and embrace higher compatibility levels where these features are allowed to maneuver.

Of course, at the rate things change, that may be a long ways off.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Common Table Expressions Are Useful For Rewriting Scalar Functions In SQL Server

Punching



Thanks for watching!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

A Bug With Recursive UDFs When Inlined In SQL Server 2019

Enough Already


I see people do things like this fairly often with UDFs. I don’t know why. It’s almost like they read a list of best practices and decided the opposite was better.

This is a quite simplified function, but it’s enough to show the bug behavior.

While writing this, I learned that you can’t create a recursive (self-referencing) scalar UDF with the schemabinding option. I don’t know why that is either.

Please note that this behavior has been reported to Microsoft and will be fixed in a future update, though I’m not sure which one.

Swallowing Flies


Let’s take this thing. Let’s take this thing and throw it directly in the trash where it belongs.

CREATE OR ALTER FUNCTION dbo.how_high
(
    @i int,
    @h int
)
RETURNS int
WITH
    RETURNS NULL ON NULL INPUT
AS
BEGIN

    SELECT 
       @i += 1;
    
    IF @i < @h
    BEGIN
        SET 
            @i = dbo.how_high(@i, @h);
    END;

    RETURN @i;

END;
GO

Seriously. You’re asking for a bad time. Don’t do things like this.

Unless you want to pay me to fix them later.

Froided


In SQL Server 2019, under compatibility level 150, this is what the behavior looks like currently:

/*
Works
*/
SELECT 
    dbo.how_high(0, 36) AS how_high;
GO 

/*
Fails
*/
SELECT 
    dbo.how_high(0, 37) AS how_high;
GO

The first execution returns 36 as the final result, and the second query fails with this message:

Msg 217, Level 16, State 1, Line 40
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32).

A bit odd that it took 37 loops to exceed the nesting limit of 32.

This is the bug.

Olded


With UDF inlining disabled, a more obvious number of loops is necessary to encounter the error.

/*
Works
*/
SELECT 
    dbo.how_high(0, 32) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO 

/*
Fails
*/
SELECT 
    dbo.how_high(0, 33) AS how_high
OPTION (USE HINT('DISABLE_TSQL_SCALAR_UDF_INLINING'));
GO

The first run returns 32, and the second run errors out with the same error message as above.

Does It Matter?


It’s a bit hard to imagine someone relying on that behavior, but I found it interesting enough to ask some of the nice folks at Microsoft about, and they confirmed that it shouldn’t happen. Again, it’ll get fixed, but I’m not sure when.

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

When Does Scalar UDF Inlining Work In SQL Server?

The Eye


UPDATE: After writing this and finding the results fishy, I reported the behavior described below in “Somewhat Surprising” and “Reciprocal?” and it was confirmed a defect in SQL Server 2019 CU8, though I haven’t tested earlier CUs to see how far back it goes. If you’re experiencing this behavior, you’ll have to disable UDF inlining in another way, until CU releases resume in the New Year.

With SQL Server 2019, UDF inlining promises to, as best it can, inline all those awful scalar UDFs that have been haunting your database for ages and making queries perform terribly.

But on top of the long list of restrictions, there are a number of other things that might inhibit it from kicking in.

For example, there’s a database scoped configuration:

ALTER DATABASE SCOPED CONFIGURATION SET TSQL_SCALAR_UDF_INLINING = ON/OFF; --Toggle this

SELECT 
    dsc.*
FROM sys.database_scoped_configurations AS dsc
WHERE dsc.name = N'TSQL_SCALAR_UDF_INLINING';

There’s a function characteristic you can use to turn them off:

CREATE OR ALTER FUNCTION dbo.whatever()
RETURNS something
WITH INLINE = ON/OFF --Toggle this
GO

And your function may or not even be eligible:

SELECT 
    OBJECT_NAME(sm.object_id) AS object_name,
    sm.is_inlineable
FROM sys.sql_modules AS sm
JOIN sys.all_objects AS ao
    ON sm.object_id = ao.object_id
WHERE ao.type = 'FN';

Somewhat Surprising


One thing that caught me off guard was that having the database in compatibility level 140, but running the query in compatibility level 150 also nixed the dickens out of it.

DBCC FREEPROCCACHE;
GO 

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 140;
GO 

WITH Comments AS 
(
    SELECT
        dbo.serializer(1) AS udf, --a function
        ROW_NUMBER() 
            OVER(ORDER BY 
                     c.CreationDate) AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n BETWEEN 1 AND 100
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_150'), MAXDOP 8);
GO

Our query has all the hallmarks of one that has been inflicted with functions:

SQL Server Query Plan
it can’t go parallel

And if you’re on SQL Server 2016+, you can see that it executes once per row:

SELECT 
    OBJECT_NAME(defs.object_id) AS object_name,
    defs.execution_count,
    defs.total_worker_time,
    defs.total_physical_reads,
    defs.total_logical_writes,
    defs.total_logical_reads,
    defs.total_elapsed_time
FROM sys.dm_exec_function_stats AS defs;
SQL Server Query Plan
rockin’ around

Reciprocal?


There’s an odd contradiction here, though. If we repeat the experiment setting the database compatibility level to 150, but running the query in compatibility level 140, the function is inlined.

DBCC FREEPROCCACHE;
GO 

ALTER DATABASE StackOverflow2013 SET COMPATIBILITY_LEVEL = 150;
GO 

WITH Comments AS 
(
    SELECT
        dbo.serializer(c.Id) AS udf,
        ROW_NUMBER() 
            OVER(ORDER BY 
                     c.CreationDate) AS n
    FROM dbo.Comments AS c
)
SELECT 
    c.*
FROM Comments AS c
WHERE c.n BETWEEN 1 AND 100
OPTION(USE HINT('QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_140'), MAXDOP 8);
GO

Rather than seeing a non-parallel plan, and non-parallel plan reason, we see a parallel plan, and an attribute telling us that a UDF has been inlined.

SQL Server Query Plan
call hope

And if we re-check the dm_exec_function_stats DMV, it will have no entries. That seems more than a little bit weird to me, but hey.

I’m just a lowly consultant on SSMS 18.6

Thanks for reading!

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.