Free SQL Server Performance Monitoring: Lite Dashboard Overview



Summary

In this video, I delve into the exciting world of SQL Server monitoring tools, focusing on the Lite version of my free open-source tool, which is a lightweight and secure solution for monitoring performance across multiple servers. I walk through how to download and set up the Lite edition, highlighting its unique features such as no server-side installations or databases, making it an ideal choice for environments like Azure SQL Database where agent jobs aren’t available. The video covers the tool’s 20 collectors that run with minimal permissions, ensuring data collection is both efficient and secure. I also showcase the various tabs within the dashboard, including wait stats, query trends, CPU and memory usage graphs, blocking information, and performance monitor counters, all designed to provide a comprehensive view of SQL Server health without any heavy lifting on the server side.

Full Transcript

Erik Darling here with Darling Data. And boy, I just keep getting more and more excited about this monitoring tool journey that we’re on together. In case you have missed all of the other stuff around this, I released a free, open source SQL Server monitoring tool that makes it very easy for you to get started monitoring performance on your SQL Servers. The Lite version, which is what we’re going to talk about today, is like my second child. The full dashboard that I originally started working on had some limitations to it, and also some things about it that would probably cause friction for people using it. The main limitation is not being able to support Azure SQL Database, because it creates a database and logs data to it using agent jobs. If that’s too much for you, that’s fine. This is what the Lite version is for. So that was sort of the friction point, and the missing little bit of supportability in there. For the Lite version, if you head to code.erikdarling.com (I’ll show you that URL in a moment), all you have to do is download the zip for it and extract it to a single executable with an embedded DuckDB database. I am an unabashed DuckDB fanboy. I think it’s about one of the coolest things to happen in databases in a long time. It just starts running a bunch of data collection, pulling data into the DuckDB database and then showing it to you in the graphs. So you just extract it, point it to a server, hit connect. And if you trust SQL Server Management Studio or Azure Data Studio (which is getting retired this month, whatever) to connect to SQL Server, then you can trust my tools to connect to SQL Server, because they all use Windows Credential Manager. They do not store passwords and usernames in files or anywhere stupid. It’s all stored in Windows. So there’s nothing dumb happening, right? This is a very, very big deal to me. This is not some half-butted effort.
And, you know, I tried not to mess anything up from a security point of view. The Lite dashboard requires very minimal permissions. It really just needs VIEW SERVER PERFORMANCE STATE, or if that’s not available, VIEW SERVER STATE. A max of seven servers will collect at one time. So if you have more than seven servers that you’re monitoring with it, seven will go get their stuff, and then the next seven will go get their stuff.
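To make the batching idea concrete, here is an illustrative sketch (not the tool’s actual implementation) of collecting from monitored servers in rounds of at most seven, skipping any server that fails a quick “is this server online?” probe so the rest keep going. The server names and probe are made up for the example.

```python
# Illustrative sketch only: round-based collection with a cap of seven
# servers at a time, skipping servers that fail a liveness probe.
from itertools import islice

MAX_CONCURRENT = 7  # the Lite dashboard's per-round collection cap

def collection_rounds(servers, is_online, size=MAX_CONCURRENT):
    """Yield lists of up to `size` reachable servers to collect from."""
    reachable = iter(s for s in servers if is_online(s))
    while round_ := list(islice(reachable, size)):
        yield round_

# Ten hypothetical servers, one of which is offline.
servers = [f"sql{n:02d}" for n in range(1, 11)]
rounds = list(collection_rounds(servers, lambda s: s != "sql05"))
# Nine reachable servers split into a round of seven and a round of two.
```

The point of the cap is just to bound concurrent load; an unreachable server drops out of the rounds entirely instead of stalling one of them.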

And they all have a timeout on them. Now, I just redid the data collection for this, so that if you have servers that are timing out, they will not hinder collectors that are able to run. Before, I had done something kind of stupid, and it would keep trying the servers, even though some of the “is this server online?” queries were failing. So that’s fixed now. Got less stupid. The cool thing about the Lite edition, which is also sort of one of the drawbacks of it, is that there’s nothing installed on the server: no database, no agent jobs. But that means if the app is not open and running, it’s not actively collecting data. So if you close the app, and you go away for two weeks, and you come back and you’re like, what’s wrong with SQL Server, there’s not going to be anything in there for you. You do have to have it running somewhere to collect the stuff. But we’re going to talk through what it looks like, kind of how it works, the different things that it shows. These are the 20 collectors that it runs. So it’s a lot of stuff. And we run these as often as possible, so I can make sure that you have the most up to date data. This is all very lightweight stuff; I’ll show you that in a moment. And I’ll show you a couple of areas where I’m working on improvements, too. So again, if you want to check this out, I’ll show you a little bit more now. This thing is currently running and collecting data. I started it off around an hour or so ago, and I spun up my HammerDB workload to make sure that we had some stuff to look at, and it wasn’t all boring. So this is the wait stats tab, and it will show you the top 20 waits.
Really, it’s the top 10 waits, along with the usual suspect waits: SOS_SCHEDULER_YIELD, the CX waits, page latch waits, stuff like that, along with the poison waits too. You can see that over in this part here: I have THREADPOOL, and I have some other stuff that you would normally see when things are bad on a server. I want to make sure that we represent what’s actually happening on the server, and also call out some of the scarier stuff if that’s happening. Sometimes it’s good to see when that’s not happening, too. Over in the queries tab, we have some performance trends. These are graphs that tell you what was going on in a server. The Query Store one looks a little flat, because I didn’t turn on Query Store in the TPC databases that the workload’s running in. So that one looks a little dull. But all the rest of these are very interesting. This is query executions, that’s procedure duration, that’s query duration. And over here, you can see queries that are actively running. One thing is that you can sort this data any way you want. So if you wanted to sort by, I don’t know, CPU, you can change the way that the data is sorted and get stuff that has the highest CPU.

And you can also filter. So if you wanted to filter down to stuff with greater than zero CPU, you can do that too. You have queries by duration here; that’s the default view. But you can switch all these to order by whatever other thing you want. You can filter all these. These are stored procedures. And this is all the stuff from Query Store. Now, what’s neat about all of the query grids is something that you can do if you want to troubleshoot a slow query. You can right click on it, and you can hit copy repro script. And then you can go over to SSMS and paste it in. Now, this isn’t the greatest example, because there are some local variables in here that we don’t have the data types and stuff of, but we do have the execution parameter that got used in here for the where clause: the DW, whatever ID. We would just have to figure out, okay, what’s the data type, to run this query. It does warn you about that up at the top, too. But other than that, it’ll give you everything you need to reproduce a query just by copying and pasting it.
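The repro script idea can be sketched roughly like this. This is not the tool’s actual code, and the query, parameter name, and warning text are hypothetical; it just shows the shape of stitching captured query text and execution parameters into a paste-ready script, with a warning when local variables have unknown data types.

```python
# Illustrative sketch only: build a paste-ready repro script from
# captured query text and execution parameters. Names are made up.
def build_repro_script(query_text, parameters, unknown_variables=()):
    lines = []
    if unknown_variables:
        lines.append(
            "-- WARNING: declare these local variables yourself; "
            "their data types were not captured: "
            + ", ".join(unknown_variables)
        )
    for name, (sql_type, value) in parameters.items():
        lines.append(f"DECLARE {name} {sql_type} = {value};")
    lines.append(query_text)
    return "\n".join(lines)

script = build_repro_script(
    "SELECT * FROM dbo.Orders WHERE CustomerID = @CustomerID;",
    {"@CustomerID": ("integer", 42)},
    unknown_variables=["@SomeLocalVar"],
)
```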

Now, coming back to the dashboard. Oh, wait, is that the right one? Yeah, it’s the right one. Okay, good. Not losing my mind. We also have some individual graphs for things like CPU and memory usage. There are lines here, and there’s also some information up at the top about the plan cache, buffer pool, other things like that. You know, like how much physical memory the SQL Server has, how much it’s currently using. Some nice stuff to just see visually, instead of wondering, oh, what is this thing? One thing I actually forgot to mention is that you can hover over any graph line you see and get the data point, which legend entry it represents, and some other stuff about it. So, we’ve got information about tempdb. This is current space usage: what’s using a lot of space in there. You can see if it’s user objects, internal objects, or version store. And then down here, we have tempdb file latency, so you can see exactly which files are spiking up and when. Every once in a while you get lucky, and there’s a spike, and you can see what was using stuff and what was going on in there. Over in the blocking tab, there’s nothing going on for blocking right now, but we do have some deadlocks.

In fact, the deadlock alert just popped up and made my green screen angry. But we have some trends over here, which will tell us about deadlocks that are happening, and blocking that’s happening. And then you get the fully parsed-out deadlock graph over here. So I put as much information in front of you at one point as I can. Over here, we have perfmon counters. By default, I show batch requests, optimizations, compilations, and recompilations.

That’s usually the stuff that I eyeball first on a server. But you can click in any one of these and add a new line to it. Like, we just added in lock wait time, and because there’s a bunch of deadlocking and stuff on here, we get a whole new spiky thing going on in the graph. We also look at running agent jobs. This one is actually seeing the full Performance Monitor database agent job running and doing stuff.

So we’ve got that going on. In the configuration tab, you can see if anyone has changed trace flags, you get database scoped configurations, database configuration stuff from sys.databases (this needs some more columns in it; I’m going to make a note about that), and then the general server configuration as well. And then over here we see collection health, where we can see a summary of things that are running, how long they take on average, and a full log of everything.

So if you saw that something had some number of errors in this column, you could come over here and see when those errors happened. You could just filter down to the error message. And then over here are duration trends. Because I care about performance, I am working on some of these. Sometimes they take a little bit longer, especially the XML parsing ones. If you have a lot of deadlocks or a lot of blocking, or anything else that deals with XML in SQL Server, sometimes it takes a little bit longer, especially when the server is under a lot of load from a HammerDB test.

So I clearly have some work to do, which means I’m going to end this video here and go figure out which one of these jobs I need to have a talk with. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something. Again, this is all available for free at code.erikdarling.com. You should go try it out, start monitoring the performance of your SQL Servers, and have a good time. All right, cool. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: How To Talk To Your Performance Data With The MCP Server



Summary

In this video, I delve into the built-in MCP server feature of my free SQL Server monitoring tool. This feature allows users to interact with their collected monitoring data in a conversational manner using language models like Claude or other compatible tools. The MCP server is designed for local use only and does not expose any data beyond what’s available within the performance monitor database, ensuring that no sensitive information is at risk. By enabling this feature, you can ask questions about your server’s performance directly to an LLM, receiving quick insights into issues such as deadlocks, query performance, and workload spikes. This approach simplifies the process of diagnosing problems without requiring extensive manual analysis or script writing.

Full Transcript

Erik Darling here with Darling Data. And in this video, I want to go over a little bit about using the built-in MCP server. That is a Model Context Protocol server, available in my free SQL Server monitoring tool, that allows you to have a little bit of a chat with your collected monitoring data. It’s not allowed to go out and start running random ad hoc DMV queries. It’s not allowed to drop, delete, update, merge, or mess up anything. It’s very tightly bound to the tools that it has defined and available to it, and nothing else. Right? I wanted to have this in there for people who are maybe not fully comfortable looking through monitoring tools, maybe not fully comfortable digging into things and figuring out what was actually wrong and when. And, you know, maybe people who have gotten used to, over the last couple few years, having chats with LLMs about things and figuring things out with them. I mean, they’re not always right the first time. Sometimes it takes a little poking and prodding, and you’ve got to nudge them a little bit, but you can usually get there. So this feature is opt-in. It’s only running locally. It is not some big broadcast web server out there in the world. It’s just running as a localhost server on your machine. You have to turn it on. Again, I don’t want to alienate anyone who hates the robots. But if you find yourself using robots more and more in your day to day work, you can use this here. And then you can start talking to your performance data with the tools that I’ve made available to basically expose the tables and views. If you’re on the full version, that’s in the performance monitor database, so only that database. And if you’re on the Lite version, it’s just talking to the local DuckDB database where all the data got collected.
So you have nice, pre-arranged time series data; the model knows the schema and everything, and it can just talk directly to that. It can look at everything in there and tell you some pretty neat things about it. There are a whole bunch of tools available for different things: discovery, waits, CPU, queries, memory, blocking. There’s a whole bunch of stuff. There are like 30 or so tools available to it across all the different data collectors. So you have a lot that you can look at and dig into. And what you have to do, and I’ll show you what this looks like, is enable the MCP server. So again, not on by default; you turn it on. And then depending on which version you’re on, the port might be different, but they’re unimportant ports. And then you have to go into the LLM of your choice. You know, we are a strong Claude household. We’re a Diet Coke, Claude, Lagavulin household; we have strong allegiances. So this is what we’re doing. If you use something else, as long as the LLM that you use is MCP compatible, it’s fine. And then you just start asking questions about what’s been going on with your servers. And you can ask it a lot of just normal conversational questions. Like, instead of someone saying, oh, the server was slow yesterday, and you’re like, ah, you want me to go look at yesterday? Come on, give me a break. Yesterday? You’re telling me today? If you were to go do this on your own, it might take you five or more minutes, even if you’re lucky enough to have a free open source monitoring tool.
It might take you five or more minutes to go through all the charts and graphs, dig into stuff, take screenshots, make notes, pull things out, and do the normal sort of outlining of things. But with Claude, you can ask it pretty quickly and get some pretty fast answers. It’s maybe not as fast as you would like it, but it’s still reasonably quick. If you want to check this out, go to code.erikdarling.com.
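The “tightly bound to its tools” idea mentioned above can be sketched as a dispatcher that only exposes a fixed allowlist of read-only tools, so a model can never ask it to run arbitrary SQL. This is an illustrative sketch, not the tool’s actual code, and the tool names and return values here are invented.

```python
# Illustrative sketch only: an MCP-style dispatcher with a fixed
# allowlist of read-only tools. Tool names and values are hypothetical.
READ_ONLY_TOOLS = {
    "get_wait_stats": lambda: "top waits since collection started",
    "get_deadlock_count": lambda: 2009,
}

def call_tool(name):
    """Run a registered read-only tool; refuse anything else."""
    if name not in READ_ONLY_TOOLS:
        raise PermissionError(f"tool {name!r} is not exposed to the model")
    return READ_ONLY_TOOLS[name]()
```

Because the registry is closed, a request for anything outside it (a drop, an update, an ad hoc DMV query) fails before it ever touches the database.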

It’s the performance monitor repo. And again, totally free, totally open source. If your company does want to get support or some contract guarantees with that, I have things available at training.erikdarling.com to do that. So let’s look real quick at the settings that I’m talking about. I already have mine turned on. So this is the MCP server. If you go in here and you’re using Claude, you can copy the setup command right from here. It gives you the little one-line command to add the MCP tool. If you use a different LLM and you want to include the correct command to do it with whatever you use, you can submit it to the documentation. I just don’t use anything else. So that’s how you set that up. And I’ve pre-baked a little bit of a conversation with our dear friend Claude via the MCP server, because it’s not really a lot of fun sitting around watching an LLM spin its gears with stuff. You get bored. You look at something else for a little while, right? Oh, I don’t know. Gamble online or something. So I just started off like, hey Claude, can you see my MCP server? It was like, yeah, I see that. Look at that. It’s connected and working. You have seven monitored servers, but only SQL 2022 is available. That’s fine. All right. And then I asked Todd… Claude, not Todd. I don’t know who Todd is. Todd is not cool. Can you list all the tools? Tell me what you have. And so we have a list here of all of the available tools that it has to look, again, just at your performance data, not at anything else. Just local to that. So don’t think that I’m sending queries off to other databases and collecting things.
And, you know, I don’t know, Ashley Madison you or something. But these are all the tools that we have out there. And this is just me mostly hitting tab, because there’s autocomplete stuff where it’s like, well, what should I do here? So I just said, yeah, run a health check on SQL 2022. And off it went. Because I’d run a HammerDB TPC-H workload on SQL 2022, you might see some fun spikes and stuff in there where the queries were running. It went through a whole bunch of the tools, and it said, well, you know, right now things are normal. I ran through all sorts of things, and your server was idle most of the day, but you did have a big spike in there for 33 minutes. 33 minutes is exactly how long my TPC-H workload runs for. And it even says, it looks like HammerDB. It knows. It’s like, I got you, buddy.

All right. It’ll tell you about wait stats. It says, wow, yeah, 2,009 deadlocks today. That’s a lot. You might want to think about that. It’ll give you a breakdown of what memory looks like, file IO, running jobs. So, you know, we’re all good there. And then it’s like, there are some things that you might want to think about, right? Like 2,009 deadlocks, a 33 minute spike, but you know, whatever. And I was like, hey, yeah, you know, Claude… well, where was it? There was another prompt.

What was the longest running query in the HammerDB workload? And it comes back with some information in here. It was a TPC-H workload, and it tells you this was the longest running query. It gives you information about all the stuff that we have collected about the query: all this stuff that you would otherwise have to go and look at and wait for a tooltip for. It’s all a nightmare, right? And then you can say, okay, well, that’s cool. I’ll make a note of that. I’ll go work on query 21 later. And then we can say, well, this is the next one: dig into those 2,000 deadlocks. And it did. I had a little boo-boo there with some JSON, but that’s okay. And then it tells you what happened. All 2,009 deadlocks are from the TPC-C workload. And it tells you all the stuff that was going on there, and the two queries that were involved in the deadlock: new order and delivery. Huh? See, we like New Order too. It’s a good band, right? So new order takes some X locks on these, and delivery takes some X locks on these, and they acquire page level locks in the opposite order. And so we just get all this information.
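That “opposite order” pattern Claude called out is the classic deadlock recipe, and it is easy to sketch: given the order in which two sessions acquire locks on shared resources, check whether any pair is acquired in reversed order. The page names below are hypothetical; this is an illustration, not the tool’s detection logic.

```python
# Illustrative sketch only: detect whether two sessions acquire any
# pair of shared lock resources in opposite orders (a deadlock recipe).
def opposite_order(acquisitions_a, acquisitions_b):
    """True if any two shared resources are locked in reversed order."""
    pos_a = {res: i for i, res in enumerate(acquisitions_a)}
    pos_b = {res: i for i, res in enumerate(acquisitions_b)}
    shared = [r for r in pos_a if r in pos_b]
    return any(
        (pos_a[x] < pos_a[y]) != (pos_b[x] < pos_b[y])
        for i, x in enumerate(shared)
        for y in shared[i + 1:]
    )

new_order = ["page_1", "page_2"]  # hypothetical lock acquisition order
delivery = ["page_2", "page_1"]   # same pages, reversed
deadlock_prone = opposite_order(new_order, delivery)
```

If both procedures locked the pages in the same order, one would simply wait for the other instead of deadlocking, which is why "acquire in a consistent order" is the standard fix.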

We can say, cool, I can take that information and do something actionable with it. So this is just a basic example of what you can do with the MCP servers that are built into my monitoring tools, so that you can do performance investigations without breaking a sweat: running a bunch of scripts, gathering a bunch of query plans and query texts, looking at 50 million different metrics, trying to corroborate a whole bunch of stuff together. And again, if you’re uncomfortable doing that sort of thing, or even if you’re uncomfortable just looking at charts and graphs and figuring out when things were crappy, this is a very, very easy way to just talk to your data and get some answers about it. Anyway, thank you for watching. I hope you enjoyed yourselves. I hope you learned something, and I hope that you will try out my wonderful, free, open source SQL Server monitoring tool.

Again, that is at code.erikdarling.com. So I would encourage you to head over there and check that out. All right. Thank you for watching.


Free SQL Server Performance Monitoring: Which Version Is Right For You?



Summary

In this video, I delve into my new, completely free, open-source SQL Server monitoring tool, discussing which edition, full or Lite, you might find most useful based on your specific needs. The full edition creates a database and agent jobs to continuously collect data, offering complete control over the collection schedule and allowing for easy customization of stored procedures. It’s ideal for scenarios where you need constant data collection and want the flexibility to manage the monitoring tool as needed. On the other hand, the Lite edition embeds DuckDB inside itself and collects data only while it’s open, making it perfect for quick triage or situations with limited server access, though it won’t collect data while you’re away from your computer. Both editions offer alerts and notifications for critical issues like blocking and high CPU usage, providing a seamless experience regardless of which edition you choose.

Chapters

  • *00:00:00* – Introduction
  • *00:06:28* – Features and Capabilities
  • *00:10:37* – Setup and Installation Process
  • *00:13:59* – Platform Support and Configuration ← Thanks, AI

Full Transcript

Erik Darling here with Darling Data, and we’re going to continue talking a little bit more about my new, completely free, open-source SQL Server monitoring tool. We’re going to spend some time in this video talking about which edition of the monitoring tool you might want to use, because there are two of them. They have two somewhat different design philosophies and, I think, two somewhat different usage patterns, so it’s important to figure out which one you might find the most useful. There’s a full edition, which I believe I talked about in the last video a little bit. It creates a database on your server. It creates some agent jobs on your server, and it starts pulling data into that database on a schedule. You have complete control of the schedule. You can change how frequently it runs, you can change whether things run or not, and if you want to change one of the stored procedures, you can do that too. The code is right there; it’s not encrypted or anything, nothing’s hidden from you. Again, completely naked and vulnerable to the world. There are two ways to install it. There is a command line installation process, and there is a GUI front-end installation process, whichever one you find easier. If you need to do something programmatic to a bunch of servers, then the command line is probably for you.

It spins up, like I said, some agent jobs, three of them. One of them does data collection, one of them manages data retention, and the other one goes and looks for whether either of the other agent jobs is hung, and knocks them out so that nothing weird happens. Because it’s important to monitor the monitoring tool. But then, what you do is you open up a dashboard, and you point it to the server, and the dashboard finds the performance monitor database and starts pulling data from it. So it’s pretty easy there, and it’s pretty standard monitoring stuff. There’s a bunch of tabs, and, you know, you look through them and make a bunch of wise decisions.
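The “monitor the monitoring tool” watchdog idea can be sketched as a check for jobs that have run past a timeout, so they can be stopped. This is an illustration only; the job names and the ten-minute timeout here are invented, not the tool’s actual values.

```python
# Illustrative sketch only: flag jobs running longer than a timeout.
# Job names and the timeout are hypothetical.
from datetime import datetime, timedelta

def hung_jobs(running_jobs, now, timeout=timedelta(minutes=10)):
    """Return names of jobs that have been running longer than `timeout`."""
    return [name for name, started in running_jobs.items()
            if now - started > timeout]

now = datetime(2025, 6, 30, 12, 0)
jobs = {
    "data collection": datetime(2025, 6, 30, 11, 58),  # 2 minutes: fine
    "data retention": datetime(2025, 6, 30, 11, 0),    # 60 minutes: hung
}
stuck = hung_jobs(jobs, now)
```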

The Lite edition is a little bit different. The Lite edition does not create a database on your server. It uses an embedded DuckDB database inside of itself and starts pulling data in. This is where things start to differentiate a little bit. The full edition is going to be constantly pulling in data via the agent jobs. The Lite edition only pulls in data when it’s open. So if you close the Lite edition, if you, you know, go on vacation, shut your computer down… just kidding, no one does either of those things.

It will not be collecting data while you’re away. So, depending on what sort of access you have, you might just want to stick the Lite edition on a jump box or something and have it constantly pulling stuff in, and that way anyone can go in and look at the dashboard when they want. I mean, you could do that with the full edition, too. But, like I talked about in yesterday’s video, when I was first designing the full edition of this, I did sort of picture scenarios where someone would want to have the monitoring database, and they might want to just back it up, zip it up, and send it to someone for analysis without having to give them full access to the server and the whole other thing, right?

It’s a fairly easy thing to do, and if you’re collecting data for a week or a few days or something, there’s not going to be a huge database backup to manage and install. It’s not like a multi-terabyte database or anything. We do store a reasonable amount of data in there. By default, it’s up to 30 days, but you can manage that as well. If you want more data or less data, you can change the retention policies.
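A retention sweep like the one described above boils down to dropping collected rows older than a configurable window. This is an illustrative sketch, not the tool’s retention job; only the 30-day default comes from the text, and the row shape is invented.

```python
# Illustrative sketch only: keep rows inside a configurable retention
# window (30 days is the default mentioned above; row shape is made up).
from datetime import datetime, timedelta

def prune(rows, retention_days=30, now=None):
    """Keep only rows whose `collected_at` is inside the retention window."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=retention_days)
    return [r for r in rows if r["collected_at"] >= cutoff]

now = datetime(2025, 6, 30)
rows = [
    {"collected_at": datetime(2025, 6, 29)},  # one day old: kept
    {"collected_at": datetime(2025, 5, 1)},   # two months old: pruned
]
kept = prune(rows, retention_days=30, now=now)
```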

Right. So, which edition fits your needs is going to depend on how you use things, what your level of access is, and probably what your job title is. With the full edition, if you want a guarantee that it’s always going to be up there collecting data, then you probably want the agent jobs running and collecting stuff. With the Lite edition, if anything happens to it, if you lose network connectivity, you shut your computer down, Windows Update kicks in and ruins your life, you’re going to have gaps in collection.

If you want to do a really quick triage, the Lite edition is probably better for you, because you can just open it, point it at a server, and it’ll start immediately pulling in data. It does make an attempt to pull in a bit of historical stuff to populate some of the dashboards, but not everything in SQL Server plays nicely with that.

Wait stats, perfmon counters, and other things like that are aggregated since the server started up. Also, if you need to support Azure SQL Database, then the Lite edition is the only one that does that, because Azure SQL DB limits you to your one database, so we couldn’t create a monitoring database and start pulling in data.
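Because those counters are cumulative since startup, a monitoring tool has to diff two samples to see what happened between them. Here is a minimal sketch of that delta logic; the wait types are real SQL Server wait names, but the numbers are invented for illustration.

```python
# Illustrative sketch only: wait stats are cumulative since the server
# last started, so subtract two samples to see activity in between.
def wait_deltas(earlier, later):
    """Per-wait-type difference between two cumulative samples, in ms."""
    return {wait: later[wait] - earlier.get(wait, 0) for wait in later}

sample_1 = {"SOS_SCHEDULER_YIELD": 10_000, "PAGELATCH_EX": 2_000}
sample_2 = {"SOS_SCHEDULER_YIELD": 10_500, "PAGELATCH_EX": 9_000}
deltas = wait_deltas(sample_1, sample_2)
# PAGELATCH_EX grew far more than SOS_SCHEDULER_YIELD in the interval,
# which is the signal a cumulative-since-startup view would hide.
```

This is also why a 2,000-hour uptime makes raw wait stats nearly useless for triage: the bad hour is buried in the total.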

So, depending on your level of access to the server and what you’re allowed to create and do there, the full edition might just not be possible for you. If you’re not allowed to create agent jobs without compliance and all the other change management stuff getting involved, the Lite edition’s a little bit easier to sneak in and say, oh, I don’t know, I’m just checking out some stuff here, let’s run an sp_WhoIsActive.

For the consultants out there who might be interested in using this, I think the full edition is useful for a lot of reasons. Again, it’s always collecting data. Someone can take a backup and send you the backup.

You can run analysis on whatever data is in there without someone having to give you full access to the server, schedule a call, whatever it is. The Lite edition is good if you’re, say, getting started with a client. Let’s say you’re me, and you sign a client, and then you’re like, well, we can have our first call in three or four days or something. Why don’t you get this running and collecting some data, so we have good stuff to look at when we kick things off.

Then I don’t have to sit there and run a bunch of scripts and collect stuff and dig through things and be like, oh, well, these wait stats… oh, the server’s been up for 2,000 hours. I don’t know which of these wait stats is relevant anymore. That really bad one could have happened 1,900 hours ago and never again, right?

Like, it's just a nice way to get some preamble data before you start talking to someone. We do also provide some alerts and notifications for stuff like blocking, deadlocks, high CPU, and connection changes. Connection changes, meaning, like, the server being unreachable. Important stuff, kind of.

Those do generate notifications, and the notification thresholds are all configurable as well. Like everything in here, I try to make it so that you can customize it to make sense to you as much as I can.

The only thing that would not be easy there would be managing two separate presentation modes. Right now, the dashboard is all dark mode. Light mode hurt my eyes while I was making it.

There were a lot of late nights on this. So, it will fire off notifications, and there are a few different ways to get them. You get system tray pop-ups when stuff happens.

And you can set up SMTP to send you emails. I tested that with a weird, like, dev SMTP thing. And it worked wonderfully.

If you hit any problems, just open an issue on GitHub and I'll get to it as soon as I can. And then the other neat thing here is that you can either acknowledge or silence notifications. So if you're looking at the server and you're like, well, there's 70 deadlocks.

Okay, I saw those. You can just click acknowledge on them. If there's a specific notification you don't care about, you can just silence it. You can silence notifications for an entire server.

Like, whatever works for you. Both editions also include MCP server analysis. Now, something that seems to be confusing to people: in my setup stuff, I use Claude, and so the setup instructions I have for adding an MCP server are geared towards that.

You can use it with any LLM, whatever you want that supports MCP servers. So it's not limited to Claude.

You can use whatever you want with it. It really is just that I use Claude specifically, so that's what I geared stuff towards, right? There's nothing about the MCP server, or the descriptions, or the tools in here that is Claude-only.

And the cool thing here, as far as security goes, is that it just binds to localhost. So you're not sending your data all over the place. You're asking specific questions about just the data that has been collected by the monitoring tool.

Like, it can see database and table names and stuff because that’ll be in various collection things and, like, see server names and stuff because that’s what you’re connecting to. But, like, it’s not going out to your user databases. It does not have permissions or privileges beyond the collected data set.

So nothing outside of that gets touched, looked at. It’s just collected performance data. And that’s nice, too, because it really helps to narrow the focus of what goes on in there. It all works off tools that are specific to the tables and schema that I have and some views that I have set up in there.

So it's not like it has to go and crunch numbers itself; everything is pretty well laid out. You can ask questions of your collected performance data and make analysis very easy, whether you're not the type of person who does a lot of performance monitoring regularly, or even if you are and you just want quick answers without looking at a bunch of dashboards and everything else.

Like, if you want to get a story to tell before you start looking through things, it's very easy to do that, right? So that's probably about good here. Oh, I guess there's a little bit of platform support stuff to talk about.

All this stuff is somewhat configurable. It's a little annoying that on RDS, if you want to use the blocked process report, that's outside of my control. You need to use a parameter group for that.

Azure SQL DB is fixed at 20 seconds. But for on-prem and Azure Managed Instance, I will auto-configure the blocked process report stuff for you: run the sp_configure stuff, create the extended events to read the blocked process and deadlock reports. So I do as much setup as I can, but that's one thing that kind of sticks out.
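For reference, the server-side pieces being auto-configured are roughly these. This is a hedged sketch, not the tool's exact commands; the session name is made up, and the five-second threshold is just an example value.

```sql
-- Emit a blocked process report after 5 seconds of blocking.
EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'blocked process threshold (s)', 5;
RECONFIGURE;

-- Capture blocked process and deadlock reports with extended events.
CREATE EVENT SESSION blocking_and_deadlocks ON SERVER
    ADD EVENT sqlserver.blocked_process_report,
    ADD EVENT sqlserver.xml_deadlock_report
    ADD TARGET package0.ring_buffer
WITH (STARTUP_STATE = ON);

ALTER EVENT SESSION blocking_and_deadlocks ON SERVER STATE = START;
```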

So, the default trace stuff: if you have it turned on, it'll read from it. If you don't, it'll just say it's not turned on. There's also a specific trace that I use.

One thing that I really liked about SQL Sentry, back when it was good, was that it used a trace to find stuff. You can get some good performance metrics pulled out of that. And so, you know, I use that.

So, I have something similar set up in mine to give you a good experience in that realm. But, you know, pretty simple stuff there. Really, the big thing is that Azure SQL database is only supported by the Lite version, again, because of the database creation thing.

So, if you want to check that out, go to code.erikdarling.com, and you can download it for free. If you want to contribute to the project, it is open source. If you want to support it as an open source project, that is available to you.

And if your company requires some sort of support contract or you need to have some other vendor validation stuff before you run software in your environment, the purchase terms for that are at training.erikdarling.com under the monitoring header. So, just head over there if you require additional stuff before you get going with an open source project. Anyway, thank you for watching.

Hope you learned something. Hope you enjoyed yourselves. I hope you try out this free open source SQL Server Monitoring Tool, and I will see you in tomorrow’s video where we will dig a little bit deeper into some of the inner workings of it. All right.

Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Free SQL Server Performance Monitoring: Introduction



Summary

In this video, I introduce my new free performance monitoring tool for SQL Server, which was developed after a bit of boredom during the holidays and a desire to create something simple yet powerful. The main issue I’ve noticed is that many existing tools are either overly complex or too expensive, making them less accessible to smaller teams or individuals. My goal with this tool was to provide a straightforward solution that doesn’t require setting up VMs, dealing with firewalls, or managing domain accounts—essentially streamlining the process for anyone looking to monitor SQL Server performance without the hassle of a full-blown enterprise setup. The tool supports various editions of SQL Server and even Azure SQL DB, offering both a full version with database logging and an easier-to-use Lite version that’s self-contained, with an embedded DuckDB database. I also highlight how it integrates large language models for those who prefer not to dive deep into dashboards, providing quick answers through natural language queries.

Chapters

  • *00:00:00* – Introduction and Announcement
  • *00:09:38* – Alert History Tab
  • *00:10:41* – Dashboard Overview
  • *00:11:06* – Query Execution Metrics
  • *00:12:11* – Performance Metrics and Graphs
  • *00:13:13* – How to Check Out the Tool
  • *00:13:41* – Next Steps and Conclusion

Full Transcript

Erik Darling here with Darling Data, and in today's video we are going to talk a little bit about my new free performance monitoring tool for SQL Server. It's something that I started cooking up a little after Thanksgiving, because it was the holidays and I was bored, and I wanted to take advantage of supplementing what I can do with what our new robot friends can do. For a long time I had this idea in my head, but I was not able to talk anyone who knows about C# or WPF or any of the other stuff into working on something totally for free. The good news is, Claude will do lots of stuff for about 200 bucks a month. So that's what I put Claude to work on. Now, I have a bunch of videos, and I'm going to go over a lot of the features, usability stuff, what it monitors, what it grabs, how it works, stuff like that. But in today's video, I just want to give this a quick intro overview and tell you why I decided to do all this stuff. So the problem that I saw a lot working with clients is that either you had enterprise monitoring tools that were very expensive, or you had a lot of free or lower priced monitoring tools that still had all the complexity of use and installation that the enterprise tools had. And a lot of people just don't want to set up a VM with a SQL Server or whatever on it and start setting up domain accounts and dealing with firewalls and getting everything connected and pulling data over. And so I just wanted something that was pretty simple. Now, granted, what I started out with was sort of a full-fledged thing, right? Not an enterprise install thing, but, you know: create a database, start logging data to it, have the dashboard pull from that.
And then sort of midway through that, I was like, wait a minute, I want to have something a little bit simpler for other people. But we're going to talk about all that stuff.

So the thing with my monitoring solution is it's totally free. I do offer support tiers: not paid versions, just paid support contracts, if your company needs that for compliance reasons or whatever. That's all available via training.erikdarling.com. You can check that out over there. It supports SQL Server 2016 to 2025, Azure SQL Managed Instance, Azure SQL DB, and AWS RDS. Only the Lite version supports Azure SQL DB, though, because it doesn't have a create-database dependency alongside it. I brought a lot of the stuff that I care about, that I monitor, that I look at, into this monitoring tool, which is something I felt a lot of other monitoring tools were kind of missing. Most monitoring tool shops do not have performance people who are doing that stuff every day working on those products.

And you miss a lot of the important stuff that people who do performance tuning work would care about and want to look at. So that's kind of why I wanted to do this. You have two editions. The full edition creates a database on a server and some agent jobs, which then start logging to that database, which is, I know, not ideal for everyone. It serves a very specific purpose in my world, in my life: if someone I'm working with is having performance problems and they can't give me remote access to the full server whenever I want it, we have this thing running over there, and they don't have to give me access to the full server. They can take a backup of the performance database and share it with me, and I can look through the data in there. So that gives me something helpful, which is kind of why I started with that one. But also, having that dashboard available means I can turn the dashboard on on a jump box somewhere, connect to just that database, and go in and review stuff in the dashboard too. So for people where there's some additional layer between what I can do remotely and what they're allowed to give me access to, that's really helpful.

Then, like I said, midway through doing that one, I was like, well, I'd also like to support Azure SQL database, and maybe make something a little bit simpler that doesn't create a database dependency. And I have this absolute love for the DuckDB database. So I was just like, hey, let's build off what we've already done for the full one and create a self-contained monitoring tool. Right. We use DuckDB because it's real fast. Before, you might have ended up using SQLite if you wanted an embedded database; DuckDB is insanely fast for the inserts that we need to do from the data collection on the SQL Server.

And also for the queries that run internally. It's made for reporting queries, made for analytics, so it just made total sense to use that instead. There's also kind of a neat thing in there: both of these monitoring tools have MCP servers built into them. You can enable it; it's disabled by default, right? I'm not forcing AI or LLMs on anyone. But if you're not the type of person who is good at looking through dashboards, digging through data, and identifying problems and stuff, there are MCP servers, and the MCP servers let you hook in the LLM of your choice.

I use Claude. I have a strong Claude preference. You can use whatever LLM you want to hook into those MCP servers and ask questions about what data they've gathered. So you don't have to give them access to your entire SQL Server and all your databases and all your schema and all the crazy stuff that can live out there. The MCP server for the full edition will only talk to data in the performance monitor database, right?

It has very specifically crafted read tools to only touch that data. You can’t run ad hoc queries with it. And it’s all set up to work with the data model in the performance monitor database. The light version is even further detached because that is only talking to the DuckDB database where all the data gets pulled into. So you can use an LLM and you can just say, hey, someone was complaining yesterday at noon about something. Tell me what was going on.

And it will go through all the data and be like, hey, we found this blocking problem, this query was running for a long time, all the other stuff. You can also get alerts from both monitoring tools. They will show up as system tray alerts, and you can also set up an SMTP server, it's built in, so you can send yourself emails with the alerts that come up in there.

Right now it's CPU, blocking, deadlocks, a few other things, stuff that you would maybe want a pretty quick notification about if it was going on. So like I said, we pretty much support all platforms with both of them. But Lite is the only one that supports Azure SQL database, because the full version does create a database, start logging data to it, and use agent jobs.

And the story for that with Azure SQL DB is impossible to tell, right? Because I don't want to start creating tables in the one database you're allowed and go through elastic jobs and the rest of that nightmare. So there's a Lite version that just sucks data out.

The permissions required for the install and creating the agent jobs are a little higher-privilege than I wanted. But in general, all you need is view server performance state to pull out the data and start logging it after that. So to install, you need a little bit more, but for the general data collection, you don't need much.
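The grants in question look something like this. The login name here is a placeholder for whatever login the tool connects with, not anything the tool creates:

```sql
-- VIEW SERVER STATE covers the DMVs the collectors read.
GRANT VIEW SERVER STATE TO [monitoring_login];

-- On SQL Server 2022 and up, this narrower permission is enough
-- for performance-related DMVs.
GRANT VIEW SERVER PERFORMANCE STATE TO [monitoring_login];
```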

And then the Lite edition just needs either view server state, or view server performance state if you have that one available. If you want to check it out, go to code.erikdarling.com. It's the performance monitor repo on there. It's doing pretty well so far.

I have over a thousand downloads, a bunch of stars and forks, and a pretty active community of people submitting issues and stuff getting worked on and fixed. So very, very excited about this. I don't want to just talk about it, though. I do want to show you a little bit of it.

So when you first open it up (this is the full version; the Lite version is still being worked on at the moment, and I just want to use this one), you get an overview of all of the servers you've connected to it. You see, I have a bunch of servers offline here because I don't have all my VMs up and running, which is totally fine.

I don't need them. But then I have SQL Server 2022 online and running. You also have an alert history tab with all the alerts it's sent: you can see what time they were, what the alert was, all that stuff. It's a good way to track when you had problems.

I'm going to show you the MCP stuff in a later video. But if you go back to the overview and you double click on any one of your servers, you get the dashboards. Again, people love dashboards, and I like dashboards because they give me a visualization of the data. Right.

And I don't want to pretend that this thing does dashboards differently or better than most other monitoring tools. But if you want to get a picture of what the metrics on your server over time look like, these things paint real pretty pictures. A lot of people like having screenshots and stuff, like, hey, this is what was going on, and they can point to things.

It’s like it’s a nice sort of visual helper. There are a whole lot of tabs in here. I’m just going to do like a high level of these at the moment.

But, you know, you get like a server overview. It’ll talk through critical issues, server configuration changes, database configuration changes, if anyone’s adding or removing trace flags, stuff like that. You also get some insight into if your collectors are running well, if any of them are having problems, stuff like that.

Query execution stuff as well. That's the stuff that I usually spend the most time with. The full version basically just runs sp_WhoIsActive to look at active queries.

The Lite version runs a slimmed down version, just a query that pulls them. But there's a lot of tool familiarity in here, right? There's a lot of stuff in here that you would be used to seeing.

Basically, if you ran a script in SSMS, you would see very similar results. So if you have some familiarity with these things already, you're in a pretty good place, because you already know kind of what the results look like.
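To give a feel for the shape of that slimmed-down active-query pull (this is not the tool's actual query, just a minimal sketch against the usual DMVs):

```sql
-- Active requests, their waits, blockers, and query text.
SELECT
    der.session_id,
    der.status,
    der.wait_type,
    der.blocking_session_id,
    der.cpu_time,
    der.total_elapsed_time,
    dest.text AS query_text
FROM sys.dm_exec_requests AS der
CROSS APPLY sys.dm_exec_sql_text(der.sql_handle) AS dest
WHERE der.session_id <> @@SPID;
```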

So we pull from query stats, procedure stats. We pull from query store. I don't know how many paid monitoring tools out there are actually touching query store. But damn it.

I wrote sp_QuickieStore. We're hitting query store, right? Not messing around with it. We go over resource metrics; you'll see all sorts of nice stuff about how your server looked over time. We graph out wait stats for you individually, right?

It's not just a conglomeration of wait stat colors and blah blahs and whatnot. You can look at wait stats by total wait time or by average wait times. You can see if you had any big spiky waits: waits that took a long time individually versus waits that accumulated a lot of time in a different way.
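Total versus average is the same DMV sliced two ways. A minimal sketch of the distinction being graphed:

```sql
SELECT TOP (10)
    ws.wait_type,
    ws.wait_time_ms,
    ws.waiting_tasks_count,
    ws.wait_time_ms / NULLIF(ws.waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats AS ws
WHERE ws.waiting_tasks_count > 0
ORDER BY ws.wait_time_ms DESC; -- total time: chatty, accumulating waits
-- Swap to ORDER BY avg_wait_ms DESC for the big, spiky, individually
-- long waits that a total-time ordering can hide.
```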

There's all sorts of stuff about tempdb, file I/O, perfmon. You name it, it's in there.

I break down a lot of stuff about memory usage. You see memory grants; that's not a very exciting graph at this point. But anyway, if there's something about SQL Server performance that you care about, it is in there. Right.

I guess I don't have any blocking going on, or at least it didn't show up quickly enough for me to care about. But I break out lock waits over time specifically. Under blocking, you can see how many blocking events and deadlock events there were, and durations for blocking and deadlocks. And then over in the system events tab, this is all data that would come from sp_HealthParser.

A lot of these are going to be empty for most of you. And that’s totally fine because the more of these are empty, the better shape your server is in. Like you don’t want to see anything under corruption events.

You don't want to see stuff under contention events. We have some severe errors in here, where I was killing queries just to have one of these graphs show stuff. But there are things that will pop up in here.

There are things that are useful in here, but this is all from the system health extended events. So, again, if you don't see anything in here, that means something good for you, right? It means your server is not doubling over and crying.

So anyway, just a quick introduction, sort of like an announcement video to talk about things a little bit. If you want to check this out, again, it’s code.erikdarling.com. It’s the performance monitor repo.

There's a big readme file in there with lots of instructions and stuff. And if you have any questions or comments, or you run into any problems with it, open a GitHub issue. That's where I'm doing all the support and fixing and stuff.

So check that out. And tomorrow’s video, I’m going to go over a few more things in deeper detail about how the tool works, how to use it, things like that. So anyway, thanks for watching.

I hope you enjoyed yourselves. Please check out this nice free monitoring tool that I’m giving everyone. And that’s about it. All right. See you tomorrow.


SQL Server Performance Office Hours Episode 53



Summary

In this video, I dive into answering five community-submitted questions about a wide range of topics related to SQL Server and more—ranging from optimizing long-running queries to migrating between different versions of SQL Server. I also share some exciting updates on my free monitoring tools, which are now available in two versions for various needs. It’s been a chilly February, but I’m looking forward to the warmer days ahead when I can rejoin the live events circuit at DataTune Nashville, Data Saturday Chicago, and more. If you have any questions or want to participate in this Office Hours session, head over to the video description where you’ll find all the links to ask me directly and learn more about my consulting services and training options.

Chapters

  • *00:00:00* – Introduction and Overview
  • *00:04:27* – Monitoring Tool Explanation
  • *00:13:56* – Parameter Sniffing and Plan Stability
  • *00:28:49* – Concurrency Safe Index Maintenance Risks <– Good job, AI
  • *00:38:59* – Conclusion and Next Steps <– Good job, AI

Full Transcript

Erik Darling here with Darling Data. I'm in the mood for, well, I guess it's Monday, so I'm in the mood for Office Hours. This is where I answer five, as long as my fingers work, five community-submitted questions about, well, you know, SQL Server, love, life, fitness, dietary restrictions, I don't know, tramp stamps recently, so a pretty good spread of stuff here. So just send them on in. If you want to ask me a question, if you want to partake in this circus, you can find that link down in the video description. You'll find all sorts of wonderful, helpful links there, where you can hire me for consulting, buy my training, become a supporting member of this channel, and, like I said, ask me Office Hours questions. And, of course, the magic: like, subscribe, tell a friend, we're all happy. And in case you didn't know, I released a free monitoring tool. Actually, technically, I released two of them. There are two versions of it, depending on what your needs are. But they're both totally free. They are open source. You don't need to sign up via email, and they don't phone anything home to me. It's just a bunch of things that run and collect performance data from SQL Server, things that you would care about as part of a performance investigation. We got a NOC-style dashboard, so you can see which servers are up and running, and if any of them have any crazy stuff going on. And if you are AI ready, like I am, right, because you've been getting AI ready with Erik quite a bit lately, then you'll be happy to know that there is a built-in MCP server, so you can actually talk directly to your performance data. You don't have to give this access to, like, all of your SQL Server data, right? Like, databases and all the system DMVs and stuff.
It just talks to the data that it has stored, which is a nice way to have a very defined, well-defined set of data for the, to ask questions about.

So you can say things like: a user complained about something being slow yesterday around noon, what was going on? And it can use all the built-in read-only tools (no going outside the boundaries, no running weird queries by accident) to tell you what was going on at noon yesterday. So, very excited about this. It's available at my GitHub repo. There's a long link there, but anyway, very exciting stuff. And, of course, the reason why these things can get built is because it is cold and awful outside, and I don't want to go anywhere as soon as I walk outside and can see my breath in the air.

I just want to hide in a couch cushion somewhere and never think about outside again. But eventually, outside will become nice and tolerable again, and I will rejoin the living. I will sprout, I will blossom from the ground, and I will be at DataTune Nashville, March 6th and 7th. Data Saturday Chicago, March 13th and 14th. SQLDay Poland, May 11th to 13th. And Data Saturday Croatia, June 12th and 13th.

And that's going to be my world tour so far for the year. I'll be at all those places, and you should be at those places too, because I will be teaching pre-cons at all of them on Advanced T-SQL. It would be just such a joy, a treat, and a pleasure to see you there. I will sign your copy of the monitoring tool, autograph it for you. It'll be fun, right? Be a good time. Anyway, it's still February here. I still hate life a lot. So let's answer some questions, right? Because that's what keeps us warm.

And, by the way, this is the monitoring tool doing stuff, right? Look at that wait stats graph. I don't have anything running on here now. I was running some HammerDB tests on this thing earlier, right? And we got some nice graphs, and we got a whole bunch of stuff about what was going on on the server. We got some CPU, we got some memory, we got all this fun stuff happening, right? Even got some blocking. Look at that. Oh, well, then that's not that interesting, right?

Yeah, that one's a little lame. But we have all sorts of good stuff in here. So you should go get this for free and start monitoring your SQL Server for free, and stop wasting money on those awful enterprise monitoring tools that are way too expensive, or not having monitoring at all, right? Neither one's a very good idea. Yeah, I got all the good ideas. Anyway, here's the first question now. I have a select statement running for 30 plus minutes. Sounds like you could use some professional help. Sounds like you could use the help of, perhaps, a young and handsome consultant with reasonable rates. 30 minute plus query there, fella.

Then I manually kill it, and then it's in a rollback for five to ten minutes. What's rolling back? Well, it's a select query. You don't have to roll back CPU, and you're probably not going to be rolling back memory. So really, the most common thing that I see cause this for select queries (and I'm just going to assume that you're a nice person and it's not a select into, because that would be annoying) is if your query has done IO of some variety. That could be a spool in the query plan, it could be a spill in the query plan, and that stuff has to clean up after itself. So if you were running for 30 minutes and doing all sorts of nonsense, then probably at the end of that 30 minutes you had to do some more nonsense to revert those things.

All right. Do, do, do, do, do. Hi, Erik. Hi, you, whoever you are. People never tell me their names, do they? I'm just out here in the wind, fully exposed. I had a linked server query running for weeks (where am I in the world?) that ran in minutes locally. The remote view I got delivered had four scalar subqueries in the select list, each doing a nested loops join against a 5 million row table. Does that 5 million row table get pulled unfiltered for each subquery execution per row? So roughly 5 million times row count times four in data transfer. I would really have to see the query plan in order to tell you what was going on, but that sounds rather unlikely. Most of the time, SQL Server will just pull all the data across for a remote query.

And then sort of hit the stuff locally, but it really all does depend on what the execution plan looks like. So if you feel like sharing that in some form or fashion, then I can provide a better answer for you. But until then, it is impossible to tell, though the optimizer usually isn't quite that dumb. That being said, with remote queries, you get what you get, right? Or, more like, you get what you deserve. So, all right. Hello, Erik. Hello. Another one. Watched a few of your videos on SQL performance. Oh, you know everything now, then. Actually, I have one issue I'm not able to find the solution to. Oh, do you actually? We recently migrated from SQL Server 2016 to SQL Server 2022, and a query on one table is taking indefinitely long. Earlier it used to complete in 10 to 20 minutes. Jeez. Like, why? I'm right here.

I can make these queries take less than 10 to 20 minutes. But now it's running for three to four hours with no result. I tried optimizations like update stats with full scan, tried using index hints for joins, rebuilding the index. I don't know whose videos you've been watching. This is not my advice. I'm a little insulted here. But nothing is helping; it would be great if you can give some pointers to investigate the issue further. So, again, the execution plan is going to reveal most of the mysteries to you. If you still have the 2016 server around, or any data from it, or you're still able to access it in some way, you might be able to find the query plan for your query over there, and you can compare it to the current query plan that you're getting. But the most valuable thing you can do is open up SQL Server Management Studio, hit the include actual execution plan button, and start running your query.

And the reason you’re going to do this is because even though you’re saying it runs for three or four hours, there’s no way you’re going to want to sit there watching it. I mean, maybe there is, but it’d be boring, and you probably don’t want to wait for the whole thing to finish. What you can do is start it running in one SSMS tab.

And then over in another SSMS tab, you can run sp_WhoIsActive with the @get_plans parameter set to 1. You can look at the in-flight actual execution plan for your query and start to see where it’s getting jammed and glued up. So that’s where I would start.
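That second tab looks something like this (this assumes you already have Adam Machanic’s sp_WhoIsActive installed on the server):

```sql
/* Assumes sp_WhoIsActive is installed on this server.
   @get_plans = 1 returns the in-flight actual execution plan
   for each running request, so you can see where a long-running
   query is currently stuck. */
EXEC sp_WhoIsActive
    @get_plans = 1;
```

Click the XML in the query_plan column of the results to open the plan in SSMS while the query is still running.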

I would not start with all the other stuff you mentioned. You got bad advice from someone else. I don’t know who you’ve been talking to, but you did not get that from my videos. All right.

Let’s see. Oh, I did that wrong. Yeah. Should we prioritize plan stability or optimal plans when those goals conflict? What goals?

Oh, I get it. So you want to know if you should prioritize having one execution plan that works well for everybody, or if you’re in a situation where lots of different parameter values would create lots of different execution plans.

And my answer is: you don’t have to have one extreme or the other. Often you can find reasonable ways to bucket these things. That’s sort of what SQL Server’s Parameter Sensitive Plan optimization does.

It attempts to bucket things, but its bucketing is way too coarse. The bucketing that that thing does is not fine-grained enough. So what I would recommend doing is taking a look at which values you’re passing in that are sort of the root of the evil for you.

Right. And maybe you can figure out some sort of distribution counts for those values. And most likely, maybe not always, but most likely in these circumstances, you could find reasonable buckets. Just to throw some numbers out there: one to 10,000; 10,000 to 100,000; 100,000 to 500,000; 500,000 to a million.

If you have values beyond that, you can go beyond that. But it doesn’t have to be either everyone gets one plan, or everyone gets every plan they ever wanted.

You can usually bucket these things, and you can use the OPTION (OPTIMIZE FOR ...) syntax to bucket things in a way that would work for you.

You can also use various dynamic SQL tricks to do that: each bucket gets its own branch, like one gets SELECT 1, two gets SELECT 2, three gets SELECT 3. I have a video called Defeating Parameter Sniffing With Dynamic SQL that goes into all that quite a bit.
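A rough sketch of the bucketing idea might look like this. All the object names here are made up for illustration, and the bucket thresholds are the example numbers from above; the trick is that appending a different comment per bucket changes the query text, so each bucket gets its own cached plan:

```sql
/* Hypothetical sketch: route parameter values into buckets so each
   bucket compiles and caches its own plan. Table, column, and
   procedure names are invented for this example. */
CREATE OR ALTER PROCEDURE dbo.GetOrdersByCustomer
    @CustomerId integer
AS
BEGIN
    DECLARE
        @RowEstimate bigint =
        (
            SELECT c.OrderCount
            FROM dbo.CustomerOrderCounts AS c
            WHERE c.CustomerId = @CustomerId
        ),
        @sql nvarchar(MAX) = N'
        SELECT o.*
        FROM dbo.Orders AS o
        WHERE o.CustomerId = @CustomerId ';

    /* A bucket-specific comment changes the query text,
       which gives each bucket its own execution plan. */
    SELECT @sql += CASE
                       WHEN @RowEstimate <= 10000  THEN N'/* small  */'
                       WHEN @RowEstimate <= 100000 THEN N'/* medium */'
                       ELSE N'/* large */'
                   END;

    EXEC sys.sp_executesql
        @sql,
        N'@CustomerId integer',
        @CustomerId;
END;
```

The same shape works with OPTION (OPTIMIZE FOR (@CustomerId = some_representative_value)) per bucket instead of the comment trick, if you’d rather steer the optimizer explicitly.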

So, what does concurrency-safe index maintenance actually mean, and what risks remain even when following best practice? So I think you’re asking: how can I do index maintenance without blocking people?

And I’m just going to say this once: unless you really need to do something to your index, like maybe you want to add page compression to it,

or I guess if you’re deleting a lot of data and rebuilding an index once in a while would make sense, this really isn’t something that you should be doing in a way where you have to worry about concurrency all that much.

If you have a process that’s deleting a bunch of data and doing archival stuff, just do an index rebuild after that. If you want to apply page compression, you should be doing that during a maintenance window.

We should not just be running scheduled index rebuilds because logical fragmentation is a thing. It just doesn’t hold the same value for SQL Server anymore. There are, of course, index options like WAIT_AT_LOW_PRIORITY and ONLINE = ON, but you might not be in a position to use all of those things.
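For the times you do have a real reason to rebuild, those options look something like this. The index and table names are made up, and ONLINE = ON requires Enterprise Edition or Azure SQL:

```sql
/* Hypothetical sketch: rebuild an index online, waiting politely
   behind blockers for up to 5 minutes before giving up, and apply
   page compression while we're in there. Names are invented. */
ALTER INDEX ix_whatever
ON dbo.SomeTable
REBUILD WITH
(
    ONLINE = ON
    (
        WAIT_AT_LOW_PRIORITY
        (
            MAX_DURATION = 5 MINUTES,
            ABORT_AFTER_WAIT = SELF
        )
    ),
    DATA_COMPRESSION = PAGE
);
```

ABORT_AFTER_WAIT = SELF kills the rebuild rather than the blockers, which is usually the polite choice; BLOCKERS does the opposite.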

If you are, that might make things a bit more concurrency-safe, but really, I find the whole ordeal mostly offensive. Anyway, that’s probably good for me here.

Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. I’m going to do some videos more in depth about the monitoring tool stuff this week, to sort of help promote it, get it out there in front of people, and hopefully act as some instructional guides as well.

So this week we’ll be talking monitoring, boy. All right. Thank you for watching. Where’s my mouse? I got to close.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Retrieval Augmented Generation

Get AI-Ready With Erik: Retrieval Augmented Generation


Summary

In this video, I delve into Retrieval Augmented Generation (RAG) and its application in SQL Server databases, particularly focusing on how it can be used to build developer support chatbots. I explain the process of using a stored procedure to retrieve relevant posts from your database and send them as context to an LLM (Large Language Model), which then generates answers based on community knowledge. The video demonstrates this with a practical example where we build a stored procedure that includes source URLs, allowing users to verify information provided by the LLM. I also discuss the importance of embedding questions using AI_GENERATE_EMBEDDINGS and highlight how SQL Server can serve as a knowledge base for storing and retrieving relevant content efficiently.

Chapters

  • *00:00:00* – Introduction
  • *00:01:01* – Retrieval Augmented Generation Overview
  • *00:02:02* – Basic Process of RAG
  • *00:02:21* – LLM Verification with Source URLs
  • *00:03:02* – Stored Procedure for Embeddings
  • *00:04:26* – Example Stored Procedure Execution
  • *00:05:38* – Summary of RAG Workflow

Full Transcript

Erik Darling here with Darling Data. If you have hated all of the vector content that I’ve been covering over the last month or so, your long national nightmare is over. This will be the last dedicated course-promoting video that I have on it. Again, that is available on my training site. The full link is down in the video description, and there is an active coupon code called AIReady where you can get $500 off. In today’s video, we’re going to talk about Retrieval Augmented Generation, or RAG, which is probably what most of you will have to deal with in the context of a SQL Server database doing things with vectors, embeddings, stuff like that. So, let’s say, and you can even just pretend, right? You can totally cosplay as this fictional person that I’m talking about. Let’s say you’re building a developer support chatbot. And developers need to know stuff about Git, like: I screwed up, how do I undo that? Or: left join, what is that? Or maybe even something as ambiguous as: why is my query slow? Which, if you need an answer to that question, let me tell you, my rates are reasonable. I can answer that for you immediately. Bam, it’s done. But the general flow of this is: you would find some relevant posts in your database, and then you would take those posts and send them as context to an LLM, right? Because SQL Server doesn’t do this stuff internally. You need an LLM in some fashion that you can send bulks of information to, and the LLM is supposed to distill some answer based on community knowledge around a topic. And that’s the basics of RAG, or Retrieval Augmented Generation.

SQL Server handles the data storage and data retrieval. The LLM handles the generation, and you have some application in the middle that connects them, right? It says: oh, I need to send this search string to SQL Server and find relevant stuff. And then I take the relevant stuff that I found and send it to the LLM, and the LLM gets it all wrong, right? So, that’s the basic process. So, if your chatbot says, you know, you can undo a commit with git reset, users might be like: well, how do you know? Can you cite some sources? Can you back that up with anything factual? Are you sure that’s a real command?

Does git exist? There’s lots of stuff people would rightfully question and ask about when an LLM does things. So, what we’re going to do is use a stored procedure to include source URLs that will let users verify the information. Because if you’re the type of person who just takes what LLMs say at face value, you might be in for some interesting times in your life.

But, anyway, this is what the stored procedure looks like. The name is unimportant, but we’re going to pass in a query, define a max number of rows we want back, and also filter on distance, something I’ve said a few times is pretty important with vector distance stuff. And then we’re going to output a prompt. This isn’t necessary, necessarily; it’s just something for me to show, so I can show you what the full results would look like.

Inside the procedure, we’re going to make a variable that holds the embedding that we want, and we’re going to go grab that embedding using AI_GENERATE_EMBEDDINGS. I talked through how to get that set up in another video, but I’m not going to redo all that here. And then this is the query that runs to return stuff, and what it’s going to return is the usual: here’s the distance, here’s the source URL.

And we’re going to dump that into a temp table called prompt, which hopefully will happen promptly. And then, I mean, I’m doing this so it makes sense in this context; you might do something different. But we’re going to say our prompt is: based on the following Stack Overflow questions, answer the question, and then the query that the user originally passed in, right?

So that’s the question that needs to get answered. And then what we can do is run the stored procedure. We’re going to declare a prompt variable just to hold the output here.

The query that we’re sending in is how to undo the most recent commits in Git. We’re asking for the top 10 results with a max distance of 0.20. And when we run this, what the prompt returns is, I mean, I have some XML content here so I can click and open it for you.

You might choose to do something differently. But it says: based on the following Stack Overflow questions, and all these questions are things we found using the search string that someone passed in. Based on these 10 URLs, answer the question: how to undo the most recent commits in Git.
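Put together, the stored procedure described above could be sketched roughly like this. All object names, the model name, and the vector dimension count are hypothetical, and AI_GENERATE_EMBEDDINGS and VECTOR_DISTANCE follow the SQL Server 2025 preview syntax, which may change:

```sql
/* Rough sketch of a RAG prompt-building procedure.
   dbo.PostEmbeddings, MyEmbeddingModel, and vector(1536) are
   assumptions for illustration, not anything from the video. */
CREATE OR ALTER PROCEDURE dbo.BuildRagPrompt
    @Query nvarchar(4000),
    @TopN integer = 10,
    @MaxDistance float = 0.20,
    @Prompt nvarchar(MAX) OUTPUT
AS
BEGIN
    /* Embed the user's question so we can compare it to stored posts. */
    DECLARE @QueryVector vector(1536) =
        AI_GENERATE_EMBEDDINGS(@Query USE MODEL MyEmbeddingModel);

    /* Find the closest posts, keep their source URLs,
       and glue them into one prompt string for the LLM. */
    SELECT @Prompt =
        N'Based on the following Stack Overflow questions: ' +
        STRING_AGG(CONVERT(nvarchar(MAX), p.SourceUrl), N'; ')
            WITHIN GROUP (ORDER BY p.Distance) +
        N' Answer the question: ' + @Query
    FROM
    (
        SELECT TOP (@TopN)
            e.SourceUrl,
            Distance =
                VECTOR_DISTANCE('cosine', e.Embedding, @QueryVector)
        FROM dbo.PostEmbeddings AS e
        WHERE VECTOR_DISTANCE('cosine', e.Embedding, @QueryVector)
              <= @MaxDistance
        ORDER BY Distance
    ) AS p;
END;
```

In an actual application, the app layer would probably build the prompt instead, with this procedure returning just the URLs and distances.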

So that’s sort of the basics of it: you have all this stuff stored in SQL Server. You have your vectors. You have your embeddings.

And then your users go in and search for things that are similar, and you have an LLM build out a response based on what it finds in there. So it treats your database like a knowledge base, right?

You can just search through all this historical knowledge, get an answer, and hopefully it’s well-founded in reality and the LLM doesn’t hallucinate anything along the way. All things that we hope for. So the application workflow, again, is: the user asks, how do I do this thing?

How do I undo my last commit in Git? The app embeds the question. We use AI_GENERATE_EMBEDDINGS. I guess you don’t need to know it’s in Video 20.

That’s stupid. You could also call an external embedding, but I don’t make anyone do that, because it costs extra money. The app calls SQL Server to execute some stored procedure and get the results. Then the app builds the LLM prompt, right?

So in this case, I had the stored procedure do it, because I don’t have an app. I’m appless. But you could have an application do that, and then answer the question down here. The LLM generates an answer using all the context from the links that you provided it.

And then the app shows the answer with source links for verification, right? Wouldn’t it be nice if LLMs always cited sources? So, SQL Server’s job is to find relevant content quickly.

And the LLM’s job is to then synthesize a coherent answer from that content. So, that’s sort of in a nutshell how one would approach RAG with SQL Server. I hope you enjoyed yourselves.

I hope you learned something. And today is Friday, so I won’t see you in tomorrow’s video, but I will see you in Monday’s video, assuming that we all survive, which will be office hours. And then, what am I going to do after office hours?

Well, that week is wide open. Wide open. Who knows what will happen? All right. Thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Capacity Planning

Get AI-Ready With Erik: Capacity Planning


Summary

In this video, I delve into the fascinating world of capacity planning as it pertains to vector embeddings in SQL Server databases. I start by comparing the size of a typical Stack Overflow post table from 2010 to its corresponding embeddings table, revealing just how much space these embeddings can consume: eight and a half gigs compared to about 6.4 gigs for the original data. This comparison sets the stage for an exploration of vector sizes, which depend on the number of dimensions in your chosen model. I also discuss the implications of different models and their embedding sizes, touching on both theoretical maximums and practical considerations. Additionally, I explore how vector indexes work differently from traditional B-tree indexes, emphasizing the importance of columnstore indexes for optimizing space usage. Finally, I provide some practical tips for managing database space when working with large embeddings, including the use of clustered columnstore indexes to achieve significant savings.

Chapters

Full Transcript

Erik Darling here with Darling Data. Boy oh boy, exciting times are upon us. We’re going to talk about a DBA-ish subject here, but who knows, maybe someday a developer will care about this sort of thing as well, and that is capacity planning. Because vectors themselves, the embeddings, can be rather large. And you know what, I’m actually going to change this on the fly a little bit, because I want to show you just how big embeddings can get, and I don’t know why it didn’t occur to me to do this at the beginning. Posties. I’m going to compare the size of the Posts table to the size of the PostEmbeddings table. Now, this is Stack Overflow 2010, so this is not even close to the full-size database; this is like the first two or three years of Stack’s existence. The Posts table, with the title column, the tags column, the body column, the last editor, a whole bunch of string columns and stuff in there, the whole thing is about 6.4 gigs. The PostEmbeddings table, with far fewer columns in it, just having the embeddings in there, is eight and a half gigs. Like, it’s no joke, size-wise.

And one way that you can sort of figure that out is: we’re going to take that number, it’s like 8218, and try to do a rough projection here. Now, remember, PostEmbeddings is about eight and a half gigs full-size up here. If we just look at different row counts and how big they might be, that still doesn’t get us close; we’re still about a gig off for all of those different row counts, right? And part of that is going to be that the PostEmbeddings table has the embeddings column in it, which is the majority of the space, but there are a few other columns in the table.

Nothing huge. It’s like an ID and a post ID and a datetime2 column or something. So if you look at the table itself, those embeddings take up the majority of it, and every embedding looks like this: 1,024 floaty numbers jammed into a JSON string.

So, you know, not kind to storage. If you’re a storage vendor out there, God bless. God bless. I’m in the wrong line of work, probably.

But vector size is going to depend on the number of dimensions in your embeddings, and the number of dimensions in your embeddings is going to depend on the model you choose. There are all sorts of different models out there that generate embeddings with different numbers of dimensions.

The max number of dimensions that you can use for a float32 vector in SQL Server is 1,998. That doubles for float16, but float16 is still a preview feature.

Here we are back at square one. So, if we look at this and at what different models might generate as far as size goes, you know, like 768, theoretically, I don’t even know how many gigs that would be.

But they get bigger as you involve more dimensions, up until you hit 8,000 actual bytes at 1,998 dimensions for a float32. But no model that I’ve ever seen uses specifically 1,998.
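The back-of-napkin math here is simple: a float32 is 4 bytes per dimension, so 1,998 dimensions tops out just under that 8,000-byte mark. A quick throwaway query makes the scaling obvious (the dimension counts are just common examples):

```sql
/* Rough per-row vector sizes: float32 is 4 bytes per dimension.
   1,998 dimensions lands at 7,992 bytes, just under 8,000. */
SELECT
    v.dimensions,
    approx_bytes = v.dimensions * 4
FROM (VALUES (768), (1024), (1536), (1998)) AS v (dimensions);
```

Multiply approx_bytes by your row count for a first-pass storage estimate, before JSON and page overhead.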

If you’re using OpenAI, you’re going to have the 1,536 or whatever that you see in the majority of the Microsoft demos, because, gee, I wonder why they’re all using OpenAI. Anyway, what’s cool about vector indexes, though, sort of, is that it’s not like a B-tree index, right?

Because it’s not like you’re taking all the data and just putting it in order, right? You’re making a graph out of it and connecting lines within that graph. That’s why the vector index create process looks the way it does.

If you go back a few videos to where I showed you the stuff that Microsoft is running when they build a vector index: you’re not just taking the data and putting it in B-tree order. It’s not like you’re seeking into a regular B-tree and saying, oh, this is my entry point, I found this thing, I seek exactly to this primary key row.

It’s not like that at all. It builds a graph, and the graph structure itself is a lot smaller than the vector embeddings are. You can see the size of the edge table is 286 megs, which is much, much smaller than the size of the embedding data in the PostEmbeddings table.

If you want to control embedding sizes outside of just which model you choose: if you choose a model that’s 1,024 or 1,500 dimensions, whatever, you’re going to have some pretty big vector data.

Columnstore works. A non-clustered columnstore is okay, but you still have the base table, which can be a rowstore clustered index or something, and that’s not going to be very well compressed.

So, clustered columnstore: you can create that on the table that has the embeddings in it, and columnstore will compress it pretty well. Row and page compression don’t do anything for vectors; that’s not the type of data that you can actually compress. It’s the same reason you can’t compress JSON or XML: if you just have an nvarchar(max) or something, that doesn’t get compressed.

The only thing is, you can’t intermix columnstore and vector indexes. If you have a table with a clustered columnstore index on it, or I think even a non-clustered columnstore too, you can’t create a vector index on it.

And if you have a vector index on a table, you can’t create a columnstore index on it. They just don’t interoperate. That’s not super important today, since vector indexes are still preview-only. The only generally available stuff is float32 and VECTOR_DISTANCE, and a few of the others, like AI_GENERATE_CHUNKS and things like that.

So you may as well, at this point, just get the space savings, and maybe some batch mode in your query plans, by creating a clustered columnstore index on it.
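In practice that’s a one-liner, give or take getting the existing clustered index out of the way first. Names here are made up, and this assumes the existing clustered index isn’t backing a primary key constraint (if it is, you’d drop the constraint instead):

```sql
/* Hypothetical sketch: swap a rowstore clustered index for a
   clustered columnstore to compress an embeddings table.
   Remember: columnstore and vector indexes can't coexist on
   the same table. */
DROP INDEX cx_post_embeddings
ON dbo.PostEmbeddings;

CREATE CLUSTERED COLUMNSTORE INDEX ccx_post_embeddings
ON dbo.PostEmbeddings;
```

Compare sp_spaceused before and after to see what the compression actually buys you on your data.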

Anyway, just a little bit of stuff about the capacity planning there, a couple hints on how you can maybe save some database space by using clustered columnstore.

But anyway, that’s it there. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something. And I hope that you will buy this course, Get AI Ready with Erik, available today with a coupon code AI Ready, which takes 100 entire dollars off the cost of the course.

It makes it well worth your time and money if your company is planning on doing anything with vectors in SQL Server. All right.

Well, that’s probably good there. Thank you for watching. Goodbye.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Get AI-Ready With Erik: Bad Vector Observability

Get AI-Ready With Erik: Bad Vector Observability


Summary

In this video, I delve into the world of bad vectors and embeddings, illustrating what they look like through real-life examples and practical scenarios. Drawing from my experience attempting to transcribe, summarize, and chapterize a large YouTube catalog using local large language models (LLMs), I highlight how imperfect these models can be, leading to issues such as overly long chapter timestamps that don’t match the actual video length. This example underscores the need for robust validation mechanisms to ensure embeddings are of high quality before they’re stored or used in any pipeline. I also explore techniques like using the dot product distance metric and the vector distance function to identify problematic vectors, emphasizing the importance of catching these issues early on to maintain the integrity of the data processing workflow.

Chapters

Full Transcript

Erik Darling here with Darling Data. And what I want to show you in this video is a little bit about what bad vectors, or bad embeddings, look like. This can happen for a lot of reasons, right? Something weird can happen while you’re doing the embedding. You can even have weird truncated text that does it. But let me give you an example from my real life. I forget when I talked about it, but one of the things that I said I was trying to do was take my whole YouTube catalog and have it transcribed, summarized, and chaptered using local LLMs. So I have this pipeline set up to download the YouTube video, use one local LLM to make the transcription, and then another LLM to look at the transcription and generate the summary and chapters. Now, what was really interesting, a funny thing that happened that I didn’t catch until there was a fair amount of scrutiny going on: one LLM generates the transcript, and then another LLM looks at the transcript and summarizes it. And the summaries were generally okay. They were a little repetitive, like “in this video I delve into” or “in this video I dive into,” whatever, it doesn’t matter.

But what was really interesting was the chapters. The reason the chapters were interesting is because the LLM that looked at the transcript, until I put it in there, had no idea how long the video was. So I would have a 10, 12, 15-minute video, and the local LLM would start putting chapters at like an hour, an hour and a half, two hours out. And I was like, that’s not good. Someone’s going to look at that and think I’m crazy. This is a 15-minute video, and there’s a chapter three hours in. Good night. So I’ve had to redo a lot of stuff because of that. I spent yesterday with my MacBook on drink coasters with ice sleeves under it, because it was getting hot.

And it was just the whole thing. Anyway, it was like seven hours of reprocessing 700 videos or something. But, you know, something we can all agree on: LLMs are currently imperfect. Pipelines are also somewhat imperfect. Computers, quite imperfect. So there are a lot of potential reasons why you might need to deal with and find things that look like this. One way you can do that is by using the ‘dot’ product argument for the VECTOR_DISTANCE function, and you can generally use about these numbers to find vectors, or embeddings, that would not be good. Thankfully, I don’t have any of those. Everything in here is not messed up: no all-zeros, no very weak, kind-of-the-same low numbers all across.

What I want to show you is what happens when we mix kind-of-good ones and bad ones together. I’m just using some literal values here to show at least some okay stuff, then some bad stuff where it’s all zeros and whatever, and then a NULL one finally. And we can use some fancy queries to categorize those and find ones that are not good. So stuff like this, where we have five rows that are very near zero magnitude, almost potential zeros, one row that’s okay, and one row that’s NULL. And we can use another sort of fancy query to get the detail on that; the first one was an aggregation.

So when the dot distance is zero, that’s probably not a good sign. When it’s a number like this, probably not a good sign. Negative four, not a good sign. Generally, you want to see negative one, or close to negative one. And again, this is the distance between a vector and itself. I’m not talking about the dot product between two different vectors; comparing a vector to something else is completely different. Comparing an embedding to itself is what we’re looking at here.

And the reason why you would care about this is because you might have the text of the document somewhere in your database, and you might also have the embedding for it. And if you’re looking at the text document, it doesn’t matter if it’s one line or a couple paragraphs or a long document: if the embedding is all zeros, it’s never going to match anything, right? It’s just not going to come up as being similar to anything. So that’s what you have to be really careful of, because it just makes the LLMs look worse than they actually are.

So I’m going to create a slightly different table here called bad embeddings. Again, kind of using the same setup, with some okay ones and some not-so-okay ones. And if we just run a query against this and look at what comes back: of course, the good match and the great match come up on top, but then weak matches and noise aren’t too far behind. And so, again, other videos have talked about being careful about filtering with this, saying, hey, vector distance is less than 0.2 or whatever.

So you can sort of get rid of stuff like this. But if you were expecting good matches from some of these, you might be pretty surprised when you don’t get them. And that’s going to be based on just the embeddings being messed up. So if we run a couple of queries like this, we’re going to say the self dot distance is not between -1.05 and -0.95,

and then between -1.05 and -0.95, and we’ll see those two different result sets come in. So, again, the good match and great match ended up in here; the bad ones ended up in here. So this is one way of catching and filtering out bad embeddings: by comparing them to themselves.

Because again, this isn’t two different embeddings. This is the same embedding. And we’re just saying, hey, how much do you agree with yourself? How strongly do you reinforce yourself?
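The self-comparison idea can be sketched like this. The table and column names are made up, and the -1.05 to -0.95 band is the ballpark from the video: with the ‘dot’ metric, a healthy, roughly unit-length embedding compared to itself lands near -1, while zeros and junk fall outside that band:

```sql
/* Flag suspect embeddings by comparing each vector to itself.
   Healthy self-dot distance sits near -1; zeros, junk, and
   NULLs get surfaced for reprocessing. Names are hypothetical. */
SELECT
    e.Id,
    SelfDot = VECTOR_DISTANCE('dot', e.Embedding, e.Embedding)
FROM dbo.PostEmbeddings AS e
WHERE VECTOR_DISTANCE('dot', e.Embedding, e.Embedding)
      NOT BETWEEN -1.05 AND -0.95
OR    e.Embedding IS NULL;
```

The explicit IS NULL check matters, because VECTOR_DISTANCE returns NULL for a NULL input, and NULL never qualifies for NOT BETWEEN on its own.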

So you might now be thinking: what are ways that we could validate vectors as they come into the database, so we can catch this stuff? And, sort of unfortunately, this is a real bummer, at least as things currently exist. I’m on CU1 of 2025, right?

I know it says RTM down here, but I’m on CU1; RTM doesn’t update to say CU1. Thanks, Microsoft, for making me look dumb. But you might think that we could create a computed column to look at each vector, and then add a check constraint to say: if you’re messed up, I don’t want you in my table.

I mean, it doesn’t work, right? We can’t persist this, because VECTOR_DISTANCE is non-deterministic. So that’s messed up.

And of course, if we take out the PERSISTED and say, okay, no persisting, that’s okay, then we get a different error: we can’t create the check constraint on it specifically because it is not persisted. So this messes us up further.

So you’re kind of stuck with a trigger. And I’m just going to give you a simple AFTER INSERT, UPDATE trigger, where real life would be more like an INSTEAD OF trigger, where you would insert good rows into the real table and insert bad rows into a logging table and say, I need to reprocess you.

But one way you can do that is with a trigger, and just say: hey, if you’re not between these magic numbers that I care about, you’re going to get out of here. So this one passes and this one fails. That’s just one way of protecting yourself from bad vectors getting in, for whatever reason.
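A minimal version of that trigger might look like this, reusing the same hypothetical table name and the same magic-number band from above. Real life, as mentioned, would more likely be an INSTEAD OF trigger that routes bad rows to a logging table instead of rejecting the whole statement:

```sql
/* Simple AFTER INSERT, UPDATE trigger: reject any row whose
   embedding fails the self-dot sanity check or is NULL.
   Table and column names are hypothetical. */
CREATE OR ALTER TRIGGER dbo.CheckEmbeddings
ON dbo.PostEmbeddings
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS
    (
        SELECT 1/0
        FROM inserted AS i
        WHERE i.Embedding IS NULL
        OR    VECTOR_DISTANCE('dot', i.Embedding, i.Embedding)
              NOT BETWEEN -1.05 AND -0.95
    )
    BEGIN
        ROLLBACK TRANSACTION;
        THROW 50001, N'Rejected: embedding failed self-check.', 1;
    END;
END;
```

An AFTER trigger rolls back the entire insert or update if any row is bad, which is the blunt but simple option; the logging-table variant lets good rows through and quarantines only the failures.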

Again, LLM failures and coding failures, embedding failures, all the stuff that can happen. And if you’ve ever dealt with any sort of ETL pipeline before in your life, or any sort of import process, anything like that, you’re going to be no stranger to things like this: having some mechanism for capturing the bad stuff, logging it, saying, I need to reprocess this, and letting the good stuff in.
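Here’s a minimal sketch of the AFTER trigger idea. Table and column names are hypothetical, and the 0.95 to 1.05 band is just the demo’s magic numbers; swap in whatever your own self-similarity check actually produces (and mind the sign, since some distance metrics report negated dot products):

```sql
/*
Reject rows whose self-comparison falls outside the
expected band, since a check constraint on
VECTOR_DISTANCE can't be used (it's non-deterministic).
All names here are placeholders.
*/
CREATE OR ALTER TRIGGER dbo.CheckEmbeddingSanity
ON dbo.PostEmbeddings
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS
    (
        SELECT
            1/0
        FROM Inserted AS i
        WHERE VECTOR_DISTANCE('dot', i.embedding, i.embedding)
              NOT BETWEEN 0.95 AND 1.05
    )
    BEGIN
        /* THROW in a trigger rolls back the statement that fired it. */
        THROW 50000, N'Embedding failed self-similarity validation.', 1;
    END;
END;
```

In real life, as noted above, you’d more likely use an INSTEAD OF trigger to route bad rows to a logging table for reprocessing instead of rejecting the whole statement.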

Anyway, that’s about it there. Thank you for watching. I hope you enjoyed yourselves. I hope you learned something and I will see you over in tomorrow’s video. Thank you.

I hope that at some point you learn to love me. Am I cool yet? Ah, screw it.

All right. Thank you.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.

Stay In The Loop, Not In The Middle

Credits and Debits


This post is inspired by my Humanoid BFFL Joe Sack’s wonderful post: Keep Humans in the Circle.

It’s an attempt to detail my progression using LLMs, up through how I’m using them today to build my free SQL Server Monitoring and Query Plan analysis tools.

While these days I do find LLMs (specifically Claude Code) to be wonderful enablers for my ideas, they still require quite a bit of guidance and QA, and they’re quite capable of (and sometimes seemingly eager to) wreck your day.

Just in the past couple weeks, I’ve had Dear Claude:

  • Drop databases
  • Git checkout and lose a bunch of work
  • Merge to main when it had no business doing so
  • Write a bunch of happy-path tests to make bad code pass
  • Delete inconvenient data to make a query work
Claude Drops a Database

This is along with a bunch of side-annoyances, of course. It’s all in my local environment, so no one’s getting hurt, but it’s still a lot of recovery work for me at times. You should probably not let it go beyond that.

Even Amazon agrees.

Dear Claude has skills and a Claude.md file to give it persistent instructions to do things, but it still has quite a habit of not following them. Like they’re not even there. But they are there, Dear Claude.

Sometimes it can’t figure out how to query DuckDB without being blocked, and other times it will run SQL scripts to manually patch things rather than use the dashboard Installers to make sure those work properly (in my Monitoring tool, there’s an Installer meant to update procedure code and table schema on upgrade).

Manual patches are supposed to be verboten, but…

In other words, it’s a lot like hiring a junior developer who doesn’t really always listen to you, or know better, and then hiring a brand new junior developer every time the conversation compacts.

Hopefully the new one million token context makes this all a bit less painful. Time will tell.

Conversations Gone Wrong


I’ve been working with SQL Server for a long time. Too long, some might say.

Long enough to have quite particular and occasionally profound opinions about it, and long enough to be skeptical of anything that promises to make SQL Server easier at the level I work at.

Even SQL Prompt, which I like quite a bit, leaves me with rather a bit of work to (un)do at times. Perhaps it needs a .md file to learn from. Perhaps if it doesn’t get one, I’ll make something that does. The world is a fun place these days.

When LLMs showed up and everyone started losing their collective minds over them, I did what any reasonable person would do: I tried to lose my mind over them, too.

What followed was a months-long journey from complete, laughable frustration, to settling into a more productive use-case: Them being my little enablers.

The first thing I tried was the most natural thing in the world: having a conversation. About SQL Server. My beloved.

LLMs are supposed to be good at conversation, after all. This should be a meeting of the minds, right? All their years of training data, all my years of training data, together at last. And me with no one to talk to.

Wrong. Wrong, wrong, wrong. So very, horribly, totally, utterly, and terribly wrong. Good golly.

Whenever the monthly model updates roll out and the usual spate of hucksters talk about the most advanced, doctor-level reasoning capabilities, I laugh until it hurts.

I’d ask something like “what causes parameter sniffing problems with local variables?” and get back this enthusiastic, confident, completely wrong answer. The LLM would happily tell me that local variables cause parameter sniffing.

That’s all backwards, ‘natch. Local variables circumvent parameter sniffing, which is its own problem, because you end up with the average density estimate instead of a sniffed value. This is often rather an unattractive proposition for skewed data distributions.
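If you want to see that behavior for yourself, here’s a minimal sketch, with Stack Overflow-style demo names standing in for real objects:

```sql
/* The optimizer sniffs @UserId at compile time and
   builds a plan for that value's actual cardinality. */
CREATE OR ALTER PROCEDURE dbo.SniffMe
    @UserId int
AS
BEGIN
    SELECT
        records = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @UserId;
END;
GO

/* The local variable's value is unknown at compile time,
   so the optimizer falls back to the density vector,
   the "average" guess, regardless of what you pass in. */
CREATE OR ALTER PROCEDURE dbo.DontSniffMe
    @UserId int
AS
BEGIN
    DECLARE @u int = @UserId;

    SELECT
        records = COUNT_BIG(*)
    FROM dbo.Posts AS p
    WHERE p.OwnerUserId = @u;
END;
GO
```

Compare the estimated row counts in the two plans for a very common and a very rare user and the difference jumps out.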

It got worse, too. I’d ask it about topics I know fundamentally quite well, and get back these half-cooked answers that sounded more like someone went to a one hour talk about a subject and came away with a lot of wrong ideas about what was said. Deeply incorrect in important places and unable to explain further.

World’s shortest interview.

Here’s the kind of exchange I’m talking about:

-- Me: Why might a parallel query with a large memory grant
--     still spill to tempdb?
--
-- LLM: A query may spill to tempdb when the memory
--      grant is insufficient for the data volume
--      being processed. This typically happens when
--      statistics are outdated and SQL Server
--      underestimates the number of rows...
--
-- Me: What about when the grant IS large enough
--     based on the estimate, but the distribution
--     is skewed?
--
-- LLM: Great question! When data distribution is
--      skewed, you should update statistics with
--      FULLSCAN to ensure accurate cardinality
--      estimates...

That’s not even wrong in an interesting way. It’s idiot advice like you’d find on LinkedIn. I won’t even dignify it.

It’s wrong in the “I’m going to confidently repeat the first three results from a search engine” way.

The for-realsies answer involves how memory is allocated to individual threads during execution, and how thread memory allocation can cause spills even when total memory is fine.

Remember that all plans start as serial plans, and that’s when the memory grant is assigned. If a parallel plan is chosen, the memory grant gets divided up equally amongst DOP threads. If one thread gets many/all the rows, it’s likely that division of memory will not quite be adequate.
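A quick back-of-the-envelope with made-up numbers, in the comment style used throughout, to make the split concrete:

```sql
-- Hypothetical numbers, just to illustrate:
--
-- Memory grant (sized for the serial plan): 800 MB
-- DOP:                                      8
-- Per-thread share:                         800 / 8 = 100 MB
--
-- If a skewed key lands ~all of the rows on one thread,
-- that thread has 100 MB to do work that was sized for
-- 800 MB. The other seven threads sit on memory they
-- never use, and the busy one spills, even though the
-- total grant looked plenty big.
```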

But the LLM had no idea about any of that. It just kept cheerfully, wantonly suggesting I update my statistics, like that’s the answer to everything.

At least it didn’t tell me to do index maintenance, which is a step up from some people.

Complete waste of time for expert-level SQL Server topics.

I’d rather talk to a cat.

I am allergic to cats.

Getting Organized


One thing I’m terrible at is dull admin work.

Things like building course outlines, writing read me files, and keeping various things up to date as they change along the way.

Documenting things is not my idea of a good time.

I’d often try to use them for that, and walk away sorely disappointed. Much like the above exchange, the LLM would build course content that is unusable.

  • Page splits, the silent killer
  • Logical reads: why your queries can’t have nice things
  • MAXDOP 1 or MAXDOP None: Which is worse?

Okay, cool, let me get in a time machine back to 2008 so any of this will be relevant.

I would sooner die, dear reader.

For the documentation bits, it would just hallucinate things that the code had never done, and would never do. For example, it kept insisting that sp_IndexCleanup checks index fragmentation. Dawg is you [bleeping] kidding me?

For the training outlines, I mostly just binned them all. There were some okay fundamental ideas, and progression paths, but the details were a nightmare. How fast is a logical read, anyway?

For the read me files, there was a lot of manual labor fixing them.

Sure it was nice to have something that knew markdown, and could make things pretty for me, but having to give all the text a correctiondectomy was quite the opposite of good.

But hey, it gave me a starting place, and I could work from there. That’s more than I started with.

Writing Queries: Sorta Better, But Still Painful


Okay, so it couldn’t talk about SQL Server competently. But could it write SQL?

Can I finally have something bang out a bunch of queries I don’t feel like writing? Would any of them make good demos, or be logically correct?

I figured the bar was lower here. I’m not asking it to understand optimizer internals. I’m asking it to write a SELECT statement that doesn’t error out.

And it kind of could. In the way that a junior developer who just finished a SQL boot camp kind of can. And the way an ORM kind of can.

Sure, it starts with SELECT, and it has mostly the right table names, and it has all the columns (most of them even existed, too), but…

It still seemed to not understand data very well.

-- Write a query to find the top 10 customers
-- by total order amount for orders placed
-- in the last 90 days
--
-- Tables:
-- dbo.Customers (CustomerID, CustomerName, Email)
-- dbo.Orders (OrderID, CustomerID, OrderDate, TotalAmount)

And I’d get back:

SELECT TOP 10
    c.CustomerName,
    c.Email,
    SUM(o.TotalAmount) AS TotalOrderAmount
FROM Customers c
INNER JOIN Orders o
    ON c.CustomerID = o.CustomerID
WHERE DATEDIFF(DAY, o.OrderDate, GETDATE()) < 90
GROUP BY c.CustomerName, c.Email
ORDER BY TotalOrderAmount DESC

Which is… fine. It works. Looks like crap, though.

But look at it. If a person handed that to me, they might lose that hand. And database access.
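For contrast, a sketch of how I’d rather see that written, using the schema from the prompt. Wrapping the column in DATEDIFF() makes the predicate non-SARGable, so a useful index on OrderDate can’t be seeked. Note the day math here is close-but-not-identical to the original; align the boundary to midnight if you need whole calendar days.

```sql
SELECT TOP (10)
    c.CustomerName,
    c.Email,
    TotalOrderAmount = SUM(o.TotalAmount)
FROM dbo.Customers AS c
JOIN dbo.Orders AS o
  ON o.CustomerID = c.CustomerID
/* The column stays bare, so an index on OrderDate is usable. */
WHERE o.OrderDate >= DATEADD(DAY, -90, GETDATE())
GROUP BY
    c.CustomerName,
    c.Email
ORDER BY
    TotalOrderAmount DESC;
```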

For more complex queries, it would break down faster than hostage negotiations for an ex-wife.

I asked it to write a running totals query with a proper window function. I do enjoy a proper window function.

-- What I asked for:
-- Running total of sales by month, reset per fiscal year,
-- with a percentage of the annual total so far

-- What I got back:
SELECT
    FiscalYear,
    MonthNum,
    MonthlySales,
    SUM(MonthlySales) OVER (
        PARTITION BY FiscalYear
        ORDER BY MonthNum
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS RunningTotal,
    MonthlySales / SUM(MonthlySales) OVER (
        PARTITION BY FiscalYear
    ) * 100 AS PctOfAnnual
FROM MonthlySalesData
ORDER BY FiscalYear, MonthNum;

Spot the problem?

That PctOfAnnual calculation is the percentage of the full year, not the percentage of the annual total so far.

The denominator needs to be the running total, not the total-total.

Look, I’m not great at math, and I can see this is a problem. It’s not that I or another human couldn’t or wouldn’t mess this thing up too, but these robots are supposed to be the almighty beings replacing us all in the workforce.

When I pointed this out, it apologized profusely, and gave me back a version where it replaced the denominator with the running total expression, but didn’t take much pride in its efforts generally.

The formatting is an atrocity, and it didn’t really do much for making sure data types were constrained on the expressions. Again, very human laziness on display.
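For reference, a corrected version along the lines of what it eventually produced: the denominator is the running total, not the full-year total, and the 100.0 dodges integer division if MonthlySales happens to be a whole-number type.

```sql
SELECT
    FiscalYear,
    MonthNum,
    MonthlySales,
    RunningTotal =
        SUM(MonthlySales) OVER (
            PARTITION BY FiscalYear
            ORDER BY MonthNum
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ),
    /* Percentage of the annual total SO FAR:
       this month's sales over the year-to-date sum. */
    PctOfAnnualSoFar =
        MonthlySales * 100.0 /
        SUM(MonthlySales) OVER (
            PARTITION BY FiscalYear
            ORDER BY MonthNum
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        )
FROM MonthlySalesData
ORDER BY
    FiscalYear,
    MonthNum;
```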

Useful for scaffolding simple queries, but I’d spend almost as long fixing its output as I would writing it myself.

It quite reminded me of this old Dennis the Menace episode:

Uncle Ned asks Helen to plant some bulbs, and she plants all 120 of them upside down.

In many of these scenarios, I played the part of Uncle Ned. I just gave up and waited until next year.

New Stored Procedures: Guts and Bones and Bloody Knuckles


Writing a stored procedure from scratch is tedious work.

  • Error handling
  • Parameter validation
  • Debug modes
  • Version detection
  • Lots of dynamic SQL
  • Working out #temp table definitions

I thought maybe the LLM could handle that part, and I’d fill in the more detailed knowledge.

Build me a building. I’ll do the decorating. I’ve got quite the fabulous eye.

The template it produced wasn’t that bad. The formatting was still rotten garbage, but that’s what SQL Prompt is for.

CREATE OR ALTER PROCEDURE dbo.ProcessCustomerOrder
    @CustomerID int,
    @OrderDate datetime2(7) = NULL,
    @OrderItems dbo.OrderItemType READONLY,
    @NewOrderID int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    SET XACT_ABORT ON;

    IF @CustomerID IS NULL
    BEGIN
        RAISERROR(N'@CustomerID cannot be NULL.', 16, 1);
        RETURN;
    END;

    SET @OrderDate = ISNULL(@OrderDate, SYSDATETIME());

    BEGIN TRANSACTION;

    BEGIN TRY
        INSERT dbo.Orders
        (
            CustomerID,
            OrderDate,
            StatusID
        )
        VALUES
        (
            @CustomerID,
            @OrderDate,
            1 -- New
        );

        SET @NewOrderID = SCOPE_IDENTITY();

        INSERT dbo.OrderItems
        (
            OrderID,
            ProductID,
            Quantity,
            UnitPrice
        )
        SELECT
            @NewOrderID,
            oi.ProductID,
            oi.Quantity,
            oi.UnitPrice
        FROM @OrderItems AS oi;

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF @@TRANCOUNT > 0
        BEGIN
            ROLLBACK TRANSACTION;
        END;

        THROW;
    END CATCH;
END;
GO

That skeleton is pretty decent. SET NOCOUNT ON and SET XACT_ABORT ON, proper TRY/CATCH with a transaction check before ROLLBACK, SCOPE_IDENTITY() instead of @@IDENTITY.

It’s clear the training data included some moderate quality SQL Server content. Perhaps it’s the first thing to fully read Erland’s post.

But then I’d ask it to add business logic:

  • Inventory checks
  • Discount tier calculations
  • Shipping cost logic with regional rules

And that’s when the great Pear Shapening would commence. I’m giving some mock examples, because if I just gave examples from my analysis stored procedures, they’d be largely unrelatable to most readers.

For me, there was just too much fixing to make the scaffolding make sense. Reminding it of things, asking it to include or add things, having that break other things.

Again, not saying a human (even me) wouldn’t have these problems, but this is what’s phasing us out?

Ho hum. That’s nice. Close laptop. Go for walk. At least a bartender will still talk to me.

The worst part was when it was almost right. A procedure that works perfectly in testing, but has a little concurrency issue under load, or that produces wrong results when a specific combination of parameters is passed in.

These are the bugs that cost you a lot of confidence in the robots.

What About Existing Code?


This was where things started to shift a bit.

There was still great suffering, and lots of bugs I had to find out about on my own later, but… at least it had a good example to start with, and it kinda figured out my formatting/style preferences. Less running for the Pepto Bismol than before.

Instead of asking the LLM to create something from nothing, I gave it existing code and asked it to make changes.

  • Add a parameter (and keep the @help and readme up to date!)
  • Refactor a query that would otherwise be a lot of tedious text moving
  • Add error handling to a procedure in multiple places

Having existing code as context made a noticeable difference.

The LLM could see the style, the naming conventions, the patterns already in use, and generally try to match them.

-- Me: "Add a @Debug parameter to this procedure that,
-- when set to 1, prints the dynamic SQL instead
-- of executing it"

-- Before (snippet):
SET @sql = N'
SELECT ...'
+ @where_clause;

EXECUTE sys.sp_executesql 
     @sql, 
     @params, 
     @StartDate = @StartDate;

-- What I got back:
-- Added @Debug bit = 0 to the parameter list

IF @Debug = 1
BEGIN
    PRINT @sql;
    RETURN;
END;

EXECUTE sys.sp_executesql 
    @sql, 
    @params, 
    @StartDate = @StartDate;

That’s reasonable. It understood the intent, put the debug check in the right place, and all the other many right places all at once.

But, like, I didn’t want to RETURN there. Or anywhere. There was so much more stored procedure left to run. These types of unforced errors are quite common.

Try-hard robots.
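For the record, what I actually wanted there was the print without the early exit, so the rest of the procedure still runs:

```sql
/* Print the dynamic SQL for inspection when debugging,
   then carry on and execute it either way. */
IF @Debug = 1
BEGIN
    PRINT @sql;
END;

EXECUTE sys.sp_executesql
    @sql,
    @params,
    @StartDate = @StartDate;
```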

And the more complex the change, the more it struggled.

I asked it to refactor a 400/500-line procedure to use temp tables instead of table variables, and it made the swap but missed that one of them was a TVP and that kinda wasn’t all that cool to deal with.

Turns out LLMs are about as aware of TVPs as your average SQL developer. At least there’s some consistency in the world, I suppose.

The back-and-forth was still frustrating.

I’d point out a problem, it’d fix that but introduce a new one.

I’d point out the new one, it’d fix that but regress on something else.

Three rounds in, I’m basically re-reviewing the entire procedure from scratch anyway.

Again, playing the part of Uncle Ned. Poor Uncle Ned. All 120 bulbs.

Bug Hunting And Code Reviews


Here’s where things got a bit better.

Instead of asking the LLM to write code, I started asking it to read code and tell me what was wrong with it.

Turns out, LLMs are way better at critique than creation. Which, honestly, tracks.

I am also better at critiques than creation at times. Especially when someone else has to act on the critique.

-- Me: "Review this procedure for performance issues and bugs"

-- The procedure had, among other things:
SELECT
    o.*
FROM dbo.Orders AS o
WHERE o.OrderDate BETWEEN @StartDate AND @EndDate
AND EXISTS
(
    SELECT
        1/0
    FROM dbo.OrderItems AS oi
    WHERE oi.OrderID = o.OrderID
    AND   oi.ProductID IN
          (
              SELECT
                  ProductID 
              FROM @ProductList
          )
)
OPTION (RECOMPILE);

The LLM flagged several things, and they were actually decent:

-- LLM's review points (paraphrased):
--
-- 1. SELECT * pulls unnecessary columns, can prevent
--    covering index usage
--
-- 2. BETWEEN on datetime: if @EndDate is '2024-01-31',
--    rows at exactly midnight are included but nothing
--    after. Use >= @StartDate AND < @EndDatePlusOne
--
-- 3. The table variable @ProductList won't have stats,
--    so the optimizer will estimate 1 row regardless
--    of actual content. Consider a temp table.
--
-- 4. OPTION (RECOMPILE) on every execution: is this
--    intentional? If this runs frequently, you're
--    paying compilation cost every time.

Not all of it is earth-shattering, but it’s the kind of thing a DBA might flag. The BETWEEN on datetime point is one I see people get wrong constantly.
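To make that second point concrete, a quick sketch of the half-open range pattern, with made-up names, that catches every moment of the end date:

```sql
DECLARE
    @StartDate datetime = '20240101',
    @EndDate   datetime = '20240131';

SELECT
    o.OrderID,
    o.OrderDate
FROM dbo.Orders AS o
/* >= start, < (end + 1 day): includes rows at, say,
   2024-01-31 23:59:59, which BETWEEN would miss. */
WHERE o.OrderDate >= @StartDate
AND   o.OrderDate <  DATEADD(DAY, 1, @EndDate);
```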

The table variable statistics point is correct and relevant, though it kinda missed that the recompile hint would at least get the table cardinality right. It’s nice that it was in the right neighborhood.

It’s sort of like if Helen had planted the bulbs a bit sideways instead of upside down.

I could take each of those points, decide if I agreed, ask to see a proposed fix, and then evaluate the fix. I’m in control. Mostly.

The big difference was that I was the one making decisions. The LLM was providing input, not output. It was a reviewer, not an author.

For actual bugs, it was even better:

-- Me: "This procedure sometimes returns duplicate rows.
--      Can you figure out why?"
-- [insert 200-line procedure here]
--
-- LLM: "The JOIN between #TempOrders and
--       dbo.OrderHistory on OrderID can produce
--       multiple matches when an order has been
--       modified more than once, since OrderHistory
--       keeps all versions. You probably want to
--       filter to only the most recent history record
--       per OrderID, or use a ROW_NUMBER() window
--       function to deduplicate."

That’s exactly the kind of answer I want quickly. It identified the specific join, explained why it produces duplicates, and suggested two reasonable fixes.

I didn’t have to go through six rounds of “no, that’s not it” with more and more all caps and curses being introduced before getting somewhere useful.

The only time I want to go through six rounds without being useful is at happy hour.

The Lightbulb Cometh


For all of the previous stages, I was the bottleneck.

The LLM would produce something, I’d review it, find problems, describe the problems in English, the LLM would try to fix them based on my description, ad nauseam, alpha and omega, prompt without end, amen.

It was like pair programming where my partner can’t see the screen. There are also a variety of cooking competition show similarities here. Probably something like two people trying to cook the same dish without being able to see what the other is doing, or something.

The question that changed how I work with LLMs was: Why am I the middleman, here?

I gave Claude Code access to sqlcmd, connected it to a development database, and instead of asking it to write a query and hand it to me to test, I told it to write the query, run the query, look at the results, and figure out if they were right.

At least as much as it could figure out if the results were right.

This is where the happy-path code tests tend to happen, but you learn to catch those and tell the robot to act more adversarial pretty quickly.

The difference was pretty good.

-- What I used to do:
--
-- Me: "Write a query to find customers with
--      declining order frequency"
-- LLM: [writes query]
-- Me: [runs it] "This returns 0 rows, that
--      can't be right"
-- LLM: "Oh, let me fix the date logic..."
-- Me: [runs it] "Now it returns every customer"
-- LLM: "My apologies, I had the comparison
--       backwards..."
-- [repeat 4 more times]
-- What I do now:
--
-- Me: "Connect to the dev database. Find customers
--      whose order frequency has declined over the
--      last 3 quarters compared to the prior 3.
--      Run it and verify the results make sense."
--
-- Claude Code: [writes query]
-- Claude Code: [executes via sqlcmd]
-- Claude Code: "Got 847 rows. Let me verify.
--   Checking a sample customer... CustomerID 4521
--   had 12 orders in Q1-Q3 2024 and 4 orders in
--   Q4 2024-Q2 2025. That looks correct.
--
--   But I notice the query doesn't account for
--   customers who are brand new in the recent
--   period. Let me add a filter for minimum
--   account age..."
--
-- Claude Code: [modifies and re-executes]
-- Claude Code: "Refined to 612 rows."

It went from six rounds of me squinting at code to one round of it figuring things out on its own.

And it caught a problem I hadn’t even thought to mention (the new customer issue) because it could actually see the data.

This pattern became my default workflow for anything involving existing procedures:

-- Me: "The procedure dbo.CalculateMonthlyRevenue is
--   returning numbers that are about 15% too high
--   for January 2025. Figure out why."
--
-- Claude Code:
--   1. Reads the procedure definition
--   2. Runs it with test parameters
--   3. Breaks the procedure into pieces, runs each
--      CTE/subquery independently
--   4. Compares intermediate results
--   5. Finds that a JOIN is including refunded orders
--   6. Proposes a fix
--   7. Runs the fixed version
--   8. Compares output to a manual calculation
--   9. Reports back with the fix and verification

All of that happens without me doing anything.

I’m not copying and pasting results.

I’m not describing output in English and hoping the LLM interprets it correctly.

It can just look.

Unit testing became way easier too:

-- Me: "Create a test scenario for dbo.ApplyDiscountTiers.
--   Set up test data covering: no discount, single tier,
--   multi-tier, boundary values, and NULL handling.
--   Run each scenario and verify the results match
--   the business rules in the procedure comments."
--
-- Claude Code creates a temp table with test cases,
-- executes the procedure for each one, compares actual
-- vs expected results, and reports:
--
-- "All 12 test cases passed except case 8: when
--  Quantity is exactly 100 (the tier boundary), the
--  procedure applies the higher discount tier.
--
--  The comment says 'orders of 100+ get Tier 2'
--  which is ambiguous: is 100 included in Tier 2
--  or not? Currently it is.
--
--  If that's wrong, line 47 needs >= changed to >."

That’s the kind of crap that would have taken fourteen rounds of back-and-forth to get to in the old workflow.

The LLM found the ambiguity, tested the actual behavior, cross-referenced it with the comments, and asked me the right questions.

All Good Things


The progression wasn’t about LLMs getting better (though they have, I believe). It was about me figuring out the right way to use them.

Asking an LLM to be an expert in SQL Server internals is asking it to do something it’s bad at.

It’s pattern matching against training data, and there isn’t enough deeply technical SQL Server content out there for it to pattern match well on niche topics. I’d often ask it loaded questions, get a bunch of absolute gobbledygook nonsense wrong answers, point it to a blog post with the right answer, and reap the apologies.

For every accurate article about parameter sniffing, there are fifty Stack Overflow answers that are somewhere between incomplete, wrong, and terribly outdated.

Asking it to write code in isolation is slightly better, but you’re still fighting the fundamental problem: it can’t see what’s happening. It’s writing blind. It doesn’t know your data, your schema, your indexes, your edge cases.

So it makes guesses, and you spend your time correcting those guesses.

The sweet spot turned out to be giving it the ability to close the loop. Write something, test it, see the results, iterate. The same way a human developer works.

The LLM doesn’t need me to be its eyes and hands. It needs a connection string and permission to experiment.

I still do the final QA. I still review what it produces before anything goes out the door. I’m not handing the keys to the castle over.

But the amount of work that gets done before I even look at it is dramatically higher, and the quality of what I’m reviewing is way better, because it’s already been through several rounds of self-correction.

If you’re still arguing with chatbots or manually copy-pasting query results back and forth, try jumping ahead. Give it access to a dev environment. Let it run things. Let it fail and fix its own failures.

Stop being the middleman.

Enablement, and Other Drugs


For years, I’d wanted to build a monitoring tool for SQL Server.

Actually, I’m sort of lying. For years, I’d offered to help every single monitoring tool company make their monitoring tools less crappy.

What’d I get back? Crickets, runarounds, and a whole bunch of frustration.

Well, fine. They don’t care. I do. Perhaps I can take away enough of their money for them to someday care. That’s the goal, anyway. Or maybe they’ll just quit.

Either is fine, really.

So for a while, I started trying to talk developer-friends of mine into helping me build something. They’d all think it was a cool idea until the word “free” came up.

I’m not getting paid for anything, here. I’m probably losing work.

In fact, from what I can tell, the MCP tools I built in are doing a great job for people. I’ve gotten a lot of awesome feedback about them. But they’re also doing my job, so hopefully they’re also doing the jobs of lesser consultants, too. Better get back to the factory.

When I realized that developers were developing things using LLMs, I figured I could build off my SQL Server performance knowledge, and Vision Code myself exactly what I wanted. There was no Vibe Coding here, because I knew exactly what I wanted (pretty much, though that has evolved a bit since the initial process began).

I also, like, know what I’m doing and what I want and what’s right, and I’m not just doing a bunch of guesswork to get there.

From the beginning, I knew that I couldn’t skimp on things like security, so I made sure to use Windows Credential Manager from the start. If you don’t trust that, then you should probably uninstall SSMS.

But this introduced quite a new set of pain points with my robot enablers: they can’t run something and look at a GUI.

While they could run a query and validate results, it was me who had to iterate for long periods of time trying to get things visually correct. I’m still working out some things I don’t quite love in that regard, but the list is narrowing, at least. I can take screenshots and show them, but that doesn’t always guarantee a good outcome.

Work Flows


When people open issues, I have Claude review the issue, investigate the code, and come up with reasonable fixes for testing.

  • Does it always work? Nope.
  • Is Claude always right the first time? Nope.
  • Do I often find myself frantically hitting the Escape key to stop it doing something asinine? Yes.

    I don’t believe you, Dave.

But is this something that I could do on my own? Learning WPF? C#? XAML?

Hell no.

The robots have enabled me to see this through, and continue to work on it at a pace that would otherwise be impossible.

And I gotta tell you, for something that’s only been out a couple months, I think it’s competitive-to-better than stuff that’s been around 10 or more years, and those people charge you a lot of money for the privilege of being annoyed and unhelped.

I’m going to keep at it until I wear out my Max plan.

But it’s not just this stuff. It has also allowed me to work quickly on a bunch of other projects.

I can get a lot of stuff done and out into the world that I couldn’t before. Going from nothing to an imperfect something is fine with me. The world is full of imperfect software. At least I’m willing to fix mine.

An imperfect something is something that can be improved. Nothing is still nothing, no matter how much you think about it.

If you’re out there reading this, I’d encourage you to give something like Claude a crack at helping you build something that has always seemed too difficult, tedious, or involved for you to personally get started on.

It’s going to take some time and patience, and you’re going to have to supervise the process and output, but you’ll have something that used to be nothing.

Thanks for reading!


Get AI-Ready With Erik: Vector Query Plans



Summary

In this video, I delve into vector material and query plans, specifically addressing common pitfalls when working with vectors in SQL Server. I share insights from the full course on vector search and indexes, which you can access at training.erikdarling.com using the coupon code AIREADY to save $100 off. The key takeaway is to avoid putting vectors into a temp table; instead, use local variables or formal parameters for better performance. I also highlight how query plans can behave unexpectedly when dealing with vector distances and nested loops joins, emphasizing the importance of parallelism and memory grants in SQL Server’s execution strategy.

Chapters

Full Transcript

Erik Darling here with Darling Data. And we’re going to finish out this week with sort of the last of the vector material. Now, two things about that. One is that this was not all of the content from all of the modules in the class. There are a few that I did not include in this video series. And the other is that this is not nearly, this is like a fifth of the content that’s actually in the full course. If you want to buy the full course, which hopefully some of these videos will persuade you to do. Hopefully I am persuasive in my tactics. You can go to training.erikdarling.com. And if you use the coupon code AIREADY, you can get AI ready with Erik, question mark. And you can get a hundred bucks off the course with that. We’re going to spend some time looking at query plans and solving vector stuff. I know that we’ve sort of peeked at them a little bit, but there’s still some fun stuff to talk about. And there’s actually one thing in here that I want to hopefully get you to not do when you’re doing that. And I only learned that lesson when trying to make the demos a bit more portable. So the thing that I’m going to try to persuade you to eventually not do is to put your vectors into a temp table and use that. And there’s a good reason for that. Now the first thing here is kind of a funny query plan quirk. I mean, the funny query plan quirk is we very clearly have a where clause on vector distance here. We have a cross join between these two tables here. And when we run this and we look at the execution plan, something kind of funny happens. The vector predicate is applied here at the nested loops join. You can see minus two, ba-da-ba-da, all that stuff there.

But this is the reason why I don’t want you to do this in real life. Everything looks okay here, except we have this distribute streams of the round robin variety. And this is not performance tuning content. I do have performance tuning content that goes deeper into this, if you’re quite interested in that. But the thing that you end up with here is that, you know, after the distribute streams, we have thread one with one row on it. And when we come down here, we look at the properties, all those rows over here end up on thread one.

So if I can convince you to not do one thing in this video, it would be putting your embeddings into a temp table and using that. You are far better off, at this point in time, using a local variable or a formal parameter to store the embedding you want to gauge vector distance against. Now, coming back to the query plan a little bit: we don’t really have a better way of doing this at the moment. It is a cross join, so we get vanilla nested loops here. We don’t get apply nested loops.
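The advice above can be sketched roughly like this. The table, column, and dimension names here (dbo.PostEmbeddings, embedding, vector(1536), the post id) are assumptions for illustration, not the course’s actual demo schema:

```sql
/* Instead of staging the search embedding in a temp table and cross joining,
   load it into a local variable (or pass it as a parameter) and compare
   against it directly. All names here are hypothetical. */
DECLARE @search_vector vector(1536);

SELECT @search_vector = pe.embedding
FROM dbo.PostEmbeddings AS pe
WHERE pe.post_id = 12345; /* the post we're searching from */

SELECT TOP (10)
    pe.post_id,
    distance = VECTOR_DISTANCE('cosine', @search_vector, pe.embedding)
FROM dbo.PostEmbeddings AS pe
WHERE pe.post_id <> 12345
AND   VECTOR_DISTANCE('cosine', @search_vector, pe.embedding) < 0.2
ORDER BY distance;
```

The distance predicate repeats in the where clause because a select-list alias can’t be referenced there, only in the order by.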

We can’t get that with other query forms. But since we’re basically just saying, hey, I want all these rows and all these rows to match up, and then I need to figure out which ones have a reasonable vector distance between them, meaning they are reasonably similar, right? And just for the data that I have in here, I’m asking for a vector distance of less than 0.2. Remember that cosine distance runs from zero to two, and the closer you are to zero, the better off you are. We want to preemptively filter out any results that might not be terribly similar.

If we have this query without a filter on vector distance, we need to join into the Posts table in order to filter on anything else. So if we wanted to, say, compare tags in the Posts table to some search term in order to reinforce the vector distance accuracy a little bit: it’s not obvious from this, but the question, the embedding that I’m searching for, is about Git. And so it helps a bit to look for Git in the tags of the Posts table, right?
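A hedged sketch of that reinforcement, again with assumed table and column names: join from the embeddings table into Posts and put a relational filter on Tags alongside the vector distance predicate.

```sql
/* Hypothetical schema: dbo.Posts (Id, Title, Tags) keyed to
   dbo.PostEmbeddings (post_id, embedding). */
DECLARE @search_vector vector(1536);

SELECT @search_vector = pe.embedding
FROM dbo.PostEmbeddings AS pe
WHERE pe.post_id = 12345; /* the Git-related question */

SELECT TOP (10)
    p.Id,
    p.Title,
    distance = VECTOR_DISTANCE('cosine', @search_vector, pe.embedding)
FROM dbo.PostEmbeddings AS pe
JOIN dbo.Posts AS p
  ON p.Id = pe.post_id
WHERE p.Tags LIKE N'%git%' /* reinforce similarity with a relational filter */
AND   VECTOR_DISTANCE('cosine', @search_vector, pe.embedding) < 0.2
ORDER BY distance;
```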

So we can do that here. And of course, we get a bit more of a complicated execution plan. This one does not have the weird row imbalance stuff, because there are copious Repartition Streams operators throughout this plan that help rebalance things. Even over here, we get a pretty good row spread coming out of the Posts table.

I’m not saying that parallelism is terribly useful here. For row counts this low, it’s like, why are we even going parallel? It’s kind of dumb, but it makes a good comparison.

But just because of all that, the stuff that ends up over here is usually also pretty well spread out. I’m not saying it’s perfect. Some threads got no rows and others got weird numbers of rows. But it’s not like the plan before, where all the rows ended up on a single thread.

That’s crappy, right? Get your stuff together, SQL Server. But if we unwind all that and just run this query where we’re saying, hey, just give me the vector distance, order by distance.

This is where the only thing I really want to point out is: we still get all the rows on one thread, right? So this is still the issue here. But since we’re saying we want to calculate this distance, this is a runtime calculation.

Right? You don’t know this ahead of time. It’s not even really a good candidate for a computed column, because you don’t know what the vector is going to be.

It’s like, okay, well, you showed up, I’ve got to use you. So, again, the cross join to a temp table thing is not a particularly good idea. And it’s also not a good idea because of all the rows ending up on one thread. It doesn’t happen here, but it is worth warning you about.

With all the rows ending up on a single thread, something like, again, this is not a big performance tuning focused course. But all plans in SQL Server start off as serial plans. They only graduate to parallel plans under certain circumstances: the estimated cost of the query being above the cost threshold for parallelism,

nothing inhibiting parallelism, stuff like that. And then, finally, the parallel plan has to be cheaper than the serial plan. When SQL Server assigns a memory grant to a query plan, that memory grant is assigned when the query plan is single-threaded, when it’s a serial plan.

And if it graduates to a parallel plan, that memory grant gets divided equally across the DOP threads. So if it was DOP 8, it gets divided by eight. With all the memory getting divided up, and one thread having all the rows on it, it would be very easy for this query to start spilling and maybe showing some signs of performance degradation because of that.
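You can watch both of those gates from another session while one of these queries runs. This is a generic diagnostic sketch, not part of the course demos:

```sql
/* A parallel plan is only considered when the serial plan's estimated
   cost exceeds this instance-level setting. */
SELECT c.name, c.value_in_use
FROM sys.configurations AS c
WHERE c.name = N'cost threshold for parallelism';

/* For an actively running parallel query, each thread's share of the
   grant is roughly granted_memory_kb / dop, no matter how unevenly
   the rows are actually spread across threads. */
SELECT mg.session_id, mg.dop, mg.requested_memory_kb,
       mg.granted_memory_kb, mg.used_memory_kb, mg.max_used_memory_kb
FROM sys.dm_exec_query_memory_grants AS mg;
```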

And you team that up with the fact that vector distance is a runtime calculation, and that’s what we’re sorting by, and there’s no way to index this to give that data to SQL Server in order. It can be a pretty dangerous game to play. So, please, avoid the temp table thing for these reasons.

We’re only getting 10 rows up here, but if you go out and get bigger result sets, which probably wouldn’t be too common with vector search, depending on what you’re doing, more rows does equal more memory, right? So we run these two queries, and the top 10 one asks for a 1,152 KB memory grant. And, oops, I didn’t want that.

And the one that asks for the top 1,000 rows or whatever asks for a 700 memory grant. So the memory grant does scale with this. Even though we’re not including the embedding column in this, which is a big old thing, there’s absolutely no reason to look at the embeddings. There’s no reason to return the embeddings to a user and be like, here was your array of floats.

Enjoy. There’s no reason for that. So, don’t do that.
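To reproduce that comparison yourself, run the same hypothetical query twice with different TOP values and compare the memory grant attribute in each actual execution plan:

```sql
DECLARE @search_vector vector(1536);

SELECT @search_vector = pe.embedding
FROM dbo.PostEmbeddings AS pe
WHERE pe.post_id = 12345;

/* Run once with TOP (10) and once with TOP (1000), then compare the
   MemoryGrantInfo element in the actual execution plans. Leaving the
   wide embedding column out of the select list keeps the grant down. */
SELECT TOP (1000)
    pe.post_id,
    distance = VECTOR_DISTANCE('cosine', @search_vector, pe.embedding)
FROM dbo.PostEmbeddings AS pe
ORDER BY distance;
```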

But, if we were to add that to the select list, we would get much bigger memory grants out of these. So, generally, the better option is to assign your embedding to a local variable or formal parameter. I’m just using a formal parameter here to make life easy.

And I’ve written this query a little differently. I decided to try to get a bit fancy with it. So rather than put the vector distance up here, I thought, well, why don’t we cross apply to the top one thing in here? And then we can sort that out here, and we’ll filter out the original post out here.
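One possible shape of that rewrite is below. Table, column, and parameter names are assumptions, and the CROSS APPLY body is simplified to show the pattern rather than reproduce the demo exactly:

```sql
/* Compute the distance in a CROSS APPLY on the inner side, then filter
   and sort in the outer query. Hypothetical schema throughout. */
DECLARE
    @source_post_id integer = 12345,
    @search_vector  vector(1536);

SELECT @search_vector = pe.embedding
FROM dbo.PostEmbeddings AS pe
WHERE pe.post_id = @source_post_id;

SELECT TOP (10)
    pe.post_id,
    ca.distance
FROM dbo.PostEmbeddings AS pe
CROSS APPLY
(
    /* the distance is evaluated once per outer row */
    SELECT distance =
        VECTOR_DISTANCE('cosine', @search_vector, pe.embedding)
) AS ca
WHERE pe.post_id <> @source_post_id /* filter out the original post */
ORDER BY ca.distance;
```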

Just throw some mud at the wall and see what sticks, you know. This ends up being a fairly comparable query plan to the vector search thing up here, right? Because we start with the post embeddings table out here.

We get a very even row distribution here. And then on the inner side, we also get a very even row distribution. And, like, we just avoid the, or rather, not avoid, but we reduce the chance that any single thread will get all the rows.

And the memory grant division for a parallel plan would be less likely to spill, because you don’t have one thread holding all the rows with only an eighth of the memory, while seven threads hold the other seven eighths of the memory but no rows, right? So, just sort of a better way of doing things.

Of course, vector search is much, much faster, but it is still in preview along with vector indexes. All of those query plans took somewhere between 200 milliseconds and 1.2 seconds; the vector search thing is much faster than that.

There are trade-offs with it, though. Like I said, creating vector indexes is incredibly painful. I was working up some different demo stuff last night, and I thought, I’m going to make a 10 million row table with embeddings in it.

It took me six hours to create a vector index on that thing. And I was like, I’m going to start this and go to bed. And then even a million row table took about a half hour.

So it’s a real nightmare at this point. And again, it’s in preview, so you probably shouldn’t be using this stuff, but when it finally gets out of preview, it’ll make your life a lot easier for performance tuning vector stuff. There are even trade-offs in there, though, because the vector distance thing is sort of an exact search: it will give you the exact distance between one vector and another, one embedding and another.

With vector search, it’s more approximate, because you’re sort of, like, navigating this graph structure in the index, but we’re going to talk more about that in another video. Anyway, thank you for watching. I hope you enjoyed yourselves.

I hope you learned something. I hope that you will buy my wonderful educational content from me at a discount, which you can get the coupon code for that down in the video description. So, thank you for watching.

Going Further


If this is the kind of SQL Server stuff you love learning about, you’ll love my training. Blog readers get 25% off the Everything Bundle — over 100 hours of performance tuning content. Need hands-on help? I offer consulting engagements from targeted investigations to ongoing retainers. Want a quick sanity check before committing to a full engagement? Schedule a call — no commitment required.