As an experienced full-stack developer, I analyze application and database logs daily. SQL Server's error logging provides invaluable telemetry, but the volume of data can be overwhelming. This is where the handy stored procedure sp_readerrorlog comes into play.
In this comprehensive guide, you'll learn how to fully harness sp_readerrorlog for extracting key insights from SQL Server's error logs like an expert.
A Peek Inside SQL Server Error Logging
Before jumping into using sp_readerrorlog, let's peel back the covers on how SQL Server actually generates these error logs under the hood.
There are two main components responsible:
1. Error Log Writer Module
The SQL Server process (sqlservr.exe) collects errors, warnings, and informational messages from across its components, formats them, and appends them to the current error log file.
2. Error Log Files
These physical ERRORLOG files on disk are the writer's output. SQL Server starts a fresh file at each service restart (or when sp_cycle_errorlog is run), renaming older files to ERRORLOG.1, ERRORLOG.2, and so on until the oldest ages out.
Some key facts about these error log files:
- By default the current log plus 6 archived logs are retained (configurable up to 99); the oldest archive is purged each time a new log starts.
- There is no size limit per file by default — a log grows until the next restart or log cycle (SQL Server 2012 and later can cap the size via a registry setting).
- Growth rate varies widely with workload and configuration; verbose settings such as successful-login auditing can inflate it dramatically.
- Entries are plain text with a timestamp, the source process/spid, and the message.
As you can see, while these logs provide a rich monitoring stream, their impermanent rotating nature means we need smart tools like sp_readerrorlog to analyze them.
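Two built-ins help you work with these files directly: one to locate the current log on disk, one to force a rotation on demand. A quick sketch (the SERVERPROPERTY option requires SQL Server 2012 or later):

```sql
-- Path of the current ERRORLOG file (SQL Server 2012+)
SELECT SERVERPROPERTY('ErrorLogFileName') AS CurrentErrorLogPath;

-- Close the current log and start a fresh one (archives roll: .1, .2, ...)
EXEC sp_cycle_errorlog;
```

Cycling the log on a schedule (e.g., nightly via an Agent job) keeps individual files small and makes "one log file per day" analysis straightforward.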
Unleashing the Power of sp_readerrorlog
sp_readerrorlog lets us efficiently sift through the mass of server event data by specifying search filters and tapping into historical logs.
Some examples of ways it can be used:
- Pinpointing deadlocks and blocking issues
- Identifying query plan regressions
- Reviewing startup and shutdown events
- Spotting memory or disk pressure signals
- Correlating front-end app errors to database back-end logs
- Calculating usage metrics and growth trends
I rely on sp_readerrorlog daily to parse our mission-critical SQL Servers' logs and provide development teams actionable error alerts. It's an invaluable tool in every DBA and full-stack developer's utility belt.
Now let's walk through practical examples of how to leverage it effectively in various scenarios.
Retrieving Recent Errors
A common task is reviewing the most recent SQL Server issues and exceptions. We can use sp_readerrorlog to easily retrieve just the latest critical server events rather than browsing full verbose logs.
For example, searching the current log for entries that report an error severity:
EXEC sp_readerrorlog 0, 1, N'Error:', N'Severity:';
Note that sp_readerrorlog accepts at most two search strings, which are ANDed together — you cannot pass one parameter per severity level. This call isolates lines that report an error number together with its severity, skipping routine informational messages; severities 16 and above are the ones that generally warrant attention.
Here's a sample of the output showing a connection issue and deadlock:
LogDate ProcessInfo Text
2022-08-22 15:15:23.440 spid58 Logon failed for login 'webuser' due to trigger execution. [CLIENT: 10.10.10.30]
2022-08-22 15:17:44.321 spid62 Process 62: Tran 1, Proc 1: Deadlock encountered ...... Lock escalation occurred to resource database on table xxxxxxxxxx. Batch execution aborted, SQL Server detected a deadlock with itself (children deadlocked). Rerun the transaction.
With focused queries like this, we can get an instant overview of recent activity and take quick action as needed for urgent issues.
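When two ANDed search strings aren't enough, a common trick is to capture the procedure's output with INSERT ... EXEC and then filter it with ordinary T-SQL. A sketch, with the table shape matching the proc's standard three-column result set:

```sql
-- Capture the current SQL Server error log into a temp table
CREATE TABLE #ErrorLog
(
    LogDate     datetime,
    ProcessInfo nvarchar(50),
    Text        nvarchar(max)
);
INSERT INTO #ErrorLog
EXEC sp_readerrorlog 0, 1;

-- Now filter freely, e.g. only high-severity entries from the last hour
SELECT LogDate, ProcessInfo, Text
FROM #ErrorLog
WHERE LogDate >= DATEADD(HOUR, -1, SYSDATETIME())
  AND (Text LIKE '%Severity: 1[6-9]%'     -- severities 16-19
       OR Text LIKE '%Severity: 2[0-5]%') -- severities 20-25
ORDER BY LogDate DESC;
```

Once the rows are in a table you can join, aggregate, or persist them — the pattern underpins most of the analysis queries later in this guide.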
Tracking Database Growth Over Time
SQL Server writes a "Starting up database" message to the error log each time a database comes online, alongside shutdown events. We can extract and process these entries to gain insight into database activity patterns.
For example, here is a query that aggregates databases by occurrences to show activity:
-- Capture the previous log (log number 1 = most recently archived) into a temp table
CREATE TABLE #ErrorLog (LogDate datetime, ProcessInfo nvarchar(50), Text nvarchar(max));
INSERT INTO #ErrorLog EXEC sp_readerrorlog 1, 1;

SELECT DatabaseName, COUNT(*) AS StartupCount
FROM
(
    -- Message format: Starting up database 'Name'.
    SELECT SUBSTRING(Text,
                     CHARINDEX('''', Text) + 1,
                     CHARINDEX('''', Text, CHARINDEX('''', Text) + 1)
                         - CHARINDEX('''', Text) - 1) AS DatabaseName
    FROM #ErrorLog
    WHERE Text LIKE 'Starting up database%'
) AS t
GROUP BY DatabaseName
ORDER BY StartupCount DESC;
And sample output:
DatabaseName StartupCount
-------------------------------
MyDatabase 223
Accounting 193
LogDB 176
Reports 137
This shows that MyDatabase started up most often — frequent startup messages usually point to the AUTO_CLOSE database option or repeated service restarts rather than growth itself. By tracking these counts over time, we can identify trends that warrant investigation and storage planning.
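Actual file-growth events land in the default trace rather than the error log (only unusually slow autogrows are written there), so a complementary query can attribute growth to specific files. A sketch, assuming the default trace is still enabled (it is on by default):

```sql
-- Recent data/log file autogrow events from the default trace
SELECT te.name      AS EventName,
       t.DatabaseName,
       t.FileName,
       t.StartTime
FROM sys.fn_trace_gettable(
         (SELECT path FROM sys.traces WHERE is_default = 1), DEFAULT) AS t
JOIN sys.trace_events AS te
  ON t.EventClass = te.trace_event_id
WHERE te.name IN (N'Data File Auto Grow', N'Log File Auto Grow')
ORDER BY t.StartTime DESC;
```

Frequent autogrows of the same file are a strong signal to pre-size it and revisit its growth increment.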
Debugging Plan Regressions
When queries suddenly run slower in production, one cause can be plan choice changes. We can dig into the error logs to find signals correlating query optimizations with performance regressions.
Let's break down an example searching for implicit conversion warnings that sometimes coincide with suboptimal plans:
-- Pull latest implicit conversion warnings
EXEC sp_readerrorlog 0, 1, N'implicit conversion';
-- Check for related CPU messages near the same timeframe (two search strings max, ANDed)
EXEC sp_readerrorlog 0, 1, N'duration', N'CPU';
This two-pronged log analysis highlights plan decisions that impact performance. By examining the plan-choice rationale in the logs, we can better understand and address the root cause.
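The error log rarely names the offending statement, so it helps to cross-check the plan cache for cached plans that contain an implicit conversion. A sketch using standard DMVs (the TOP count and ordering are arbitrary choices):

```sql
-- Cached plans whose XML contains an implicit conversion, costliest first
SELECT TOP (10)
       qs.total_worker_time / qs.execution_count AS avg_cpu_microseconds,
       qt.text                                   AS query_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)    AS qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE CAST(qp.query_plan AS nvarchar(max)) LIKE N'%CONVERT_IMPLICIT%'
ORDER BY avg_cpu_microseconds DESC;
```

Correlating these statements with the log timestamps narrows down which conversion warning belongs to which regressed query.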
Tracking SQL Agent Job History
The SQL Agent has its own separate log capturing details on job execution status, step results and messages.
We can tap into it by passing 2 as the second parameter (the log type: 1 = SQL Server, 2 = SQL Server Agent) to track job outcomes and runtime metrics:
EXEC sp_readerrorlog 0, 2, N'Job';
This returns the latest job completions with their durations:
LogDate ProcessInfo Text
2022-08-23 00:15:34.123 [JOB136] Job Crashes_DailyCleanup succeeded (Duration: 00:01:32)
2022-08-23 01:15:12.345 [JOB412] Job Rebuild_Indexes succeeded (Duration: 01:11:44)
2022-08-23 01:45:23.123 [JOB392] Job Update_Stats failed (Duration: 00:03:21)
Monitoring job history via the SQL Agent logs provides useful sysadmin context beyond just the SQL Server side.
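For authoritative per-job outcomes and durations, the Agent's history tables in msdb are more reliable than parsing log text. A sketch against the standard system tables:

```sql
-- Recent job outcomes from msdb (step_id = 0 is the overall job result row)
SELECT TOP (20)
    j.name       AS JobName,
    h.run_date,                  -- int, encoded yyyymmdd
    h.run_status,                -- 0 = failed, 1 = succeeded, 3 = canceled
    h.run_duration               -- int, encoded hhmmss
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs       AS j
  ON j.job_id = h.job_id
WHERE h.step_id = 0
ORDER BY h.run_date DESC, h.run_time DESC;
```

Note the integer encodings: run_duration of 11144 means 1 hour, 11 minutes, 44 seconds, so decode before averaging.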
Comparing Third Party Log Reader Options
While sp_readerrorlog is convenient, it does have limitations in filtering and processing capacity, and other procedures fill some of the gaps.
A note of caution first: Adam Machanic's popular sp_WhoIsActive is sometimes mentioned in this space, but it monitors live session activity, not error logs. For log reading specifically, the closer comparison is with the undocumented extended procedure xp_readerrorlog, which sp_readerrorlog wraps:
| Feature | sp_readerrorlog | xp_readerrorlog |
|---|---|---|
| Filter by date range | No | Yes (start/end datetime parameters) |
| Control sort order | No | Yes (N'asc' / N'desc') |
| Two ANDed search strings | Yes | Yes |
| Save results to a table | Yes (via INSERT ... EXEC) | Yes (via INSERT ... EXEC) |
As you can see, while sp_readerrorlog covers the basics, xp_readerrorlog adds date-range filtering and sort control — often worth using side by side for extra flexibility, with the usual caveat that undocumented procedures can change between versions.
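As an illustration, the undocumented xp_readerrorlog accepts extra parameters beyond sp_readerrorlog's four. The conventional parameter order below comes from community documentation rather than Microsoft's, so treat it as an assumption and test on your version:

```sql
-- xp_readerrorlog parameters (by convention; the proc is undocumented):
-- log number, log type, search1, search2, start time, end time, sort order
EXEC master.dbo.xp_readerrorlog
    0, 1, N'Error:', NULL,
    '2022-08-22 00:00', '2022-08-23 00:00', N'desc';
```

The search strings must be passed as Unicode (N'...') literals, and the date range makes "show me yesterday's errors" a one-liner.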
Reporting Key Metrics from Logs
We've covered how sp_readerrorlog can extract detailed events, but what about summarizing high-level metrics?
Using SQL Server DMVs, we can derive unified reports spanning the data returned by sp_readerrorlog and current server state.
For example, a query that blends log-derived counts with live server state:
-- Capture the current log into a temp table first
CREATE TABLE #ErrorLog (LogDate datetime, ProcessInfo nvarchar(50), Text nvarchar(max));
INSERT INTO #ErrorLog EXEC sp_readerrorlog 0, 1;

SELECT
    -- Log analysis derived metrics (deadlock detail appears only with trace flag 1204 or 1222)
    (SELECT COUNT(*) FROM #ErrorLog WHERE Text LIKE '%deadlock%') AS Deadlock_Mentions,
    (SELECT COUNT(*) FROM #ErrorLog WHERE Text LIKE '%Error:%')   AS Error_Lines,
    -- Current server metrics
    (SELECT sqlserver_start_time FROM sys.dm_os_sys_info)         AS Server_Start_Time,
    (SELECT cntr_value FROM sys.dm_os_performance_counters
     WHERE counter_name = 'Page life expectancy'
       AND object_name LIKE '%Buffer Manager%')                   AS Page_Life_Expectancy;
Blend this historical log analysis with real-time DMV snapshot data to provide integrated performance reporting.
Best Practices for Log Retention and Storage
Since the error logs provide so much value, careful planning for their retention and storage is important. A few best practice recommendations:
Increase File Rollover Limits
By default, only the current log plus 6 archives are retained, with no size cap per file. Raise the retained-file count (up to 99) via SSMS or the registry, and on SQL Server 2012+ consider setting a size cap so files roll over at predictable points.
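The retained-file count can be raised from T-SQL as well as through SSMS's "Configure SQL Server Error Logs" dialog. A sketch using the instance-aware registry procedure (requires sysadmin, and the value of 30 is an arbitrary example):

```sql
-- Retain 30 archived error logs instead of the default 6
-- (writes the instance's NumErrorLogs registry value)
EXEC xp_instance_regwrite
    N'HKEY_LOCAL_MACHINE',
    N'Software\Microsoft\MSSQLServer\MSSQLServer',
    N'NumErrorLogs',
    REG_DWORD,
    30;
```

Paired with a nightly sp_cycle_errorlog job, this yields roughly one tidy log file per day for a month of searchable history.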
Centralize Storage for Analysis
Rather than deleting logs after just a few days, centralize longer term storage to enable better analysis with sp_readerrorlog.
Use Enterprise Third Party Log Manager
For business critical systems, implement a heavy duty log management solution like Dell EMC Centera for compliant centralized storage and automation.
Test Disaster Recovery Access
Validate your DR process can smoothly access archived logs for splicing timelines across recovered databases.
Monitor Log Growth Trends
Watch for surges that can indicate issues like runaway error logging (for example, repeated login failures). Plot growth via log analysis queries.
With some planning, you can transform logs from transient local files into a vast searchable telemetry archive at your fingertips.
Conclusion
I hope this guide has shown how indispensable sp_readerrorlog can be for both reactive and proactive database analysis. While the logs provide an invaluable stream of rich data, effectively mining insights would be overwhelming without helper stored procedures like sp_readerrorlog.
Whether your goal is troubleshooting blocked queries, spotting index issues, or forecasting storage needs – unlocking the full power of sp_readerrorlog is a must for any DBA or full-stack engineer.
Let me know if you have any other creative applications of sp_readerrorlog I missed! Always looking to improve my own log telemetry skills to build more robust monitoring and alerting for our SQL Server fleet.


