Backup SQL Server Databases in Parallel with PowerShell

Problem

I need to decrease the time my SQL Server database backups take to run.  I thought about running multiple backups at the same time to see if that would help.  I could have created multiple SQL Server Agent Jobs scheduled to run simultaneously, but I wanted a more dynamic way to handle this, so I created a PowerShell script that runs the processes in parallel.

Solution

With PowerShell you can spawn multiple threads to run tasks simultaneously. By implementing this approach, I was able to cut a process that took over 1.5 hours down to a little over an hour.
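
As a minimal illustration of the idea (this is not the attached script; the database names and backup path are placeholders), PowerShell background jobs can run several backups at once:

```powershell
# Hypothetical sketch: back up several databases in parallel with background jobs
$databases = @("DB1", "DB2", "DB3")   # placeholder database names

$jobs = foreach ($db in $databases) {
    Start-Job -ScriptBlock {
        param($name)
        # each job runs its own sqlcmd process
        sqlcmd -Q "BACKUP DATABASE [$name] TO DISK = N'C:\Backup\$name.bak' WITH INIT" -r0 -b
    } -ArgumentList $db
}

$jobs | Wait-Job | Receive-Job   # block until every backup finishes and show the output
```

The attached script goes further than this sketch by throttling the number of concurrent jobs and chaining tasks per database.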

Please note that this script was only possible thanks to the article “True Multithreading in PowerShell” written by Ryan Witschger.

PowerShell Script to Run Simultaneous Backups

Attached is the full PowerShell script. Several lines are self-explanatory or documented inline, but there are others I want to elaborate on.

Variables

Here are the variables in the script:

Variables for PowerShell Script to Run Simultaneous Backups

Most of these have comments or are self-explanatory, but I want to point out the following:

  • $server – enter the name of the SQL Server instance (for a named instance it should be SERVER\INSTANCE)
  • $database – I am using the master database, because this is where my query executes from
  • $query – I am using sp_databases.  This is a system stored procedure that returns all accessible databases in the instance.
  • $objectExclude – these are the databases I want to skip
  • $objectNameField – this is the column name that has the database name returned from sp_databases.
  • $objectSortExpression – this uses the second column in the result set (0,1,2,etc..) to sort the results
  • $tasks – I will explain this more below.
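
Since the variable list above refers to a screenshot, here is a hedged sketch of what those assignments might look like (the values are examples, not the script's actual defaults):

```powershell
$server               = "MYSERVER\SQL2016"        # SQL Server instance (SERVER\INSTANCE for a named instance)
$database             = "master"                  # database the query executes from
$query                = "EXEC sp_databases"       # returns all accessible databases in the instance
$objectExclude        = @("tempdb", "model")      # databases to skip
$objectNameField      = "DATABASE_NAME"           # column holding the database name in the result set
$objectSortExpression = 1                         # sort by the second column (columns are 0,1,2,...)
$queryTimeout         = 120                       # query timeout in seconds
```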

$tasks

This is a list of actions to be performed on every database returned by sp_databases. The database name is stored in the variable $args0, which is referenced in each task’s scriptToRun parameter.  The tasks I am performing are as follows:

  • backup – this will create a full backup on the local server
  • store – this moves the full backup to an archive location
  • delete – this deletes the local copy of the backup

These are the steps that I do in my environment, but you can configure this based on your needs.

Inside each task we have the following:

  • taskOrder – The first field is the “taskOrder”. Note that values don’t need to be sequential (you can comment out lines when testing), and you can assign the same “taskOrder” to more than one task; those tasks will then run in parallel, for example copying to a local and to a remote repository at the same time.
  • taskName  – The second field is the “taskName”; this is displayed in a progress bar when you run this script in a PowerShell prompt.
  • maxThreads – The third field is the “maxThreads”; it lets you back up multiple databases in parallel, archive multiple files in parallel, or delete multiple files in parallel. This needs to be tested carefully, as running multiple tasks at the same time consumes more resources (CPU, memory, network).
  • scriptToRun – The last field is the “scriptToRun”; keep in mind that the command you enter runs as a separate program, so you can’t use variables defined elsewhere in the script except the database name passed in as $args0.
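
Putting the four fields together, a $tasks definition might look like the following sketch (the layout in the attached script may differ; the archive path is an assumption, and “store” is shown as a copy so that the “delete” step still has a local file to remove):

```powershell
# Hypothetical sketch of the task list: taskOrder, taskName, maxThreads, scriptToRun
# $args0 stays single-quoted here so it is substituted per database at run time
$tasks = @(
    @(1, "backup", 1, 'sqlcmd -Q "BACKUP DATABASE [$args0] TO DISK = N''C:\Backup\${args0}.bak'' WITH INIT" -r0 -b'),
    @(2, "store",  2, 'Copy-Item "C:\Backup\${args0}.bak" "\\archive\sql\${args0}.bak"'),
    @(3, "delete", 2, 'Remove-Item "C:\Backup\${args0}.bak"')
)
```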

Let’s take a closer look at the first task I have setup for the backup.

The values I am using are:

  • taskOrder = 1
  • taskName = backup
  • maxThreads = 1 (note: if I want to run several backups at once, I need to change this to a higher value)
  • scriptToRun = sqlcmd -Q "BACKUP DATABASE [$args0] TO DISK = N'C:\Backup\${args0}.bak' WITH INIT" -r0 -b

For the scriptToRun I am using sqlcmd to run a BACKUP DATABASE command. Here are the dynamic values used.

  • [$args0] = this is the database name that is being passed
  • ${args0} = this is also the database name that is being passed to be used to name the backup file.  You could add more to the name to include a date as well, but I wanted to keep this example simple.
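
If you do want a date in the file name, one simple variation (an assumption for illustration, not part of the original script) is to bake a timestamp into the path:

```powershell
# Hypothetical variation: include a timestamp in the backup file name
$args0      = "MyDb"                                # placeholder for the database name passed in
$stamp      = Get-Date -Format "yyyyMMdd"           # e.g. 20240115
$backupPath = "C:\Backup\${args0}_${stamp}.bak"
```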

The sqlcmd options are:

  • -Q = runs the specified query and exits
  • -r0 = redirects error messages to the screen
  • -b = terminates the batch and returns an error code if an error occurs

Running the Parallel SQL Server Backup PowerShell Script

For my test I had the following:

  • 32 databases with an average size of about 2.5GB
  • 78GB in total was backed up 

I opened the script in the PowerShell ISE. After adjusting the variables to meet my needs, I ran the script and got the output below.

Note that while the ‘backup’ task is running, the ‘store’ task is running for a database that has already been backed up.

Running the Parallel SQL Server Backup PowerShell Script

Here is the comparison with the serial job, a SQL Server Agent Job that backs up one database at a time, then copies the files, then deletes the local copies.  You can see this took 1 hour and 31 minutes.

duration for traditional backups with SQL Server Agent

Here is another SQL Server Agent Job, but this time running the PowerShell script with maxThreads = 4.  This took 1 hour and 9 minutes to run.

Job Duration for Parallel Backup PowerShell Script

Next Steps

  • Download the PowerShell script.
  • Modify the tasks per your needs and experiment with other tasks you may want to run in parallel.
  • This script also allows you to back up remote databases that may not have SQL Server Agent.
  • This script can be used to perform actions on other elements; it is not restricted to databases only.
  • The $query variable can be made more complex to check if the databases have already been archived; in that case, if the job is re-run, it’s not going to waste resources but instead work only on what is missing.
  • You can include other steps like DBCC or shrink the database, or restore the archived copy to verify it.
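
As an example of the archive check mentioned above, the filtering can also be done in PowerShell after the list of databases is fetched; the archive path and file-name convention below are assumptions:

```powershell
# Hypothetical pre-filter: skip databases whose archive copy already exists,
# so a re-run only works on what is missing
$archivePath     = "\\archive\sql"                          # assumed archive location
$objectNameField = "DATABASE_NAME"
$objects = @([pscustomobject]@{ DATABASE_NAME = "MyDb" })   # placeholder for the sp_databases result

$objects = $objects | Where-Object {
    -not (Test-Path (Join-Path $archivePath ("{0}.bak" -f $_.$objectNameField)))
}
```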

4 Comments

  1. Hi Ram, yes $database is “master” to run sp_databases, but you can comment out those lines if you will provide the list manually. In order to know why it is not working, take a look at the following lines and add your own debugging to them, $ObjectNameField is “DATABASE_NAME” as returned from sp_databases, $objects is an array with named columns, and $object is the actual database name:

    #start processing the first task for each database
    for ($i = 0; $i -lt $objects.Length; $i++) {
        $object = $objects[$i].$objectNameField
        $tasks | where {$_[0] -eq $startAtTask} | foreach { CreateThread $object $i ([array]::IndexOf($tasks, $_)) $_[0] ([ref]$Jobs) }
    }

  2. Hi Pablo,
    Thank you for the information.
    The script is able to fetch the databases and assign them to the variable $objects, but the backup command never runs and none of the databases get backed up.

    Further to your article, $database is always master as the query sp_databases needs to be executed under master DB only.

  3. Hi Ram, the database names are assigned in this statement:
    $objects = @((Invoke-Sqlcmd -ServerInstance $server -Database $database -Query $query -querytimeout $queryTimeout) | where {$objectExclude -notcontains $_.$objectNameField} | sort $objectSortExpression )

    You need to modify it and include your own database names there.
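
    A hedged sketch of what a hard-coded replacement might look like (the database names here are hypothetical), keeping the column name the rest of the script expects:

    ```powershell
    # Hard-coded replacement for the Invoke-Sqlcmd call; keep the property name
    # matching $objectNameField ("DATABASE_NAME") so the rest of the script still works
    $objects = @(
        [pscustomobject]@{ DATABASE_NAME = "SalesDb" },
        [pscustomobject]@{ DATABASE_NAME = "HRDb" }
    )
    ```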
