We have some tables in an on-premises SQL Server database from which we archive data every day.
Currently, we have a scheduled task that runs a C# program that writes old data into a .CSV file. Each batch that is archived into a single .CSV file is selected based on a date range.
The steps are simply:
Get data to archive by date range:
SELECT * FROM XXXTable WITH(NOLOCK) WHERE created_date >= '2025-11-03' AND created_date < '2025-11-10'
(SELECT * because of possible schema changes.)
Save the data into a .CSV file (we almost never need to look at them, as we also keep full db backups for a fairly big date range.)
Delete the data:
DELETE XXXTable WHERE created_date >= '2025-11-03' AND created_date < '2025-11-10'
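To make the three steps above concrete, here is a minimal sketch of the flow. It is not the actual job, which is a C# program against SQL Server. Python with the standard-library sqlite3 module stands in so that the example is self-contained, and the function name archive_range, its parameters, and the CSV path are hypothetical.

```python
import csv
import sqlite3


def archive_range(conn, table, start, end, csv_path):
    """Export rows with start <= created_date < end to csv_path, then delete them.

    Assumes created_date is stored as ISO-8601 text (SQLite stand-in, not SQL Server).
    """
    # Table names cannot be bound as parameters, so `table` must come from trusted config.
    where = "WHERE created_date >= ? AND created_date < ?"
    cur = conn.execute(f"SELECT * FROM {table} {where}", (start, end))
    # Take the header from the cursor so schema changes are picked up (the reason for SELECT *).
    headers = [col[0] for col in cur.description]
    exported = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        for row in cur:
            writer.writerow(row)
            exported += 1
    # Delete the same half-open range only after the file is written and closed.
    deleted = conn.execute(f"DELETE FROM {table} {where}", (start, end)).rowcount
    conn.commit()
    return exported, deleted
```

Returning both counts lets the caller check that the number of exported rows matches the number of deleted rows before trusting the archive file.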
Now, the tables being locked while these tasks run was causing problems, and I "fixed" the issue by adding indexes on the relevant created_date columns, or sometimes on status+created_date columns. In a few cases, I needed to make the date range smaller (from monthly to weekly or even daily).
For some tables, we even need to inner join their parent table to get the created_date because it isn't stored on these tables.
I wonder if there is a better way (in terms of minimizing the time the tables are locked) of doing this whole archiving process? I also feel like there must be a more space-efficient way to do this than to store rows straight into .CSV files. We are on SQL Server 2017 Standard Edition (so partitioned tables with the cool SWITCH syntax are not available as far as I know, even if we were able to implement them).
PS: I'm also looking into an asynchronous Availability Group so that we only need to delete the rows in the scheduled tasks. However, I'm not sure how hard this will be and what problems it will cause.