I run into the same problem in real systems: I have a set of new or corrected rows, and I need to sync a target table without writing three separate statements and three separate transaction blocks. When that job runs every minute or every hour, you start to care about correctness under concurrency, the number of locks you take, and how easy it is to audit what changed. That’s why I still reach for the SQL Server MERGE statement when the data shape is right. It gives you a single, declarative place to define how rows should be inserted, updated, or deleted based on a match condition.
In this post, I’ll show you how I use MERGE in SQL Server, where it shines, and where you should avoid it. I’ll go beyond syntax: you’ll see practical patterns, edge cases that bite, and performance expectations in real workloads. I’ll also include a fully runnable example with realistic data, plus guidance on modern workflows in 2026—things like change feeds, incremental loads, and AI-assisted reviews. If you’ve ever written a trio of INSERT/UPDATE/DELETE statements and felt unsure about race conditions, you’ll see a clearer path here.
What MERGE actually does (and why I use it)
I think of MERGE as a “reconcile” operator: take a source set and reconcile it with a target table based on a join condition. If the row exists in both, you can update it. If it’s in the source but not the target, you can insert it. If it’s in the target but not the source, you can delete it. You describe that logic once, inside a single transaction, which makes the intent obvious to future readers and reduces the chance of inconsistent logic across statements.
A simple analogy I use with teams is this: imagine you have yesterday’s inventory list and today’s inventory list. You want the warehouse system to reflect today’s list. Instead of telling someone to separately add items, adjust quantities, and remove old items, you hand them both lists and say “make list A look like list B.” That’s MERGE.
The key idea: MERGE requires a source table (or source query) and a target table. The merge condition defines “same row,” typically a primary key or natural key. Then you decide what to do in each case. It’s a powerful statement, but the power also means you need to be careful about non-unique matches, concurrency, and auditing.
The basic syntax I keep in my head
I rarely memorize the full syntax. I just remember the pattern and the clauses:
- MERGE target
- USING source
- ON join condition
- WHEN MATCHED THEN UPDATE
- WHEN NOT MATCHED THEN INSERT
- WHEN NOT MATCHED BY SOURCE THEN DELETE
Here’s a clean template I start from, with a few placeholders that I almost always keep explicit:
MERGE INTO dbo.TargetTable AS t
USING dbo.SourceTable AS s
ON t.BusinessKey = s.BusinessKey
WHEN MATCHED AND (t.HashValue <> s.HashValue)
THEN UPDATE SET
t.ColumnA = s.ColumnA,
t.ColumnB = s.ColumnB,
t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
THEN INSERT (BusinessKey, ColumnA, ColumnB, HashValue, CreatedUtc)
VALUES (s.BusinessKey, s.ColumnA, s.ColumnB, s.HashValue, SYSUTCDATETIME())
WHEN NOT MATCHED BY SOURCE
THEN DELETE;
Notice a few things I do by default:
- I use a hash or version column to avoid unnecessary updates. This reduces log volume and index churn.
- I set timestamps in a consistent, UTC form so I can compare changes later.
- I explicitly use “BY TARGET” or “BY SOURCE” to avoid ambiguity.
This template helps me avoid common mistakes and makes sure I think about each case. If I do not want deletes, I remove that clause entirely instead of leaving it commented out.
A complete runnable example with realistic data
Let’s build a small, runnable example you can paste into a SQL Server session. I’m using a basic product catalog where the source list is a corrected snapshot. The goal is to sync the catalog table to the updated list.
-- Clean slate
IF OBJECT_ID('dbo.ProductList', 'U') IS NOT NULL DROP TABLE dbo.ProductList;
IF OBJECT_ID('dbo.UpdatedList', 'U') IS NOT NULL DROP TABLE dbo.UpdatedList;
-- Target table
CREATE TABLE dbo.ProductList
(
ProductId INT NOT NULL PRIMARY KEY,
ProductName VARCHAR(50) NOT NULL,
Price DECIMAL(10,2) NOT NULL,
LastModifiedUtc DATETIME2(3) NOT NULL DEFAULT SYSUTCDATETIME()
);
-- Source table
CREATE TABLE dbo.UpdatedList
(
ProductId INT NOT NULL PRIMARY KEY,
ProductName VARCHAR(50) NOT NULL,
Price DECIMAL(10,2) NOT NULL
);
-- Seed target
INSERT INTO dbo.ProductList (ProductId, ProductName, Price)
VALUES
(101, 'Coffee', 15.00),
(102, 'Biscuit', 20.00);
-- Seed source snapshot
INSERT INTO dbo.UpdatedList (ProductId, ProductName, Price)
VALUES
(101, 'Coffee', 25.00),
(103, 'Chips', 22.00);
-- Merge
MERGE INTO dbo.ProductList AS t
USING dbo.UpdatedList AS s
ON t.ProductId = s.ProductId
WHEN MATCHED AND (t.ProductName <> s.ProductName OR t.Price <> s.Price)
THEN UPDATE SET
t.ProductName = s.ProductName,
t.Price = s.Price,
t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME())
WHEN NOT MATCHED BY SOURCE
THEN DELETE;
-- Verify
SELECT ProductId, ProductName, Price
FROM dbo.ProductList
ORDER BY ProductId;
Expected output:
- Product 101 is updated to price 25.00
- Product 103 is inserted
- Product 102 is removed
This example is small, but it mirrors the pattern I use in production for daily or hourly syncs. The critical part is the matched predicate; without it, SQL Server would update even unchanged rows, which wastes IO and can cause unnecessary write locks.
Choosing the right source: table, CTE, or staging table
In real systems, your source is rarely a static table. I typically see three patterns:
1) Staging table (best for bulk loads)
You load data into a staging table first, validate it, and then merge. This isolates data quality issues and gives you a consistent input set. I prefer this for ETL or ELT pipelines.
2) CTE or derived query (best for transformations)
When I need to aggregate or clean data on the fly, I’ll MERGE from a CTE. Example: “use the latest price per product from the feed.”
3) Table-valued parameter (best for app-driven sync)
For small sets of changes, I sometimes pass rows from the app via a TVP and then MERGE into the target. It’s easy to keep the operation in a single round trip.
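To make the third pattern concrete, here is a minimal sketch of a TVP-driven sync. The type name dbo.ProductTvp and procedure name dbo.SyncProducts are illustrative assumptions that mirror the product example used throughout this post; they are not part of an existing schema.

```sql
-- Hypothetical table type for app-driven sync; column shape mirrors
-- the ProductList example from earlier in this post.
CREATE TYPE dbo.ProductTvp AS TABLE
(
    ProductId   INT NOT NULL PRIMARY KEY,
    ProductName VARCHAR(50) NOT NULL,
    Price       DECIMAL(10,2) NOT NULL
);
GO

CREATE OR ALTER PROCEDURE dbo.SyncProducts
    @Changes dbo.ProductTvp READONLY  -- rows passed from the app in one round trip
AS
BEGIN
    SET NOCOUNT ON;

    MERGE INTO dbo.ProductList AS t
    USING @Changes AS s
        ON t.ProductId = s.ProductId
    WHEN MATCHED AND (t.ProductName <> s.ProductName OR t.Price <> s.Price)
        THEN UPDATE SET
            t.ProductName = s.ProductName,
            t.Price = s.Price,
            t.LastModifiedUtc = SYSUTCDATETIME()
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
        VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME());
    -- No "NOT MATCHED BY SOURCE" clause: a TVP carries a partial change
    -- list, so rows missing from it must not trigger deletes.
END;
```

Note that the TVP's PRIMARY KEY enforces uniqueness on the app side, which sidesteps the multiple-match error before the MERGE even runs.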
Here’s a quick example using a CTE as the source, which I use for data feeds that might contain duplicates:
WITH LatestFeed AS
(
SELECT
f.ProductId,
f.ProductName,
f.Price,
ROW_NUMBER() OVER (PARTITION BY f.ProductId ORDER BY f.ReceivedUtc DESC) AS rn
FROM dbo.FeedStaging AS f
)
MERGE INTO dbo.ProductList AS t
USING (SELECT ProductId, ProductName, Price FROM LatestFeed WHERE rn = 1) AS s
ON t.ProductId = s.ProductId
WHEN MATCHED AND (t.ProductName <> s.ProductName OR t.Price <> s.Price)
THEN UPDATE SET
t.ProductName = s.ProductName,
t.Price = s.Price,
t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME());
This avoids the “multiple source rows match the same target row” error by deduplicating before the MERGE.
Common mistakes I see (and how I avoid them)
I’ve reviewed a lot of MERGE statements in production. The same mistakes repeat, so I’ll show you how I avoid them.
1) Non-unique source keys
If the source has duplicates for a key, MERGE throws an error because a single target row would match multiple source rows. I fix this by deduplicating in a CTE using ROW_NUMBER and selecting only rn = 1, as shown above. If duplicates are data errors, I log them and stop the pipeline.
2) Updates without a change predicate
Updating unchanged rows creates extra logging and can trigger downstream processes (like CDC or triggers). I add a predicate in the WHEN MATCHED clause. I often compare a hash or a last-updated value to keep it cheap.
3) Unintended deletes
“WHEN NOT MATCHED BY SOURCE THEN DELETE” is powerful. It also can wipe a table if you feed an empty source. When deletes are safe, I keep them; otherwise I prefer to soft-delete or remove the clause entirely. Another safe pattern is to limit deletes by a filter or a date window.
4) Trigger side effects
MERGE can fire triggers in ways that surprise people, especially if you do multiple actions in one statement. If you rely on triggers for auditing or replication, test carefully. In some systems I replace triggers with an OUTPUT clause in the MERGE statement to capture changes.
5) Concurrency and race conditions
A MERGE statement is a single statement, but it still runs under your transaction isolation level. If multiple sessions merge the same target table, you can get deadlocks or unexpected updates. I mitigate this with proper indexing, consistent access paths, and, when needed, explicit hints like HOLDLOCK (used carefully).
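Points 2 and 5 above can be combined in one statement. The sketch below uses HASHBYTES as the cheap change predicate and a HOLDLOCK hint on the target; it reuses the product tables from the runnable example, and as noted above, the hint is something I add only after observing a real race, since it serializes access.

```sql
-- Sketch: hash-based change detection plus a serializing hint.
-- The '|' separator in CONCAT guards against ('ab','c') vs ('a','bc')
-- producing the same hash input. HOLDLOCK holds the key range for the
-- duration of the statement, trading concurrency for safety.
MERGE INTO dbo.ProductList WITH (HOLDLOCK) AS t
USING dbo.UpdatedList AS s
    ON t.ProductId = s.ProductId
WHEN MATCHED AND
     HASHBYTES('SHA2_256', CONCAT(t.ProductName, '|', t.Price))
  <> HASHBYTES('SHA2_256', CONCAT(s.ProductName, '|', s.Price))
    THEN UPDATE SET
        t.ProductName = s.ProductName,
        t.Price = s.Price,
        t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
    THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
    VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME());
```

For wide tables I usually persist the hash in a computed column instead of computing it per statement, which keeps the predicate cheap and sargable.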
When to use MERGE vs separate statements
I don’t use MERGE for everything. Here’s my rule of thumb, based on practical tradeoffs:
Use MERGE when:
- You have a clear source set and want the target to mirror it.
- Inserts, updates, and deletes are all possible outcomes.
- You want a single statement for auditing or atomicity.
- Your join key is stable and unique on both sides.
Avoid MERGE when:
- Your logic is complex with multiple conditional branches that are easier to reason about with separate statements.
- You need fine-grained error handling for each action.
- The source can’t be reliably deduplicated.
- You are working with very large tables and the merge plan is unstable (more on this below).
If you are unsure, I recommend running a quick comparison: implement the same logic with two or three statements, then measure log usage and runtime. On medium data sets (tens of thousands of rows), MERGE is often comparable or better. On large data sets, performance depends heavily on indexing and statistics.
Performance characteristics and tuning tips
Performance depends on your data size, indexes, and the shape of the merge. I’ll share the tuning checklist I follow.
1) Index the join keys
Your ON clause should match an indexed key in both source and target. If the source is a staging table, I create a clustered index or a unique index on the business key before the merge. That alone can drop runtime from minutes to seconds on large loads.
2) Reduce unnecessary updates
As mentioned, a change predicate in WHEN MATCHED can reduce log writes drastically. In practical runs, I see differences like 10–20% less IO in smaller tables, and in heavy write tables it can be a night-and-day difference.
3) Batch large merges
If you’re syncing millions of rows, consider batching the source and merging in chunks. That reduces lock duration and keeps the transaction log manageable. I often batch by key ranges or by a load timestamp.
4) Avoid scalar functions in the join
If you transform the join key, you can force scans. Do the transformation before the merge in a CTE or a computed column.
5) Use OUTPUT for auditing
If you need to capture changes, the OUTPUT clause is a performance-friendly choice compared to triggers. It gives you a record of what changed without extra round trips.
Here’s a simple example that logs inserted and updated rows into an audit table:
CREATE TABLE dbo.ProductAudit
(
AuditId BIGINT IDENTITY(1,1) PRIMARY KEY,
ActionType VARCHAR(10) NOT NULL,
ProductId INT NOT NULL,
OldPrice DECIMAL(10,2) NULL,
NewPrice DECIMAL(10,2) NULL,
AuditUtc DATETIME2(3) NOT NULL
);
MERGE INTO dbo.ProductList AS t
USING dbo.UpdatedList AS s
ON t.ProductId = s.ProductId
WHEN MATCHED AND (t.Price <> s.Price)
THEN UPDATE SET
t.Price = s.Price,
t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME())
OUTPUT
$action AS ActionType,
inserted.ProductId,
deleted.Price AS OldPrice,
inserted.Price AS NewPrice,
SYSUTCDATETIME() AS AuditUtc
INTO dbo.ProductAudit (ActionType, ProductId, OldPrice, NewPrice, AuditUtc);
This gives you a durable, queryable audit trail without triggers. It’s a pattern I’ve used for billing systems and inventory updates.
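Tip 3 from the checklist above, batching, deserves its own sketch. The loop below merges the source in key ranges; @BatchSize and the loop shape are illustrative assumptions you should tune against your own workload.

```sql
-- Sketch: merge in key-range batches to bound lock duration and
-- transaction log growth. Reuses the ProductList/UpdatedList example.
DECLARE @BatchSize INT = 50000;
DECLARE @MinId INT, @MaxId INT, @From INT;

SELECT @MinId = MIN(ProductId), @MaxId = MAX(ProductId)
FROM dbo.UpdatedList;
SET @From = @MinId;

WHILE @From <= @MaxId
BEGIN
    MERGE INTO dbo.ProductList AS t
    USING (SELECT ProductId, ProductName, Price
           FROM dbo.UpdatedList
           WHERE ProductId >= @From
             AND ProductId <  @From + @BatchSize) AS s
        ON t.ProductId = s.ProductId
    WHEN MATCHED AND (t.ProductName <> s.ProductName OR t.Price <> s.Price)
        THEN UPDATE SET
            t.ProductName = s.ProductName,
            t.Price = s.Price,
            t.LastModifiedUtc = SYSUTCDATETIME()
    WHEN NOT MATCHED BY TARGET
        THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
        VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME());

    SET @From = @From + @BatchSize;
END;
-- Deletes must be handled separately when batching: a
-- "NOT MATCHED BY SOURCE" clause would only see the current batch
-- and would delete every target row outside it.
```

The closing comment is the important part: batching and "NOT MATCHED BY SOURCE" do not mix, so run deletes as a separate full-set pass if you need them.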
Edge cases I guard against
Edge cases are where MERGE can surprise you. Here are the ones I actively handle:
Multiple matches due to bad data
If your target table has duplicate keys due to data corruption, MERGE can update multiple rows when you expect one. I put a unique constraint on the business key to prevent this upfront.
Late-arriving data
If your source feed is not a full snapshot but a partial change list, you should not delete missing target rows. I either remove the delete clause or I filter it to only delete rows that meet specific criteria.
Nullable join keys
If your key columns can be NULL, the join might not behave the way you expect. I avoid NULLs in business keys. If I can’t, I normalize them in a CTE (for example, by replacing NULL with a sentinel value).
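A minimal sketch of the sentinel approach, assuming a hypothetical dbo.RegionalPrices target with a nullable Region key; the '~none~' sentinel and all table names here are illustrative.

```sql
-- Sketch: normalize a nullable business key with a sentinel so that
-- NULL-to-NULL rows actually match instead of silently falling through
-- to the NOT MATCHED branches.
WITH NormalizedSource AS
(
    SELECT
        ISNULL(s.Region, '~none~') AS RegionKey,  -- sentinel replaces NULL
        s.ProductName,
        s.Price
    FROM dbo.FeedStaging AS s
)
MERGE INTO dbo.RegionalPrices AS t
USING NormalizedSource AS s
    ON ISNULL(t.Region, '~none~') = s.RegionKey
WHEN MATCHED AND (t.Price <> s.Price)
    THEN UPDATE SET t.Price = s.Price
WHEN NOT MATCHED BY TARGET
    THEN INSERT (Region, ProductName, Price)
    VALUES (NULLIF(s.RegionKey, '~none~'), s.ProductName, s.Price);
```

One caveat, consistent with the earlier advice about scalar functions in the join: wrapping the target column in ISNULL can prevent an index seek, so on large tables I prefer a persisted computed column holding the normalized key, indexed and used directly in the ON clause.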
Temporal tables
When the target is a system-versioned temporal table, MERGE updates and deletes will automatically insert history. That can be great, but it can also grow quickly. I always test the history table growth and adjust retention.
Identity columns
If the target has an identity column, I avoid inserting explicit values unless I really need to. I keep identity as the surrogate key and use a separate business key for the merge.
Traditional vs modern patterns (2026 perspective)
In 2026, I see two broad ways teams synchronize data. The old pattern is multiple statements with manual transaction handling. The modern pattern uses MERGE, or at least a single-statement approach, often paired with change feeds and automated checks. Here’s how I compare them:
Traditional (separate INSERT/UPDATE/DELETE):
- Requires manual transaction blocks
- More verbose, higher risk of mismatch between the three statements
- More opportunities for race conditions between the actions
- Auditing often done via triggers
- Logic spread across statements
Modern (single MERGE statement):
- Atomic by default: one statement, one transaction
- Insert, update, and delete logic declared in one place
- Fewer windows for race conditions between actions
- Auditing via the OUTPUT clause instead of triggers
- Reviewable as a single unit
I still use separate statements when logic is too complex or when I need different locking hints for each action. But for a straight “sync this table to that set of rows,” MERGE is my first choice.
Testing MERGE safely (my checklist)
You should test MERGE with the same rigor as any other data mutation statement. Here’s my checklist:
- Positive tests: matching rows update correctly; new rows insert.
- Negative tests: a missing source row does not update or delete when it shouldn’t.
- Duplicate source rows: ensures deduplication logic works.
- Empty source: verify that deletes do not wipe the target if that’s not intended.
- Concurrency tests: simulate two sessions merging into the same target to see if deadlocks occur.
I often wrap tests in a transaction and roll back to keep the environment clean. For performance tests, I use representative data volumes and watch the execution plan to ensure I get index seeks on the merge keys.
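The wrap-and-roll-back approach looks like this in practice. This sketch reuses the product tables from the runnable example and adds an OUTPUT clause so you can inspect what the MERGE would have done before committing nothing.

```sql
-- Sketch: dry-run a MERGE inside a transaction, inspect the changes
-- via OUTPUT, then roll back so the environment stays clean.
BEGIN TRANSACTION;

MERGE INTO dbo.ProductList AS t
USING dbo.UpdatedList AS s
    ON t.ProductId = s.ProductId
WHEN MATCHED AND (t.Price <> s.Price)
    THEN UPDATE SET
        t.Price = s.Price,
        t.LastModifiedUtc = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET
    THEN INSERT (ProductId, ProductName, Price, LastModifiedUtc)
    VALUES (s.ProductId, s.ProductName, s.Price, SYSUTCDATETIME())
WHEN NOT MATCHED BY SOURCE
    THEN DELETE
OUTPUT
    $action               AS ActionType,
    deleted.ProductId     AS OldProductId,
    inserted.ProductId    AS NewProductId,
    deleted.Price         AS OldPrice,
    inserted.Price        AS NewPrice;  -- result set returned to the client

ROLLBACK TRANSACTION;  -- nothing persists; only the OUTPUT rows remain to read
```

For the concurrency tests on the checklist, I run two copies of the statement (without the rollback) from separate sessions with staggered delays and watch for deadlock errors rather than trying to reason them out on paper.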
Practical guidance on locks and isolation levels
MERGE can be lock-heavy, especially on large tables. Here’s how I approach it:
- Default isolation (READ COMMITTED) is fine for most workloads, but if you need repeatable reads for consistency, consider SNAPSHOT or REPEATABLE READ.
- Lock hints like HOLDLOCK can reduce race conditions by serializing access, but they can reduce concurrency. I only add hints after I observe specific issues.
- Partitioning on large tables can help isolate merge operations to a subset of data, which reduces contention.
When I do run into blocking, I look at the execution plan and check for scans. A merge with a scan on the target table can lock a large range of rows. Adding a proper index usually fixes it.
Real-world scenarios where MERGE pays off
Here are a few use cases where I’ve seen MERGE provide clear value:
Inventory reconciliation
If you receive hourly inventory snapshots, MERGE is perfect for syncing product quantities and prices. I use a staging table to load the snapshot, deduplicate, and then merge.
Customer profile sync
When a CRM system exports customer updates, a MERGE statement can keep your internal profile table current, including updates to contact info and deletions of deactivated accounts.
Configuration synchronization
Feature flag settings or configuration values that change in a central system can be merged into local environments in a predictable, auditable way.
In each case, I favor MERGE because the logic maps exactly to the business requirement: “make this table match that source.”
When NOT to use MERGE (and what I do instead)
I mentioned earlier that MERGE isn’t universal. Here’s where I avoid it:
Complex conditional updates
If your update logic depends on complex branching (for example, multiple tiers of pricing rules), separate UPDATE statements are often clearer and easier to test. I’d rather have a few statements that are obvious than one big MERGE clause that’s hard to parse.
Heavy row-by-row validations
When each row requires validation in the app layer or stored procedure logic, I use a staging table and then batch operations. MERGE is too declarative for row-by-row custom validation.
Extreme scale with unpredictable plans
On massive tables, sometimes the optimizer picks a bad plan for MERGE. In that case, I switch to separate statements with carefully tuned hints. This is not common, but it happens in some data warehouse workloads.
In those cases, I prefer correctness and predictability over brevity.
How I explain MERGE to teammates
When I onboard engineers, I explain MERGE as a “set-based sync.” You provide two sets: the existing state and the desired state. The database figures out the row-level differences according to your rules. I also emphasize that MERGE is not magical: it still needs proper keys, good indexes, and careful thinking about deletes. If you treat it like a black box, you’ll end up with surprises.
I also encourage teams to include MERGE statements in code review checklists. The review should verify that:
- The join condition uses a unique key.
- The source is deduplicated if needed.
- Updates are limited to actual changes.
- Deletes are intentional and safe.
- Auditing is in place if required.
This keeps mistakes out of production.
Practical next steps you can take today
If you want to use MERGE confidently, here’s what I recommend you do next:
1) Identify one table that currently uses separate INSERT/UPDATE/DELETE statements. Rewrite it with MERGE and verify that the result set matches.
2) Add a simple audit table and OUTPUT clause to capture changes. This gives you visibility into what MERGE is actually doing.
3) Add a unique constraint on your business key if it’s not already there. MERGE depends on uniqueness for correctness.
4) Test the MERGE with an empty source dataset to confirm that deletes behave exactly as you expect.
These steps are small, but they help you build confidence and prevent accidental data loss.
I use MERGE because it expresses intent clearly: make the target look like the source. When the data is well-formed and the keys are unique, it’s concise, safe, and easy to review. When you add careful predicates, auditing, and proper indexes, it becomes a powerful tool for reliable synchronization. If you approach it with discipline—deduplicate your sources, avoid unnecessary updates, and be explicit about deletes—you’ll get the best of both worlds: clean code and predictable outcomes.
If you want to go further, consider combining MERGE with modern pipelines: use a staging table fed by a change feed, validate it with automated checks, and run MERGE in a controlled transaction. That pattern is still one of the most practical ways to keep systems consistent in 2026, especially when you need to reconcile data between services, vendors, or legacy systems. I recommend starting small, measuring results, and then expanding the pattern to other tables once you trust it.



