MERGE - Upsert Operation (Insert or Update) in SQL

The MERGE statement in SQL is used to perform an "upsert" operation, which means it can either insert new records or update existing records in a target table based on certain conditions. The MERGE operation is a powerful feature that allows you to combine the logic of both INSERT and UPDATE into a single statement, simplifying your SQL queries and improving performance when working with data that may or may not already exist.

1. Basic Syntax of the MERGE Statement

The basic syntax of the MERGE statement includes the target table, the source table (or dataset), and the condition on which the merge operation is based. Depending on whether a match is found between the target and source, the MERGE statement can either update existing records or insert new ones.

Syntax:

      MERGE INTO target_table AS target
      USING source_table AS source
      ON target.column = source.column
      WHEN MATCHED THEN
        UPDATE SET target.column1 = source.column1, target.column2 = source.column2
      WHEN NOT MATCHED THEN
        INSERT (column1, column2) VALUES (source.column1, source.column2);

In this syntax:

target_table: The table that you want to update or insert data into.
source_table: The table or dataset providing the new data.
ON: The condition used to match records between the target and source tables.
WHEN MATCHED: Defines the actions to take when a match is found (typically an UPDATE).
WHEN NOT MATCHED: Defines the actions to take when no match is found (typically an INSERT).

2. Example: Basic MERGE Operation

Here is an example of how the MERGE statement works when you want to either update existing data or insert new data into a table. Consider the following example where we have a 'Customers' table, and we want to update their contact information or insert new records if the customer does not exist.

Example: MERGE to Update or Insert Customer Data

      -- Merging data into the 'Customers' table
      MERGE INTO Customers AS target
      USING NewCustomerData AS source
      ON target.CustomerID = source.CustomerID
      WHEN MATCHED THEN
        UPDATE SET target.ContactName = source.ContactName, target.ContactEmail = source.ContactEmail
      WHEN NOT MATCHED THEN
        INSERT (CustomerID, ContactName, ContactEmail)
        VALUES (source.CustomerID, source.ContactName, source.ContactEmail);

In this example, the MERGE statement checks the 'Customers' table against the 'NewCustomerData' table based on the 'CustomerID' column. If a customer with the same 'CustomerID' is found in the target table, the existing contact information is updated. If no match is found, a new customer record is inserted into the 'Customers' table.

3. Example: MERGE with Multiple Actions

The MERGE statement can also perform multiple actions based on different conditions. For instance, you can choose to delete records that no longer meet certain criteria while inserting or updating other records.

Example: MERGE with DELETE

      -- Merging data with deletion of obsolete records
      MERGE INTO Customers AS target
      USING NewCustomerData AS source
      ON target.CustomerID = source.CustomerID
      WHEN MATCHED AND source.ContactEmail IS NULL THEN
        DELETE
      WHEN MATCHED THEN
        UPDATE SET target.ContactName = source.ContactName, target.ContactEmail = source.ContactEmail
      WHEN NOT MATCHED THEN
        INSERT (CustomerID, ContactName, ContactEmail)
        VALUES (source.CustomerID, source.ContactName, source.ContactEmail);

In this example, the MERGE statement deletes records from the 'Customers' table if the corresponding 'ContactEmail' in the 'NewCustomerData' table is NULL. It updates existing records when a match is found and inserts new records when there is no match.

4. Example: MERGE with Complex Conditions

You can also use complex conditions in the WHEN MATCHED and WHEN NOT MATCHED clauses, allowing you to control the behavior of the MERGE statement with more specific logic.

Example: MERGE with Complex Conditions

      -- Merge data based on multiple conditions
      MERGE INTO Orders AS target
      USING NewOrderData AS source
      ON target.OrderID = source.OrderID
      WHEN MATCHED AND source.OrderStatus = 'Cancelled' THEN
        UPDATE SET target.Status = 'Cancelled', target.CancellationDate = GETDATE()
      WHEN MATCHED AND source.OrderStatus = 'Shipped' THEN
        UPDATE SET target.Status = 'Shipped', target.ShippedDate = GETDATE()
      WHEN NOT MATCHED THEN
        INSERT (OrderID, OrderDate, Status)
        VALUES (source.OrderID, source.OrderDate, source.Status);

In this example, the MERGE statement updates the 'Orders' table based on the 'OrderStatus' from the 'NewOrderData' table. If the status is 'Cancelled', it updates the status and cancellation date; if the status is 'Shipped', it updates the status and shipment date. If there is no match, a new order record is inserted.

5. Performance Considerations with MERGE

The MERGE statement is useful for handling complex data updates and inserts in a single operation, which can improve performance by reducing the number of queries. However, it can sometimes be more resource-intensive than performing separate INSERT and UPDATE statements, especially if the matching condition is complex or if the tables involved are very large. It's important to test and optimize your MERGE queries to ensure they are efficient for your use case.

6. Conclusion

The MERGE statement is a powerful tool in SQL for performing "upsert" operations. By combining the logic of both INSERT and UPDATE into one query, MERGE simplifies the process of managing data in scenarios where you need to either insert new records or update existing ones. It is important to use this statement carefully, as its complexity and impact on performance can vary depending on the specific use case and the amount of data being processed.

SQL Tutorials

Database Basics

DDL

DML

SQL Queries

Advance filtering

Functions

Grouping Data

Joins

Subqueries

Views

Indexing

Stored Procedures

Triggers

Concurrency Control

Database Security

Performance Optimization

Database Design

Advanced SQL

Data Analysis

NoSQL and New SQL