UNION and UNION ALL - Combining Results from Multiple SELECT Queries in SQL
The UNION and UNION ALL operators in SQL are used to combine the results of two or more SELECT queries into a single result set. Both operators allow you to merge the result sets, but they differ in how they handle duplicates.
1. UNION Operator
The UNION operator combines the result sets of two or more SELECT queries and removes any duplicate rows. It ensures that the final result set contains only unique rows.
The basic syntax of the UNION operator is:
SELECT column1, column2, ... FROM table1 UNION SELECT column1, column2, ... FROM table2;
In this syntax:
- table1 and table2 are the tables from which you are selecting data.
- column1, column2 are the columns from each table that you want to include in the result set.
- The UNION operator combines the results of both queries and removes any duplicate rows.
2. Example of UNION
Suppose you have two tables: Customers_2023 and Customers_2024, which store customer data for two different years. You want to get a list of unique customers from both tables. You can use the UNION operator:
SELECT CustomerID, Name FROM Customers_2023 UNION SELECT CustomerID, Name FROM Customers_2024;
This query will return a list of unique customers from both tables, removing any duplicate customers (based on the CustomerID and Name).
3. Example of UNION with Sample Data
Suppose the Customers_2023 table contains:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie
And the Customers_2024 table contains:
CustomerID | Name -------------------- 2 | Bob 4 | David 5 | Eve
The result of the UNION query would be:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie 4 | David 5 | Eve
As shown, the duplicate row for "Bob" (CustomerID 2) is removed, and the final result contains only unique customer records.
4. UNION ALL Operator
The UNION ALL operator also combines the results of two or more SELECT queries, but it does not remove duplicate rows. All rows from each query are included in the result set, even if they are duplicates.
The basic syntax of the UNION ALL operator is:
SELECT column1, column2, ... FROM table1 UNION ALL SELECT column1, column2, ... FROM table2;
In this syntax:
- table1 and table2 are the tables from which you are selecting data.
- column1, column2 are the columns from each table that you want to include in the result set.
- The UNION ALL operator combines the results of both queries but includes all rows, even if they are duplicates.
5. Example of UNION ALL
Using the same Customers_2023 and Customers_2024 tables, if you want to retrieve all customers, including duplicates, you can use the UNION ALL operator:
SELECT CustomerID, Name FROM Customers_2023 UNION ALL SELECT CustomerID, Name FROM Customers_2024;
This query will return all rows from both tables, including any duplicate customers.
6. Example of UNION ALL with Sample Data
With the same data in Customers_2023 and Customers_2024, the result of the UNION ALL query would be:
CustomerID | Name -------------------- 1 | Alice 2 | Bob 3 | Charlie 2 | Bob 4 | David 5 | Eve
As shown, the duplicate row for "Bob" (CustomerID 2) is not removed, and the result includes all rows from both tables.
7. Key Differences Between UNION and UNION ALL
- UNION removes duplicate rows from the result set, whereas UNION ALL includes all rows, even duplicates.
- UNION can be slower than UNION ALL because it performs an additional operation to eliminate duplicates.
- UNION ALL is generally faster than UNION because it does not require the removal of duplicate rows.
8. Performance Considerations
When working with large datasets, using UNION ALL can improve performance because it avoids the overhead of checking for and removing duplicate rows. However, if you need to eliminate duplicates in the result set, you should use UNION, even if it might be slower.
9. Conclusion
The UNION and UNION ALL operators are powerful tools for combining results from multiple SELECT queries. Use UNION when you want to remove duplicates and UNION ALL when you want to keep all results, including duplicates. Understanding when to use each operator can help optimize query performance and ensure that you get the desired results.