In my previous blog posting I talked about the basics of the INTERSECT set operation in SQL Server, what the differences are compared with an INNER JOIN, and why you need a very good supporting indexing strategy. Today I want to talk about the INTERSECT ALL operator that is not implemented in SQL Server.
INTERSECT ALL is part of the SQL specification, but SQL Server doesn’t care about it. The difference to the INTERSECT operator is very simple: INTERSECT ALL doesn’t eliminate duplicate rows. The nice thing is that you can simulate an INTERSECT ALL in SQL Server. Let’s try that by creating our 2 tables again, and by inserting some sample rows.
-- Create the 1st table CREATE TABLE t1 ( Col1 INT, Col2 INT, Col3 INT ) GO -- Create the 2nd table CREATE TABLE t2 ( Col1 INT, Col2 INT ) GO -- Insert some records into both tables INSERT INTO t1 VALUES (1, 1, 1), (2, 2, 2), (2, 2, 2), (3, 3, 3) INSERT INTO t2 VALUES (2, 2), (2, 2), (3, 3) GO
As you can see, the 2nd table consists of a duplicate record – the record with the values 2 appears twice in the table. When you now perform an INTERSECT between both tables, the records with the values of 2 just appear once in the result set. The duplicate row was just eliminated!
If you want to preserve duplicate rows, you have to make them unique. One approach here is to use the ROW_NUMBER() Windowing Function that was introduced back with SQL Server 2005. With that function you just generate a unique row number for every duplicate record. Therefore your duplicate records become unique, and the “duplicate” row is returned twice as expected. The following listing shows this technique.
-- You can preserve duplicate rows by making them unique with the ROW_NUNBER() Windowing Function. WITH IntersectAll AS ( SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY (SELECT 0)) AS RowNumber, Col1, Col2 FROM t1 INTERSECT SELECT ROW_NUMBER() OVER (PARTITION BY Col1, Col2 ORDER BY (SELECT 0)) AS RowNumber, Col1, Col2 FROM t2 ) SELECT Col1, Col2 FROM IntersectAll GO
Summary
One of the side-effects of the INTERSECT operation in SQL Server is that duplicate rows are eliminated and not returned in the result set. If you want to preserve them you have to make them unique, e.g. by applying a ROW_NUMBER() calculation.
Thanks for your time,
-Klaus
5 thoughts on “INTERSECT ALL in SQL Server”
You can use this trick to get EXCEPT ALL, too.
I think your code contains small bug – row_number must be partitioned by the both col1 and col2 values.
Otherwise you may obtain incorrect output, for example with the next data:
declare @t1 TABLE (
Col1 INT,
Col2 INT,
Col3 INT
)
declare @t2 TABLE (
Col1 INT,
Col2 INT
)
INSERT INTO @t1 VALUES (1, 1, 1), (2, 2, 2), (2, 2, 2), (3, 3, 3)
INSERT INTO @t2 VALUES (2, 2), (2, 1), (2, 2), (3, 3)
;WITH IntersectAll AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY (SELECT 0)) AS RowNumber,
Col1,
Col2
FROM @t1
INTERSECT
SELECT
ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY (SELECT 0)) AS RowNumber,
Col1,
Col2
FROM @t2
)
SELECT Col1, Col2 FROM IntersectAll
Regards,
YP
P.S. Grey foreground is so unreadable :((
Hello Yuri,
Thanks for your comment.
You are so right – the PARTITION BY was wrong!
I have already changed it.
Thanks!!!
-Klaus
Hi,
the link to the previouse blog posting refers to this site again…
best regards
Thomas
Hello Thomas,
Thanks for the info – already fixed it 🙂
Cheers,
-Klaus