Filtered Indexes in SQL Server

Today I want to talk about a specific kind of index in SQL Server: Filtered Indexes, which were introduced back with SQL Server 2008 – so already very long time ago. A Filtered Index is a traditional Non-Clustered Index, which includes a Filter Predicate that restricts the rows that are indexes in that specific index.

Advantages of Filtered Indexes

Let’s start now by looking at the advantages of Filtered Indexes, and where they can be beneficial. Let’s have a look at the column PurchaseOrderNumber of the table Sales.SalesOrderHeader of the AdventureWorks2014 database. As you can see from the following T-SQL query, there are a lot of NULL values in that column. In my case 27659 records out of 31465 records have a NULL value.

SELECT COUNT(*), COUNT(PurchaseOrderNumber) FROM Sales.SalesOrderHeader
GO

In the next step I create a Covering Non-Clustered Index for a specific query that I want to speed up.

-- Create a supporting Non-Clustered Index for the query
CREATE NONCLUSTERED INDEX idxTest 
ON Sales.SalesOrderHeader(PurchaseOrderNumber, SalesPersonID) 
INCLUDE (OrderDate, ShipDate)
GO

-- When we execute the query, SQL Server can use the previous created
-- Non-Clustered Index, but we have indexed a huge amount of NULL values that
-- we don't request from this query...
SELECT
	PurchaseOrderNumber,
	OrderDate,
	ShipDate,
	SalesPersonID
FROM Sales.SalesOrderHeader
WHERE PurchaseOrderNumber LIKE 'PO6%'
AND SalesPersonID IS NOT NULL
GO

When you run the query, you can see in the Execution Plan that the Non-Clustered Index is referenced by the Query Optimizer.

The Filtered Non-Clustered Index was chosen in the Execution Plan

When we check the size of the index through the Dynamic Management Function (DMF) sys.dm_db_index_physical_stats, you can see that the Non-Clustered Index consists of 132 pages: 1 index root page, and 131 pages in the leaf level.

-- The previous created Non-Clustered Index has 132 pages (1 + 131)
SELECT * FROM sys.dm_db_index_physical_stats
(
	DB_ID('AdventureWorks2014'),
	OBJECT_ID('Sales.SalesOrderHeader'),
	NULL,
	NULL,
	'DETAILED'
)
WHERE index_id = (SELECT index_id FROM sys.indexes WHERE name = 'idxTest')
GO

To make now the index structure more efficient, you can also create a Filtered Non-Clustered Index. When you think back to our query, we only request column values with a Non-NULL value in the column PurchaseOrderNumber. Therefore we can just filter the NULL values out from the index structure. So let’s recreate now the Non-Clustered Index as a Filtered Non-Clustered Index.

-- Recreate the Non-Clustered Index as a Filtered Index and filter out the NULL values
CREATE NONCLUSTERED INDEX idxTest 
ON Sales.SalesOrderHeader(PurchaseOrderNumber, SalesPersonID) 
INCLUDE (OrderDate, ShipDate)
WHERE PurchaseOrderNumber IS NOT NULL AND SalesPersonID IS NOT NULL
WITH (DROP_EXISTING = ON)
GO

When we check now again the index size through the DMF sys.dm_db_index_physical_stats, you will see now that the Filtered Non-Clustered Index consists of only 29 pages: 1 index root page, and only 28 pages in the leaf level.

In our case the query performance does’t get better, but you have created a much smaller index, and therefore you introduce less overhead in your database. Your Index Reorganize and Index Rebuild operations will be much faster, because they are dealing with a smaller index structure.

Disadvantages of Filtered Indexes

By now you know the advantages of Filtered Indexes in SQL Server. Let’s talk now about the disadvantages. Yes, there are also disadvantages! There are always some pro’s and con’s about everything. Like in real life!

Let’s work now with parameterized T-SQL statements. In my case I create a simple Stored Procedure that accepts a @PurchaseOrderNumber parameter.

CREATE PROCEDURE RetrieveData
(
	@PurchaseOrderNumber NVARCHAR(25)
)
AS
BEGIN
	SELECT
		PurchaseOrderNumber,
		OrderDate,
		ShipDate,
		SalesPersonID
	FROM Sales.SalesOrderHeader
	WHERE PurchaseOrderNumber = @PurchaseOrderNumber
END
GO

But when you execute now the Stored Procedure and you check the Execution Plan, you will see that the Query Optimizer hasn’t chosen the Filtered Index. You are scanning the whole Clustered Index of the table Sales.SalesOrderHeader.

The Filtered Non-Clustered Index was not chosen in the Execution Plan

Why is the Query Optimizer not referencing the Filtered Non-Clustered Index? Because it’s not safe! Imagine that initially the Filtered Index is referenced in the Execution Plan. That Execution Plan get’s cached in the Plan Cache. When you now execute the Stored Procedure again with a different parameter value, SQL Server can reuse the already cached Execution Plan from the Plan Cache.

But what happens if you specifiy a NULL value as input parameter value? Then the cached plan wouldn’t return you a result, because the Filtered Non-Clustered Index doesn’t know anything about the NULL values. So the query result would be nothing, which is wrong! And therefore the Query Optimizer can’t reference a Filtered Index in a parameterized T-SQL statement!

Summary

Filtered Non-Clustered Indexes are great – as long as you know their limitations. Normally parameterized T-SQL statements are great, because you don’t pollute the Plan Cache with Adhoc queries. But in combination with Filtered Non-Clustered Indexes in SQL Server they are somehow counterproductive. So please keep that in mind when you work on your indexing strategy.

Thanks for your time,

-Klaus

2 thoughts on “Filtered Indexes in SQL Server”

    1. Good pick. But I do get the idea that Klaus has tried to put across. A different example to showcase NCIs, with mostly non-null values in a column, say ‘Default’ and a small set of distinct values would bring out the difference using the stored procedure code used here.

Leave a Comment

Your email address will not be published. Required fields are marked *