SQL Server Indexing Hacks: Proven Strategies to Supercharge Query Performance

Indexing can make or break query performance. Choosing the right index can drastically speed up queries, while the wrong one can drag things down. However, not all indexes serve the same purpose—each type is optimized for specific workloads and scenarios. In this article, you are going to learn when to use heaps, clustered and non-clustered indexes, covering and filtered indexes, or columnstore indexes to optimize performance. Let’s explore how to make indexing work in your favor!

Faster INSERT Performance for Logging Purpose

If you want to have faster logging performance, you can think of using heap. A heap is a table without clustered index. Though you can create non-clustered indexes. Heap stores data without specifying an order. As data is inserted without enforcing a strict order, the insert operation is usually faster than the equivalent clustered index insertion. If your heap's data need to read and process into a final destination, create a narrow non-clustered index that covers the search predicate used by the query.

Figure-1: Heap

Use heap, if:

  • Your insert operation is frequent with large and unordered data
  • You are doing it for logging/intermediate purpose
  • You can frequently truncate and re-create the table

DON'T Use heap, if:

  • When data is frequently returned in a sorted order
  • When the data is frequently grouped together
  • When ranges of data are frequently queried from the table
  • When there are no non-clustered indexes and the table is large. SQL Server will return all the rows which will make your query slower
  • If the data is frequently updated

Most of Your Queries are SELECTing Some Specific Rows

Let's say you have Orders table where all orders are stored. There is column status whose value is either processed or unprocessed. Your application is picking the unprocessed rows and mark them as processed. At any given point, there will be a few unprocessed orders. Your table may have millions of rows, but your application selects only a few. In this scenario, you may opt for Filtered Index.

A filtered index is a non-clustered index that allows you to apply specific conditions to cover a subset of rows in the table. The B-Tree will contain only those rows which satisfy the filter criteria used while creating the index. It will give performance boosting if you have a value that is used in a predicate very frequently and that value satisfies only a small amount of rows for that table.

Assume in AdventureWorks2022 database, [Sales].[SalesOrderHeader] has unprocessed rows with 0 and after successful processing it becomes 5. If most of the time, your query is selecting with status=0 in where clause. In such scenario, Filtered Index will be helpful.

Figure-2: Filtered Index

CREATE NONCLUSTERED INDEX ix_SalesOrderHeader_status
ON [Sales].[SalesOrderHeader] (status)
WHERE status = 0

Most of Your Queries are SELECTing Rows

If your queries are selecting all rows and some columns putting certain columns in the where clause. The order of columns are same and they don't change often. In this scenario, you can think of Covering Index. A covering index is a non-clustered index where all columns of SELECT statements are either part of the index keys or are specified in the included column clause of the index creation statement. As necessary columns are part of the index, so no extra IO is required additional columns. Thus, it boost up the performance. Assume in your Table1 there columns Col1, Col2, Col3, Col4, Col5 and Col6. In your select statement, you always select Col1, Col2, Col3, Col4, Col5 and keep Col2, Col3, Col4 in your where clause. In this example, you may index on Col2, Col3, Col4 and keep Col1 and Col5 as include clause.

CREATE NONCLUSTERED INDEX ix_Table1_Col1_Col2_Col3_Col4_Col5
ON Table1 (Col2, Col3, Col4)
INCLUDE (Col1, Col5)

Index on Lot of Duplicate Data or Can't Predict About Queries

Let's say your table contains lot of identical data. For example, nationality of the user like Bangladeshi, British, American etc and you need to index on it. In this scenario you need to tell SQL Server to use some kind of compression – either page compression for row-store indexes, or clustered column-store indexes. A column-store index is a column-based storage format to store the index data. It can be created as a clustered or as non-clustered index.

Figure-3: Column-store Index

Column-store index has another use case. If you have a table where you cannot predict what would be the possible queries like group by, order by, and especially if there are lot of running totals, then column-store indexes can help.

How Many Indexes are too Many

Sometimes question arise, how many indexes are too many for a table. As an answer, I love my favorite SQL Server Database Master Brent’s 5 and 5 Rule: aim for around 5 indexes per table, with around 5 columns (or less) on each. This is not a hard and fast rule. It is just a starting point.

  • Sometimes, 5 indexes aren’t enough - as you may need lot of columns to filter or sort. However, additional indexes will require more CPU/memory/space.
  • Sometimes, even just 5 indexes are too many - if you need faster insert/update/delete speed than select, you should drop the indexes to achieve the speed.
  • Sometimes, 5 columns are too many - if you index on VARCHAR(MAX) or hot columns (columns that constantly change) or include columns.
  • Sometimes, 5 columns aren’t enough - sometimes you may need to put more columns in include statement.

Final Words

All imaginary index scenarios covered here, have their own pros and cons. This article can be a starting point for your index design journey. It's better you start with regular clustered and non-clustered indexes. Once you bogged down by a performance issue, carefully analyze and examine your database and then adapt the solutions.

Add comment