Mehedi's Earth

As a DBA, you are often alert to the usual suspects—blocking, deadlocks, or long-running queries. But sometimes, performance issues creep in silently, and one of the quietest yet most impactful culprits is latch contention. Unlike locks, latches are lightweight synchronization primitives that protect internal memory structures. When these get overwhelmed, even the most well-tuned queries can slow to a crawl.

In this guide, you will explore what latch contention is, its type, how it differs from locks, when it occurs, how to detect it, and most importantly—how to tune your system to minimize its impact. Whether you are running a high-throughput OLTP workload or managing a reporting environment, understanding latches could be the missing piece in your performance puzzle.

What is Latch Contention

A latch is a short-term synchronization object that ensures data integrity on objects in SQL Server memory. As defined by Microsoft, latches are lightweight synchronization primitives that are used by the SQL Server engine to guarantee consistency of in-memory structures including; index, data pages, and internal structures, such as non-leaf pages in a B-Tree.

Types of Latches

There are three major types of latch contention. They are:

PAGELATCH_*

These latches are lightweight and non-configurable internal locks used by SQL Server to protect concurrent access to in-memory pages. For example, when the storage engine wants to access a data page in the buffer pool to send to the relational engine, it must first request a latch on that page. The latch is released once the operation has finished, and the duration is usually dependent on available memory. These latches do not involve disk I/O—they are purely in-memory access synchronization. These are buffer latches e.g. Page Free Space (PFS), Global Allocation Map (GAM), Shared Global Allocation Map (SGAM) and Index Allocation Map (IAM) pages.

PAGEIOLATCH_*

These latches are lightweight and non-configurable locks used by internal processes within SQL Server to manage access to the page buffer in memory. When SQL Server has to read pages from the disk into the memory buffer or from the buffer out to disk, it must place latches on the buffer pages while the processes take place. Consistently > 10 ms wait time indicates disk latency or slow I/O subsystem. These latches are IO latch.

LATCH_*

A latch is a lightweight synchronization mechanism used internally by SQL Server engine to protect in-memory structures (The structure is not related to buffers or transactions, but to some other process within SQL Server). Latches ensure thread-safe access to these memory structures by serializing access from multiple threads. On a busy OLTP system, latch wait is normal. You should worry about it unless latch contention and wait times affect CPU and throughput.

Latch Modes

Based on the lock acquisition and access level, latch modes can be:

KP (Keep latch) - A lightweight (in terms of performance) latch which signals to keep the structure alive thus protects from destroy. It is compatible with all latches except for the destroy (DT) latch.
SH (Shared latch) - This latch is required to read a structure like data page. When SH latch is in place, multiple threads can simultaneously read the structure.
UP (Update latch) - This latch does not update the page data itself. It remains in place until pages are successfully written to disk. This is compatible with SH and KP.
EX (Exclusive latch) - It blocks other threads from writing to or reading from the referenced structure, e.g. use modify contents of a page for torn page protection.
DT (Destroy latch) - It must be in place before destroying contents of referenced structure, e.g. a DT latch must be acquired by the lazywriter process to free up a page.

Latch Mode Compatibility

A latch mode may support or deny concurrently other latch mode also.

Latch mode	KP	SH	UP	EX	DT
KP	Y	Y	Y	Y	N
SH	Y	Y	Y	N	N
UP	Y	Y	N	N	N
EX	Y	N	N	N	N
DT	N	N	N	N	N

Reasons Behind Latch Contention

The main reasons behind latch contention are:

High number of CPUs usage - SQL Server may choose parallelism in execution plan e.g. for faster reading. For coordination, one thread may wait for other thread to complete its task.
Poor index design and data access patterns - Depth of B-tree, index design, size, fill factor, and read/write/delete activity impact on the latch contention.
Too often communication with Database - Application's too often communication with database without caching or batch processing may lead to latch contention.
Layout of logical files used by SQL Server databases can introduce the latch contention. For example, Page Free Space (PFS), Global Allocation Map (GAM), Shared Global Allocation Map (SGAM) and Index Allocation Map (IAM) pages have impact on contention.
Disk performance - PAGEIOLATCH waits indicate SQL Server is waiting on the I/O subsystem.

Common Latch Contention Scenario

Last page/trailing page insertion - For INSERT heavy scenario, when many users are doing INSERTs into a table with a clustered index on an increasing key (like IDENTITY), causing hotspot of PAGELATCH_EX/PAGELATCH_SH contention on one index page. This can be explained by SQL Server's steps to accomplish this:
- Navigate through B-tree to locate the correct page to add the new record.
- Use PAGELATCH_EX lock to prohibit others from modifying it, and place PAGELATCH_SH on all the non-leaf pages.
- Record a log entry that the row has been modified.
- Add the row to the page and mark the page as dirty.
- Unlatch all pages.
Small tables with a non-clustered index and random inserts - In such a table with high volume of concurrent select/insert/update/delete can lead to PAGELATCH_EX/PAGELATCH_SH latch contention if number of rows are small (a shallow B-tree with depth of two/three) and row size is small (dense pages).
Contention in PFS/GAM/SGAM pages - PFS (Page Free Space) tracks amount of free space in each DB file. There are one PFS page for every 8088 pages in each database file. Every time, there is an allocation/de-allocation of a page, a PAGELATCH_UP is in placed. In a busy system, there would be significant PAGELATCH_EX/PAGELATCH_SH contention. Similarly, GAM/SGAM contention may be surfaced in tempdb.

Diagnosis of Latch Contention

Use sp_BlitzFirst to identify your top wait type then sort by CPU and READS.
You can also use DBA Dash and Performance tab for your top type.
For PAGELATCH/PAGEIOLATCH contention check my earlier article.

Alternatively, you can also use below scripts.

-- WAITING TASKS ordered by wait_duration_ms
SELECT wt.session_id, wt.wait_type
, er.last_wait_type AS last_wait_type
, wt.wait_duration_ms
, wt.blocking_session_id, wt.blocking_exec_context_id, resource_description
FROM sys.dm_os_waiting_tasks wt
JOIN sys.dm_exec_sessions es ON wt.session_id = es.session_id
JOIN sys.dm_exec_requests er ON wt.session_id = er.session_id
WHERE es.is_user_process = 1
AND wt.wait_type <> 'SLEEP_TASK'
ORDER BY wt.wait_duration_ms desc;

GO;
------------------------------------------------------------

/* Snapshot the current wait stats and store so that this can be compared over a time period 
   Return the statistics between this point in time and the last collection point in time.
   
   **This data is maintained in tempdb so the connection must persist between each execution**
   **alternatively this could be modified to use a persisted table in tempdb.  if that
   is changed code should be included to clean up the table at some point.**
*/
use tempdb
go

declare @current_snap_time datetime;
declare @previous_snap_time datetime;

set @current_snap_time = GETDATE();

if not exists(select name from tempdb.sys.sysobjects where name like '#_wait_stats%')
   create table #_wait_stats
   (
      wait_type varchar(128)
      ,waiting_tasks_count bigint
      ,wait_time_ms bigint
      ,avg_wait_time_ms int
      ,max_wait_time_ms bigint
      ,signal_wait_time_ms bigint
      ,avg_signal_wait_time int
      ,snap_time datetime
   );

insert into #_wait_stats (
         wait_type
         ,waiting_tasks_count
         ,wait_time_ms
         ,max_wait_time_ms
         ,signal_wait_time_ms
         ,snap_time
      )
      select
         wait_type
         ,waiting_tasks_count
         ,wait_time_ms
         ,max_wait_time_ms
         ,signal_wait_time_ms
         ,getdate()
      from sys.dm_os_wait_stats;

--get the previous collection point
select top 1 @previous_snap_time = snap_time from #_wait_stats 
         where snap_time < (select max(snap_time) from #_wait_stats)
         order by snap_time desc;

--get delta in the wait stats  
select top 10
      s.wait_type
      , (e.waiting_tasks_count - s.waiting_tasks_count) as [waiting_tasks_count]
      , (e.wait_time_ms - s.wait_time_ms) as [wait_time_ms]
      , (e.wait_time_ms - s.wait_time_ms)/((e.waiting_tasks_count - s.waiting_tasks_count)) as [avg_wait_time_ms]
      , (e.max_wait_time_ms) as [max_wait_time_ms]
      , (e.signal_wait_time_ms - s.signal_wait_time_ms) as [signal_wait_time_ms]
      , (e.signal_wait_time_ms - s.signal_wait_time_ms)/((e.waiting_tasks_count - s.waiting_tasks_count)) as [avg_signal_time_ms]
      , s.snap_time as [start_time]
      , e.snap_time as [end_time]
      , DATEDIFF(ss, s.snap_time, e.snap_time) as [seconds_in_sample]
   from #_wait_stats e
   inner join (
      select * from #_wait_stats 
         where snap_time = @previous_snap_time 
      ) s on (s.wait_type = e.wait_type)
   where 
      e.snap_time = @current_snap_time 
      and s.snap_time = @previous_snap_time
      and e.wait_time_ms > 0 
      and (e.waiting_tasks_count - s.waiting_tasks_count) > 0 
      and e.wait_type NOT IN ('LAZYWRITER_SLEEP', 'SQLTRACE_BUFFER_FLUSH'
                              , 'SOS_SCHEDULER_YIELD','DBMIRRORING_CMD', 'BROKER_TASK_STOP'
                              , 'CLR_AUTO_EVENT', 'BROKER_RECEIVE_WAITFOR', 'WAITFOR'
                              , 'SLEEP_TASK', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT'
                              , 'FT_IFTS_SCHEDULER_IDLE_WAIT', 'BROKER_TO_FLUSH', 'XE_DISPATCHER_WAIT'
                              , 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')

order by (e.wait_time_ms - s.wait_time_ms) desc ;

--clean up table
delete from #_wait_stats
where snap_time = @previous_snap_time;

GO;
-----------------------------------------------------------------------------
/*
The following script queries buffer descriptors to determine which objects are associated with the longest latch wait times.
*/

IF EXISTS (SELECT * FROM tempdb.sys.objects WHERE [name] like '#WaitResources%') DROP TABLE #WaitResources;
CREATE TABLE #WaitResources (session_id INT, wait_type NVARCHAR(1000), wait_duration_ms INT,
                             resource_description sysname NULL, db_name NVARCHAR(1000), schema_name NVARCHAR(1000),
                             object_name NVARCHAR(1000), index_name NVARCHAR(1000));
GO
declare @WaitDelay varchar(16), @Counter INT, @MaxCount INT, @Counter2 INT
SELECT @Counter = 0, @MaxCount = 600, @WaitDelay = '00:00:00.100'-- 600x.1=60 seconds

SET NOCOUNT ON;
WHILE @Counter < @MaxCount
BEGIN
   INSERT INTO #WaitResources(session_id, wait_type, wait_duration_ms, resource_description)--, db_name, schema_name, object_name, index_name)
   SELECT   wt.session_id,
            wt.wait_type,
            wt.wait_duration_ms,
            wt.resource_description
      FROM sys.dm_os_waiting_tasks wt
      WHERE wt.wait_type LIKE 'PAGELATCH%' AND wt.session_id <> @@SPID
--select * from sys.dm_os_buffer_descriptors
   SET @Counter = @Counter + 1;
   WAITFOR DELAY @WaitDelay;
END;

--select * from #WaitResources;

   update #WaitResources 
      set db_name = DB_NAME(bd.database_id),
         schema_name = s.name,
         object_name = o.name,
         index_name = i.name
            FROM #WaitResources wt
      JOIN sys.dm_os_buffer_descriptors bd
         ON bd.database_id = SUBSTRING(wt.resource_description, 0, CHARINDEX(':', wt.resource_description))
            AND bd.file_id = SUBSTRING(wt.resource_description, CHARINDEX(':', wt.resource_description) + 1, CHARINDEX(':', wt.resource_description, CHARINDEX(':', wt.resource_description) +1 ) - CHARINDEX(':', wt.resource_description) - 1)
            AND bd.page_id = SUBSTRING(wt.resource_description, CHARINDEX(':', wt.resource_description, CHARINDEX(':', wt.resource_description) +1 ) + 1, LEN(wt.resource_description) + 1)
            --AND wt.file_index > 0 AND wt.page_index > 0
      JOIN sys.allocation_units au ON bd.allocation_unit_id = AU.allocation_unit_id
      JOIN sys.partitions p ON au.container_id = p.partition_id
      JOIN sys.indexes i ON p.index_id = i.index_id AND p.object_id = i.object_id
      JOIN sys.objects o ON i.object_id = o.object_id
      JOIN sys.schemas s ON o.schema_id = s.schema_id;

select * from #WaitResources order by wait_duration_ms desc;
GO

/*
--Other views of the same information
SELECT wait_type, db_name, schema_name, object_name, index_name, SUM(wait_duration_ms) [total_wait_duration_ms] FROM #WaitResources
GROUP BY wait_type, db_name, schema_name, object_name, index_name;
SELECT session_id, wait_type, db_name, schema_name, object_name, index_name, SUM(wait_duration_ms) [total_wait_duration_ms] FROM #WaitResources
GROUP BY session_id, wait_type, db_name, schema_name, object_name, index_name;
*/

--SELECT * FROM #WaitResources
--DROP TABLE #WaitResources;

GO;

Resolution of Some Latch Contention

Last page/trailing page insertion - Use non-sequential value like GUID. Alternatively, use partition table. Though both techniques have some drawbacks.
Contention in PFS/GAM/SGAM pages
- Add data files to tempdb. For details check my article.
- Enable SQL Server Trace Flag 1118.
Check Paul Randal's article for diagnosis and resolution of LATCH_* contention.

Final Words

Latch contention may not always raise red flags like blocking or deadlocks, but its impact on performance can be just as severe—especially in high-concurrency environments. By understanding the nature of latches, monitoring wait types like PAGELATCH, PAGEIOLATCH, and LATCH, and applying targeted tuning strategies, DBAs can prevent small synchronization delays from turning into major performance issues. Proactive latch management is not just about resolving contention—it's about keeping SQL Server responsive, efficient, and ready to handle workload growth with confidence.

References

Going Further

If SQL Server is your thing and you enjoy learning real-world tips, tricks, and performance hacks—you are going to love my training sessions too!

Need results fast? I am also available for 1-on-1 consultancy to help you troubleshoot and fix your database performance issues.

Let’s make your SQL Server to take your business Challenge!

For any queries, mail to mamehedi.hasan[at]gmail.com.

Idea | Knowledge | Sharing

DBA's Guide to SQL Server Latch Contention and Resolution

What is Latch Contention

Types of Latches

PAGELATCH_*

PAGEIOLATCH_*

LATCH_*

Latch Modes

Latch Mode Compatibility

Reasons Behind Latch Contention

Common Latch Contention Scenario

Diagnosis of Latch Contention

Resolution of Some Latch Contention

Final Words

References

Going Further

Related posts

Add comment

DBA's Guide to SQL Server Extended Events

Generate Create Table and Data Insert Script

What the SQL Server's Top Wait Types are Whispering to You

DBA's Guide to SQL Server Latch Contention and Resolution

Poor VM Configuration: SQL Server is using Only 4 CPU Cores

DBA's Guide to SQL Server Baseline Creation

DBA's Guide to SQL Server Network Performance Analysis

DBA’s Guide to SQL Server CPU Performance Analysis

DBA's Guide to SQL Server Disk I/O and Performance

DBA’s Guide to SQL Server Memory Management