Find Pending IO with sys.dm_io_pending_io_requests and sys.Dm_io_virtual_file_stats in SQL Server

Find Pending IO with sys.dm_io_pending_io_requests and sys.Dm_io_virtual_file_stats in SQL Server:

I had one article previously which was providing disk pressure and waits with help of sys.Dm_io_virtual_file_stats DMV. Click here to read that article.

Today I am showing one more use of sys.Dm_io_virtual_file_stats DMV along with sys.dm_io_pending_io_requests to find pending IO operation in your database. Curiosity is one of the good qualities of every DBA. They should have knowledge of exactly what is going on the SQL Server, what is the status of IO, how many IO operations are waiting for its turn and much more.

You can find one row for each pending IO operation in sys.dm_io_pending_io_requests. In order to get total pending IO operation for database, we have to SUM all the IO operation for each database file we have, following query does the same.

[sourcecode language=”sql”]SELECT

DB_Name(VFS.Database_id) AS DataBaseName

,DF.Physical_Name AS FileName

,SUM(IPR.IO_Pending) AS TotalPendingIO

FROM

sys.dm_io_pending_io_requests IPR

INNER JOIN

sys.Dm_io_virtual_file_stats(DB_ID(), NULL) VFS

ON

IPR.IO_Handle=VFS.File_Handle

INNER JOIN

sys.database_files df

ON (df.FILE_ID = vfs.FILE_ID)

GROUP BY

VFS.Database_id,DF.Physical_Name[/sourcecode]

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Quick Learn

Note: Microsoft Books online is a default reference of all articles .

Analyze IO disk pressure in SQL Server

Analyze IO disk pressure in SQL Server:

Before we move further, have a look at my following article as we are going to use the script given in that article:

Find Disk performance with sys.Dm_io_virtual_file_stats in SQL Server

I have already provided the script to display Disk IO waits in above given article, now we will further enhance that process.

By looking at IO Stall given in above article, can’t help us much. We should have data captured for few hours or may be a few days so that we can have details like which time/day we get high number of stall, we can find pattern of IO based on the statistics we capture for few hours/days. Now question comes into the picture, how can we do it with help of script given in “Find Disk performance with sys.Dm_io_virtual_file_stats in SQL Server” article?

Well we have to enhance the script give there. Let us start our journey to capture pattern of IO.

Step 1:

We have to create one table in database which keeps all the captured data for our IO.

[sourcecode language=”sql”]CREATE TABLE [dbo].[IOStats](
[Seq] [int] IDENTITY(1,1) NOT NULL,
[ServerName] [nvarchar](256) NULL,
[DatabaseName] VARCHAR(100),
[FileName] VARCHAR(MAX),
[SampleMs] [bigint] NULL,
[NumOfreads] [bigint] NULL,
[NumOfBytesRead] [bigint] NULL,
[IoStallReadMs] [bigint] NULL,
[NumOfWrites] [bigint] NULL,
[NumOfBytesWritten] [bigint] NULL,
[IoStallWriteMs] [bigint] NULL,
[IoStall] [bigint] NULL,
[SizeOnDiskBytes] [bigint] NULL,
[CreatedDate] [datetime] NULL
)
GO
ALTER TABLE [dbo].[IOStats] ADD  DEFAULT (getdate()) FOR [CreatedDate]
GO [/sourcecode]

Step 2:

Create one Stored Procedure which will insert data into “IOStats” table we have created above.

[sourcecode language=”sql”]
CREATE PROC SPInsertIOStat
AS
INSERT INTO [dbo].[IOStats]
(
ServerName
,DatabaseName
,FileName
,SampleMS
,NumOfReads
,NumOfBytesRead
,IoStallReadMs
,NumOfWrites
,NumOfBytesWritten
,IoStallWriteMs
,IoStall
)
SELECT
@@ServerName AS ServerName
,DB_Name(VFS.Database_id) AS DataBaseName
,DF.Physical_Name AS FileName
,VFS.Sample_Ms
,VFS.Num_Of_Reads
,VFS.Num_Of_Bytes_Read
,VFS.IO_Stall_Read_ms
,VFS.Num_Of_Writes
,VFS.Num_Of_Bytes_Written
,VFS.IO_Stall_Write_ms
,VFS.IO_Stall
FROM
sys.Dm_io_virtual_file_stats(NULL, NULL) VFS
INNER JOIN
sys.master_files DF
ON
(DF.FILE_ID = VFS.FILE_ID) AND (DF.DataBase_ID = VFS.DataBase_ID)
GO[/sourcecode]

Step 3:

Keep calling Stored Procedure “SPInsertIOStat” at certain time period. I generally prefer to call this SP at every 30 seconds, if you are sure about your business peak timing, you can call this SP in that time period only otherwise keep it for complete 24 hours cycle. You can use SSIS package and schedule that package in Windows Schedule task or you can call this SP directly from SQL Server Job Agent.

Step 4:

Since we are calling this SP at every 30 seconds interval, we have to aggregate Read/Write/Stall etc. for every minute and store the aggregated data in another table. Do aggregate data every morning for previous day so that we can generate different kind of report from this aggregation itself.

[sourcecode language=”sql”]CREATE TABLE [dbo].[IOStatsAggr](
[Seq] [int] IDENTITY(1,1) NOT NULL,
[ServerName] [nvarchar](256) NULL,
[DatabaseName] VARCHAR(100),
[FileName] VARCHAR(MAX),
[SampleMs] [numeric](18, 3) NULL,
[NumOfreads] [numeric](18, 3) NULL,
[NumOfBytesRead] [numeric](18, 3) NULL,
[IoStallReadMs] [numeric](18, 3) NULL,
[NumOfWrites] [numeric](18, 3) NULL,
[NumOfBytesWritten] [numeric](18, 3) NULL,
[IoStallWriteMs] [numeric](18, 3) NULL,
[IoStall] [numeric](18, 3) NULL,
[SizeOnDiskBytes] [numeric](18, 3) NULL,
[CreatedDate] [datetime] NULL,
[IOReads] [numeric](18, 3) NULL,
[IOWrites] [numeric](18, 3) NULL,
[IOBytesRead] [numeric](18, 3) NULL,
[IOBytesWrite] [numeric](18, 3) NULL,
[WaitStallReadMs] [numeric](18, 3) NULL,
[WaitStallWriteMs] [numeric](18, 3) NULL
)
GO

CREATE PROCEDURE [dbo].[PopulateIOStatsAggrData]
(
–generally I used to aggregate data of previous day always
–so kept logic of day. if you pass 1, it will aggregate data of previous day only
@Day int,
–I am interested to capture business hours data
–in our environment, we get maximum traffic in application From 8AM to 6PM, so I kept
–8 & 18. you can even aggregate data from 1 to 24, as per your requirement
@StartHour int, –8
@EndHour int –18
)
AS
begin
SET ARITHABORT OFF
SET ANSI_WARNINGS OFF
SET NOCOUNT ON

declare @StartTime datetime
, @EndTime datetime

SET @StartTime =  DATEADD(hh,@StartHour,DATEDIFF(DD,0,GETDATE()- @Day))

SET @EndTime =  DATEADD(hh,@EndHour,DATEDIFF(DD,0,GETDATE()- @Day))

;With cteIO
AS
(
select ROW_NUMBER() OVER (PARTITION BY ServerName,DatabaseName,FileName ORDER BY CreatedDate) as Rn, *
FROM IOStats IS1 (nolock)
WHERE (CreatedDate between @StartTime and @EndTime)
)
INSERT INTO [dbo].[IOStatsAggr]
(
ServerName,
DatabaseName,
FileName,
SampleMs,
NumOfreads,
NumOfBytesRead,
IoStallReadMs,
NumOfWrites,
NumOfBytesWritten,
IoStallWriteMs,
IoStall,
SizeOnDiskBytes,
CreatedDate,
–following are the aggregated fields
–I am more interested in that only
IOReads,
IOWrites,
IOBytesRead,
IOBytesWrite,
WaitStallReadMs,
WaitStallWriteMs
)
select
C1.ServerName,
C1.DatabaseName,
c1.FileName,
C1.SampleMs,
cast(c1.NumOfReads as numeric(18,3)) as NumOfReads,
cast(c1.NumOfBytesRead as numeric(18,3)) as NumOfBytesRead,
cast(c1.IoStallReadMs as numeric(18,3)) as IoStallReadMs,
cast(c1.NumOfWrites as numeric(18,3)) as NumOfWrites,
cast(c1.NumOfBytesWritten as numeric(18,3)) as NumOfBytesWritten,
cast(c1.IoStallWriteMs as numeric(18,3)) as IoStallWriteMs,
cast(c1.IoStall as numeric(18,3)) as IoStall,
cast(c1.SampleMs as numeric(18,3)) as SampleMs,
C1.CreatedDate,
cast((c1.NumOfReads – c2.NumOfReads)/((c1.SampleMs – c2.SampleMs)/1000.00) as numeric(18,3)) as IOReads,
cast((c1.NumOfWrites – c2.NumOfWrites) /((c1.SampleMs – c2.SampleMs)/1000.00) as numeric(18,3)) as IOWrites,
cast((c1.NumOfBytesRead – c2.NumOfBytesRead) /(c1.SampleMs – c2.SampleMs) as numeric(18,3)) as IOBytesRead,
cast((c1.NumOfBytesWritten – c2.NumOfBytesWritten) /(c1.SampleMs – c2.SampleMs) as numeric(18,3)) as IOBytesWrite,
cast((c1.IoStallReadMs – c2.IoStallReadMs)/(c1.NumOfReads – c2.NumOfReads) as numeric(18,3)) as WaitStallReadMs,
cast((c1.IoStallWriteMs – c2.IoStallWriteMs)/(c1.NumOfWrites – c2.NumOfWrites) as numeric(18,3)) as WaitStallWriteMs
from
cteIO C2
inner join cteIO C1
ON (C1.ServerName = C2.ServerName  AND C1.DatabaseName =C2.DatabaseName AND C1.FileName=C2.FileName AND C1.rn-1=c2.Rn )
end
GO

——Uncomment following SP call
——after you have atleast 24hours of data
——once you aggregate data for 24 hours,
——you can delete data for that time period from
—— IOStats table
–EXEC [dbo].[PopulateIOStatsAggrData]
–@Day = 1,  –1 is for previous day
–@StartHour = 4, –start capturing of data from 4 AM previous day
–@EndHour = 23 –capture data of previous day till 11
[/sourcecode]

Step 5:

Keep collecting and aggregating data for a week or so to capture the IO pattern for your database/server. You can return the result of all these capture by following Stored Procedure.

[sourcecode language=”sql”]
CREATE procedure [dbo].[SPGetIOState]
–it depends of the report you require,
–if you pass day, it will return the data between From/To Date
–by getting MAX day wise
@Unit varchar(10), –@Unit = ‘day’, — ‘Minute,Hour,Day,Month,Year’
@FromDate datetime,
@ToDate datetime
AS
begin
DECLARE
@FileGroups int,
@IOType int,
@SQL nvarchar(4000)

SET @SQL = ‘
SELECT
DATEADD(DD,0,DATEDIFF(DD,0,CreatedDate)) as CreatedDate,
DATEPART(‘+@Unit+’,CreatedDate) as Unit,
CEILING(MAX(ioReads)) as ioReadsPerSecond,
CEILING(MAX(IOWrites)) as IOWritesPerSecond ,
CEILING((MAX(IOBytesRead*1000))/1024) as IO_KBReadPerSecond ,
CEILING((MAX(IOBytesWrite*1000))/1024) as IO_KBWritePerSecond ,
CEILING(MAX(WaitStallReadMs)/ 1000) as WaitStallReadPerSecond ,
CEILING(MAX(WaitStallWriteMs)/ 1000) as WaitStallWriteMsPerSecond
FROM IOStatsAggr as I with (Nolock)
where CreatedDate between ”’+cast(@FromDate as varchar)+”’ and ”’+cast(@ToDate as varchar)+”’
group by DATEADD(DD,0,DATEDIFF(DD,0,CreatedDate)),datepart(‘+@Unit+’,CreatedDate)
order by Unit

–print @SQL
EXEC sp_executesql @SQL
End
GO

—-uncomment following SP to see the aggregated result
—-ofcourse, you have to change unit as per your need.
—-you can use Minute,Hour,Day,Month,Year etc.
—-and change the from/to date too
–EXEC [dbo].[SPGetIOState]
–@Unit=’HOUR’,
–@FromDate=’2012-10-12 06:00:00′,
–@ToDate=’2012-10-12 23:00:00′

[/sourcecode]

So, this is how you have to capture your IO pattern for few days and have to find whether you really having IO pressure or not. There are few different ways to capture IO pressure on disk but whenever there is a question of IO pressure on the server where I have only SQL Server installed, this is my favorite way. We will talk about different way (from windows OS) of capturing disk IO information sometime later, this is all for now…..

Have a great week ahead!!!

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Quick Learn

Note: Microsoft Books online is a default reference of all articles .

Find Disk performance with sys.Dm_io_virtual_file_stats in SQL Server

Find Disk performance with sys.Dm_io_virtual_file_stats

“sys.Dm_io_virtual_file_stats” is one of the excellent DMV in SQL Server. Generally people used talk about processor, memory and network a lot but always overlook disk capacity. You can’t read/write more than the capacity of your disk, no matter how many processor/core you have, how much RAM you assign to SQL Server or how fast your network is.

Whenever I have a situation to deal with performance issue of SQL Server, I never forget to use “sys.Dm_io_virtual_file_stats” DMV. I have very small script but it can save so many hours which you suppose to spend in finding a bottleneck of performance in SQL Server whenever disk is a culprit.

Each disk has capacity to read and write, if your workload is generating more read/write then the capacity of your disk, you can’t achieve optimal performance of other hardware you have.

Here is the script to measure disk IO capacity, have a look at it:

[sourcecode language=”sql”]SELECT
DB_Name(VFS.Database_id) AS DataBaseName
,DF.Physical_Name AS FileName
,VFS.Sample_Ms
,VFS.Num_Of_Reads
,VFS.Num_Of_Bytes_Read
,VFS.IO_Stall_Read_ms
,VFS.Num_Of_Writes
,VFS.Num_Of_Bytes_Written
,VFS.IO_Stall_Write_ms
,VFS.IO_Stall
,GETDATE() AS CurrentDate
FROM sys.Dm_io_virtual_file_stats(NULL, NULL) VFS
INNER JOIN sys.master_files DF ON (DF.FILE_ID = VFS.FILE_ID) AND
(DF.DataBase_ID = VFS.DataBase_ID)
[/sourcecode]

Let us look what information comes from each column written by above given script.

DababaseName: This column obviously, as name explains, gives the name of database from the SQL Server instance where you have ran the above query in.

FileName: Again this column obviously, as name explains, gives the name of data/log file.

Sample_ms: This column shows the number of milliseconds that have passed since the last server restarts. This DMV reset all column values when server restarts.

Num_Of_Reads: This column shows number of physical reads performs after the last server/services restarts.

Num_Of_Bytes_Read: This column shows the number of total bytes read as against total number of read recorded under Num_Of_Reads column.

IO_Stall_Read_ms: This column shows total number of milliseconds which has been spent in wait. For example you have received 30 reads in one millisecond in your data/log file but your disk is able to process only 5 reads per milliseconds then you will get as much increase in the number of this column as time taken to complete those 25 reads.

Num_Of_Writes: This column shows number of write performs after the last server/services restarts.

Num_Of_Bytes_Written: This column shows the number of total bytes write as against total number of write recorded under Num_Of_Write column.

IO_Stall_Write_ms: This column shows total number of milliseconds which has been spent in wait. For example you have received 30 write in one millisecond in your data/log file but your disk is able to process only 5 writes per milliseconds then you will get as much increase in the number of this column as time taken to complete those 25 writes.

IO_Stall: This column shows the sum of IO_Stall_Read_ms and IO_Stall_Write_ms.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Quick Learn

Note: Microsoft Books online is a default reference of all articles .