Tag Archives: index

Move clustered index to different filegroup in SQL Server

Move clustered index to different filegroup in SQL Server

Disk IO system always play vital role in performance of SQL Server. No matter how powerful processor you have and how much RAM you have, if your disk doesn’t support transaction as per your need, you won’t get benefit of processor and RAM.

Recently, we have attached very fast SAN to our production server to replace RAID-5 array which had four hard disk of 10,000RPM. I wanted to transfer all clustered index to our new SAN system for few of our highly transactional table in the first phase.

I want to share how we can efficiently move index to a different filegroup in different drive.

Let us create one sample database for demonstration and create one table with sample data and index in newly created database.


CREATE DATABASE ExtremeAdvice
 GO

USE ExtremeAdvice
 GO

IF OBJECT_ID('orders', 'U') IS NOT NULL BEGIN
  DROP TABLE orders
 END
 GO

CREATE TABLE orders (OrderID INT IDENTITY, OrderDate DATETIME, Amount MONEY, Refno INT)
 GO

--inserting 100000 fack rows into table
 INSERT INTO orders (OrderDate, Amount, Refno)
 SELECT TOP 100000
  DATEADD(minute, ABS(a.object_id % 50000 ), CAST('2010-02-01' AS DATETIME)),
  ABS(a.object_id % 10),
  CAST(ABS(a.object_id) AS VARCHAR)
 FROM sys.all_objects a
 CROSS JOIN sys.all_objects b
 GO

--creating clustered index
 CREATE CLUSTERED INDEX idx_Orders_refno ON orders(refno)
 GO
 

When you create database, by default there would be one default filegroup which is PRIMARY and default data file comes under PRIMARY filegroup.

Let us see where our index is located.

 SELECT
 obj.name AS TableName
 , obj.type
 , ind.name AS IndexName
 , ind.index_id AS Index_ID
 , files.name AS FileGroupName
 FROM sys.indexes ind
 INNER JOIN sys.filegroups files
 ON ind.data_space_id = files.data_space_id
 INNER JOIN sys.all_objects obj
 ON ind.object_id = obj.object_id
 WHERE ind.data_space_id = files.data_space_id
 AND obj.type = 'U' -- User Created Tables
 GO
 

Here is the screen capture for my database:

1FileGroup

Now, we shall create one more filegroup for our database and shall create one new datafile (.ndf) in new filegroup.

 --add new filegroup
 USE [master]
 GO

ALTER DATABASE [ExtremeAdvice] ADD FILEGROUP [ExtremeAdviceIndex]
 GO

--add new secondary data file to filegroup
 USE [master]
 GO

ALTER DATABASE [ExtremeAdvice]
 ADD FILE (
 NAME = N'ExtremeAdvice_Index',
 FILENAME = N'D:\DATA\ExtremeAdvice_Index.ndf' ,
 SIZE = 6144KB ,
 FILEGROWTH = 1024KB )

TO FILEGROUP [ExtremeAdviceIndex]
 GO
 

Once you are done with new filegroup and data file. We shall CREATE INDEX with DROP_EXISTING = ON and we shall assign new filegroup while re-creating an index. Once you transfer clustered to different filegroup, all data will be transferred to new location with clustered index.

 USE [ExtremeAdvice]
 GO

CREATE CLUSTERED INDEX [idx_Orders_refno] ON [Orders]
 (
 [refno] ASC
 )
 WITH (
 PAD_INDEX  = OFF,
 STATISTICS_NORECOMPUTE  = OFF,
 SORT_IN_TEMPDB = OFF,
 IGNORE_DUP_KEY = OFF,
 DROP_EXISTING = ON,
 ONLINE = ON,
 ALLOW_ROW_LOCKS  = ON,
 ALLOW_PAGE_LOCKS  = ON
 ) ON [ExtremeAdviceIndex]
 GO
 

Once, you are done with creating an index, you can confirm whether your index is moved to new location or not with following query.

 SELECT
 obj.name AS TableName
 , obj.type
 , ind.name AS IndexName
 , ind.index_id AS Index_ID
 , files.name AS FileGroupName
 FROM sys.indexes ind
 INNER JOIN sys.filegroups files
 ON ind.data_space_id = files.data_space_id
 INNER JOIN sys.all_objects obj
 ON ind.object_id = obj.object_id
 WHERE ind.data_space_id = files.data_space_id
 AND obj.type = 'U' -- User Created Tables
 GO
 

Here is the screen capture for my database.

2Filegroup

 

You can see that “idx_Orders_Refno” is  on “ExtremeAdviceIndex” FileGroup.

I have written some more articles about Index, if you want to refer, follow the links given:

  • Some basics about Index in SQL Server (Click Here)
  • Clustered Index in SQL Server (Click Here)
  • Nonclustered Index in SQL Server (Click Here)
  • Included Column Index with non clustered index in SQL Server 2005/2008/Denali (Click Here)
  • Filtered Index in SQL Server 2008/Denali (Click Here)
  • Cannot create index on view Msg 1940, Level 16, State 1, Line 1 (Click Here)
  • Calculate total size of clustered and non-clustered index for database in SQL Server 2008/2005 (Click Here)
  • Full Text Catalog and Full Text Index (Integrated Full Text Search – iFTS in SQL Server 2008 Part 2) (Click Here)
  • Do you know Index Statistics in SQL Server (Click Here)
  • Index Rebuild or Reorganize in SQL Server (Click Here)
  • Index Fillfactor in SQL Server (Click Here)
  • Find missing Index with DMVs in SQL Server 2005/2008/Denali (Click Here)
  • Find unused index in SQL Server 2005/2008/Denali (Click Here)
  • sys.dm_db_index_usage_stats and sys.dm_db_index_operational_stats in SQL Server 2005 (Click Here)

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Index Fill Factor in SQL Server

Today once again I felt to write something about Index, so open up the list of Index articles I have written, find out the topic which are still missing for concept of Index and thought to write about fill factor.
Generally, when you create an Index, your data get stored in B-Tree format where you can find “Root Page”, may be “Intermediate Page” and “LEAF Page”. “Leaf Page” would have your actual data sorted in order of Index key and each “Leaf Page” would be of 8KB.
When you insert/update/delete data to/from table, it tries to insert in the proper data page according to the record you are inserting. For example, you have Index on SSN number you are inserting new row with SSN number, SQL Server tries to insert that record in the appropriate page, suppose your SSN starts with “2” so it will find what is the last page which has SSN number starts with “2”, SQL Server will insert your new row in that page only. If your 8 KB page is full and don’t have room to accommodate new row whose SSN starts with “2”, it will split page and the data in that page will be shared between two pages so now you have two pages which are half full so your row will be accommodated in that page.
If your page would already had space for accommodating new row, you wouldn’t need to wait for extra I/O overhead and wait until page split gets finish and space for your row would become available.
This is the time when FillFactor comes into the picture. Fill Factor decides how much your page would be filled up initially. Suppose you give 10 in FillFactor than your data page will consume only 10% of your 8KB page size and when you exceed this limit of 10%, it keeps 90% page empty and create new page for other records.  Now, when you insert new records, you don’t need to worry about I/O overhead of page split as you would have 90% free space and your record will be accommodate in that space easily. So, if you have lower number of Fillfactor, you can decrease I/O over head generated by Page Split which helps you to write your data faster.
Now, you might be thinking now that why shouldn’t I use low number always like 5% or 10% in Fillfactor? Well, it will decrease your page split but it will increase number of data page in your index so every time you scan your index, you have to read more pages and it is again over head while reading the data and decrease the speed of reading data. Suppose you have 10 records in one table and its size is 8KB, you can fit all 10 records in one page only but if you have fill factor of 50 than those 10 records will be stored in two pages rather than one while creating an index.
Uff, what to do now? if we decrease fillfactor figure, it helps in writing but creates problem in reading and if we increase fillfactor figure may be 100% than helps in reading but creates issues while writing in database?
You have to be reasonable and have to take decision based the load on your database, you have to decide first that you have more writes or more read?
Personally I majority go for figure somewhere between 80 to 90 for fillfactor so that you have some space remain for new records and decrease page split at the same time, we don’t keep too much free space in data pages so number of pages doesn’t get increased. Even, it is not final figure I use for all my databases, if my database is read only or very rare insert/update, I would go even for 100% Fillfactor and if my database is write intensive, I would have even 50% Fillfactor.
BTW, you can define Fill Factor while creating an Indexor rebuilding an Index or you can set default value for fill factor via following T-SQL (by default it is 0 in SQL Server).
–turning on advanced configuration option
Sp_configure ‘show advanced options’,1
GO
RECONFIGURE
GO
–setting up fill factor
sp_configure ‘fill factor’, 90
GO
RECONFIGURE
GO
Reference: Ritesh Shah

http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Understand SET STATISTICS IO as a first step for performance tuning in SQL Server

SET STATISTICS IO” provides us very crucial information about the query we run, like scan count, logical read, physical read etc. but this command is really very ignored. While it’s coming to query optimization, many newbie or less experience person used to talk about DTA, profiler etc. but they JUST IGNORE simple yet powerful command “SET STATISTICS IO” whereas they first need to look at the information provided by “SET STATISTICS IO” so that you can move ahead and dig more details with the information provided by “SET STATISTICS IO”.
Let us see how it practically useful for us.
We are going to create one database, named “SQLHub” and one table, named “Orders” under “SQLHub” database. “Orders” table would have approx 1,00,000 rows dynamically inserted.
–create one database which you can delete after running this example
create database SQLHub
GO
USE SQLHub
GO
–if orders table is already there. you can delete it than create new one with name “Orders”
IF OBJECT_ID(‘orders’, ‘U’) IS NOT NULL BEGIN
      DROP TABLE orders
END
GO
–creating table
CREATE TABLE orders (OrderID INT IDENTITY, OrderDate DATETIME, Amount MONEY, Refno INT)
GO
–inserting 100000 fack rows into table (BTW, thank to Jacob Sebastian, copy in INSERT script from one of his demo to generate big table)
INSERT INTO orders (OrderDate, Amount, Refno)
SELECT TOP 100000
      DATEADD(minute, ABS(a.object_id % 50000 ), CAST(‘2010-02-01’ AS DATETIME)),
      ABS(a.object_id % 10),
      CAST(ABS(a.object_id) AS VARCHAR)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
Now we are ready to see how query works. Let us create one simple “SELECT” T-SQL query with “SET STATICTICS IO”.
SET STATISTICS IO ON
–you might have different refno in your table, so please check it first
SELECT * from orders where refno=23
SET STATISTICS IO OFF
–here is the stats info came via “SET STATISTICS ON” in “Message” tab.
–(1982 row(s) affected)
–Table ‘orders’. Scan count 1, logical reads 409, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
After running this query, you would get its results in result tab and some stats in “Message” tab, look at the screen capture.

You can see “Logical Read 409”. To find out all rows which has refno 23, SQL Server has to go for 409 pages internally. Now let us create an Index on RefNo column and see stats.
–creating clustered index, however it is not meeting the criteria I have
–given in article, but this is ok for example
CREATE CLUSTERED INDEXidx_refno ON orders(refno)
GO
After creating “Clustered Index” on “Orders” table for “refno” column, let us see the same query.
SET STATISTICS IO ON
SELECT * from orders where refno=23
SET STATISTICS IO OFF
–here is the stats info came via “SET STATISTICS ON” in “Message” tab.
–(1982 row(s) affected)
–Table ‘orders’. Scan count 1, logical reads 14, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
GO

See screen capture of this query:

 

You can see now “Logical Read 14” rather than “409”, you see how much difference clustered index have made? And you don’t need to go to profiler to see this difference, even no need for execution plan in this case. However I am not against the use of profiler and execution plan. They both are very intelligent and useful tools but just wanted to clear it up that; we can get important information from simple command like “SET STATISTICS IO” too.
Apart from “Logical Read”, it provides you many more information in “Message” tab which you can see above like “Scan Count”, “Physical Read”, “Read-Ahead reads”, “log logical reads”, “lob physical reads” etc. you can get description of all these from Microsoft’s MSDN.
If you want to refer all other articles related to index, click here.
Reference: Ritesh Shah

http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Find missing Index with DMVs in SQL Server 2005/2008/Denali

In the previous articles I saw how to find unused index so that you can find it and drop it to improve performance of your Insert /Update /Delete statement and claim some disk space which is really useful in production server.
Today I will be exploring the script to find the missing index, after finding that index; you can decide whether to create it or not based on the requirement of your application.
There are many important DMVs (Dynamic Management View) there in SQL Server 2005 and higher version which are keeping information you need to know to find missing index. Following is the list of those DMVs.
sys.dm_db_missing_index_details”:  This DMV returns details about missing index you need to create. For more information on this, please click here.
sys.dm_db_missing_index_group_stats”: This DMV returns the summary of benefit you would have received if you would have the particular index. For more information on this, please click here.
sys.dm_db_missing_index_groups”: This DMV returns information about what missing index are contained in what missing index group handle. For more information on this, please click here.
sys.dm_db_missing_index_columns(Index_Handle)”:  This DMV gives you an idea about what columns are missing in Index, it is based on the “Index_Handle” field of “Sys.dm_db_missing_index_details” DMV. For more information on this, please click here.
Let us run all these four DMVs to see what it has for us:
select * from sys.dm_db_missing_index_details
select * from sys.dm_db_missing_index_group_stats
select * from sys.dm_db_missing_index_groups
–43816 is one of the I have copided from my “Index_Handle” column of
–sys.dm_db_missing_index_details DMV, you might get something else than 43816
select * from sys.dm_db_missing_index_columns(43816)
So these are the DMVs which will be useful in order to find missing index, we are going to use first three of the above DMVs to find our missing index.
Here you go!!!!
SELECT
      avg_total_user_cost *avg_user_impact * (user_seeks +user_scans) AS PossibleImprovement
      ,last_user_seek
      ,last_user_scan
      ,statement AS Object
      ,'CREATE INDEX [IDX_' + CONVERT(VARCHAR,GS.Group_Handle) + '_' + CONVERT(VARCHAR,D.Index_Handle) + '_'
      + REPLACE(REPLACE(REPLACE([statement],']',''),'[',''),'.','') + ']'
      +' ON '
      + [statement]
      + ' ('+ ISNULL (equality_columns,'')
    + CASE WHEN equality_columns IS NOT NULL AND inequality_columns IS NOT NULL THEN ',' ELSE '' END
    + ISNULL (inequality_columns, '')
    + ')'
    + ISNULL (' INCLUDE (' + included_columns + ')', '')
      AS Create_Index_Syntax
FROM
      sys.dm_db_missing_index_groups AS G
INNER JOIN
      sys.dm_db_missing_index_group_stats AS GS
ON
      GS.group_handle = G.index_group_handle
INNER JOIN
      sys.dm_db_missing_index_details AS D
ON
      G.index_handle = D.index_handle
Order By PossibleImprovement DESC
This is just a basic advice from DMVs regarding what indexes are missing and you have to create it, finally it’s up to you based on your requirement whether to create index or not. You have to see the table name and column whether it has any selectivity or not then decide whether to create that or not as more index on table might improve performance of your SELECT but it will harm other DML statements so it is always advisable to use your human skills to decide rather than leave everything on DMVs.
These DMVs could keep information for maximum of 500 indexes.
Enjoy Indexing!!!!
If you want to refer all other articles related to index, click here.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Find unused index in SQL Server 2005/2008/Denali

If you are aware with Index well or if you have read all my previous articles related to Index, you might aware that Index can increase the speed of SELECT statement but can reduce INSERT/UPDATE/DELETE performance so it is better to remove Unused index, it will not only give benefit to INSERT/ UPDATE/ DELETE but it will free up some disk space resources too.
So, after this clarification you understand the requirement of deleting unused Index, right? But how to find which index has never been used? Well I have written one small snippet of TSQL for the same which I am going to share with you.
Note: statistics you are going to see with below given script, would be refreshed and start collecting data again from zero if you restart you server instance or database.  So, first decide your business cycle, let SQL Server collect data and then run the following script to know exact situation otherwise it may happen that some query runs regularly so you can see its stats and few run only once or twice in a month or a quarter and you don’t see its stats and based on that you drop the index which affect the query when it start running at its regular time after a month or quarter. 
Here is the code which I was talking about, earlier:
–following query will show you which index is never used
select
      ind.Index_id,
      obj.Name as TableName,
      ind.Name as IndexName,
      ind.Type_Desc,
      indUsage.user_seeks,
      indUsage.user_scans,
      indUsage.user_lookups,
      indUsage.user_updates,
      indUsage.last_user_seek,
      indUsage.last_user_scan,
      ‘drop index [‘ + ind.name + ‘] ON [‘ + obj.name + ‘]’ as DropCommand
from
      Sys.Indexes as ind JOIN Sys.Objects as obj on ind.object_id=obj.Object_ID
      LEFT JOIN  sys.dm_db_index_usage_statsindUsage
            ON
                  ind.object_id =indUsage.object_id
                  AND ind.Index_id=indUsage.Index_id
where
      ind.type_desc<>‘HEAP’ and obj.type<>‘S’
      AND objectproperty(obj.object_id,‘isusertable’) = 1
      AND (isnull(indUsage.user_seeks,0)=0 AND isnull(indUsage.user_scans,0)=0 and isnull(indUsage.user_lookups,0)=0)
order by obj.name,ind.Name
–following query will show you list of ALL index in database
–along with data how many times it get seek, scan, lookup or update 
select
      ind.Index_id,
      obj.Name as TableName,
      ind.Name as IndexName,
      ind.Type_Desc,
      indUsage.user_seeks,
      indUsage.user_scans,
      indUsage.user_lookups,
      indUsage.user_updates,
      indUsage.last_user_seek,
      indUsage.last_user_scan
from
      Sys.Indexes as ind JOIN Sys.Objects as obj on ind.object_id=obj.Object_ID
      LEFT JOIN  sys.dm_db_index_usage_statsindUsage
            ON
                  ind.object_id =indUsage.object_id
                  AND ind.Index_id=indUsage.Index_id
where
      ind.type_desc<>‘HEAP’ and obj.type<>‘S’
      AND objectproperty(obj.object_id,‘isusertable’) = 1
order by obj.name,ind.Name
Be sure before droping any index, give it a second thought before deleting it. This is usually a good practice if you are doing this on production server.
if you want to refer all other articles related to index, click here.
Reference: Ritesh Shah

http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Index Reorganize and Rebuild in SQL Server

I have already mentioned quite a few times since now in few of my previous articles that Index could be the key to boost up performance of your “SELECT” query but highly fragmented Index could degrade the performance of your query.
When you created an index, sorting are done and in case of clustered index, physical data get stored page wise but after regular Insert/Update/Delete in the same table, fragmentation comes into the picture where your physical data order doesn’t match up with your logical data order exists in data page.  If there is any heavy fragmentation you find for any index, you should try to remove this by using Index reorganize or Index rebuild.
Let us understand what the exact meaning of Reorganize and Rebuild is.
Reorganize Index defrag the fragmented pages at leaf level, in simple language, it arrange data in leaf page level and if it find any empty space in any page, it removes it so free space could be claimed. This action is online action; means while doing this action on live server, it won’t block any objects (like table) for long time and you can query your object (database Table) while this operation in progress. Reorganize Index consume less resource as compared with Rebuild Index but Reorganize Index is ideal for low fragmented Index, let us say if your Index fragmented percentage is between 5% to 40%, you can use Reorganize Index otherwise go for Rebuild Index to gain proper benefit.
BTW, if you index fragmentation is less than 5%, then don’t need to do anything as even after reorganizing or rebuilding, you won’t get any more boost up as less than 5% fragmentation is not really a big deal and that is why, you don’t need to add any overhead on your server by doing reorganizing or rebuilding.
Rebuild Indexdrops current index and recreate Index again, this consumes high resources of the servers but it is worth doing if you Index fragmentation percentage is higher, let’s say more than 40%. While doing Rebuilding, objects get locked so you won’t be able to query (if you have not used WITH (ONLINE=ON) option). 
Based on my personal experience I personally prefer to “Rebuild Index” on off hours or may be in weekend if table is very big because sometime, it may take few hours or a day.
Now, question comes into the picture that how can I decide the fragmentation of the Index? Well it is fairly very simple, you have to query system function “sys.dm_db_index_physical_stats” and you have to pass your database name and table id in this function and it will return with very crucial information about your indexes on the specified table but it will not return Index name but it will return, Index ID and that is why, we have to make one JOIN of this “sys.dm_db_index_physical_stats” function with “sys.Indexes” system catalog.
SELECT
      sysin.name as IndexName
      ,func.avg_fragmentation_in_percent
FROM
      sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID(N’orders’),NULL, NULL, NULL) AS func
JOIN
      sys.indexes AS sysIn
ON
      func.object_id= sysIn.object_id AND func.index_id = sysIn.index_id;
I wanted to look the index details of my table “Orders” so I used my table name but you can replace the table name you want.
Hope you are now clear when to use rebuild and when to use reorganize. It is very simple, if you see values between 5% to 40% in your “avg_fragmentation_in_percent” field of above query, go for reorganize, if you see >40% then go for rebuild and if less than 5%, get back to your chair and have rest, there is nothing to do in this matter.  LOL
Once you decide whether to do Rebuild or Reorganize, you have use very simple query to do this operation on your table. Have a look at below queries.
–I wanted to reorganize my index, named “idx_refno” on orders table
ALTER INDEX idx_refno ONOrders REORGANIZE
GO
–I wanted to reorganize all my indexes which are there for orders table
ALTER INDEX ALL ON Orders REORGANIZE
GO
–I wanted to REBUILD my index, named “idx_refno” on orders table
ALTER INDEX idx_refno ONOrders REBUILD
GO
–I wanted to REBUILD all my indexes which are there for orders table
ALTER INDEX ALL ON Orders REBUILD
GO
BTW, I have created Index “idx_refno” and “orders” table in one of my previous article of Index, if you want to use the same object, have a look at the table and index script at here.
Do drop your comments about this concept!!!

if you want to refer all other articles related to index, click here.

Happy Indexing!!!
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Do you know Index Statistics in SQL Server?

This concept is based on Index so if you want to clear up your basics about different type of index available in SQL Server, do lookup here.
In order to understand the concept of “Index Statistics”, you must be aware with the concept of “Query Selectivity”. 
If you create an Index on any field in your table and execute “SELECT” query on the same table with the field in “where” clause which you have defined as “Index Key Column”, it is not guarantee that the index you have defined, would be used. 
Question might pops up in your mind that whether SQL Server making any mistakes by not using my Index and scanning all rows? 

No, SQL Server is not making any mistake in this case but it is being smart by choosing the proper, right and efficient way to execute your “SELECT” query.

If your conditions lied in “WHERE” clause, “Having” clause and “JOIN” statements returns almost all rows or majority of rows than optimizer wouldn’t waste the time to call Index, find the desired value from Index and return actual row from table. Rather than doing this, SQL Server would directly scan complete table and return desired output, if optimizer thinks that table scan would take less time than Index call. This is the concept of “Query Selectivity”.
After reading above paragraph, one more question might pops up in your mind again that without even executing query, how do SQL Server take decision to go for Index or not? How do SQL Server know that whether most of the rows would be return by query?
Answer to this question lies under “Index Statistics”.  When you create an index, not only data get sorted and stored in leaf level pages of Index with Pointer or actual data in case of clustered index but it creates “Histogram” too. This histogram tells optimizer that how many rows would be returned by given condition in “SELECT” query. SQL Server would take decision based on the answer of Histogram whether to call Index or not.
We have created one clustered index in one of my previous article here. I am going to use same index, named “idx_refno”, to demonstrate “Histogram”.
–we had orders table in SQLHub data and Index named “idx_refno”
–use the same here
DBCC SHOW_STATISTICS (‘Orders’,‘idx_refno’)
After executing above query, it will return three tables as a result of that query. Look at the below screen shot.
 
1st table would return some important information about Index like Index name, last update date of index etc.
2nd table would return information about Index Key column, its density and length of key column.
3rd table would return information about “Histogram”, SQL Server uses this to decide how many rows would be return based on your “SELECT” query.
For example: you can see last row of third table which has “2057058364” in “Range_HI_KEY” column and “900” in “EQ_Rows” column. It means that, if you pass “2057058364” to your “RefNo” column, you will get “900” rows.  Try it out.
–you might have different values in your refno column
so confirm from your histogram table and change it in WHERE clause below
–before executing this query
SELECT * from orders where refno=2057058364
See the screen shot below:
 
So, do you know Index Statistics in SQL Server, now?

if you want to refer all other articles related to index, click here.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile

Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

List of articles about Index in SQLHub.com

Yesterday evening, while reading my recent Index articles, one of the fellow .NET developer asked me, “Sir, You have mentioned few things about size of index like this index is consuming big space and that index consuming low space, how can I see the size of Indexes and confirm what you have written?”
This was really a very genuine question but the answer was fairly simple T-SQL query on “SysIndexes” catalog view which I have already written but he was not aware with so finally I thought to cover up all my Indexes articles in one post so it would become good reference for some handy stuffs.
Here is the list of all articles I have written so far for the indexes:
4.)    Included column in Index
I will keep updating this page whenever I will write down something about Index.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Filtered Index in SQL Server 2008/Denali

Filtered Index is nothing but just a feature of Non clustered index which I shown in previous two articles. It is just a non clustered index with WHERE clause in simple terms.
It is mainly used while you have big tables and you used to select only subset of data from that table. Like you have one big customer table and have one field of “Reference Person” in that table, it has NULL value if customer directly comes to us and has reference person’s name, if customer came from any of the reference. In this case if you want only those customers list that has reference person so that we can distribute some sort of consolation to those reference people.
The main advantage of “Filtered Index” is, it will have lower amount of root pages to store the data as it will consider only those rows which cater the needs of “Where” clause of “Filtered Index”.
Less number of pages means reduced storage size.  Since “Filtered Index” has only those data in root pages which caters the need of “Where” clause, means when you perform any DML operation like Insert, Delete or Update, “Filtered Index” will get effect only if it affects the Index Key which comes under the “Where” clause of Index so low maintenance cost. 
BTW, you can’t create “Filtered Index” on View but it will surely get benefit of the “Filtered Index” created on base table.
Let us check the impact of  “Filtered Index” practically.
–create one database which you can delete after running this example
create database SQLHub
GO
USE SQLHub
GO
–if orders table is already there. you can delete it than create new one with name “Orders”
IF OBJECT_ID(‘SQLHubFilteredIndex1’,‘U’) IS NOT NULL BEGIN
      DROP TABLE SQLHubFilteredIndex1
END
GO
–creating table
CREATE TABLE SQLHubFilteredIndex1 (ID INT IDENTITY Primary Key Clustered, OrderDate DATETIME, Amount MONEY, Refno INT)
GO
–inserting fack rows into table
INSERT INTO SQLHubFilteredIndex1 (OrderDate, Amount, Refno)
SELECT TOP 100
      DATEADD(minute, ABS(a.object_id % 50000 ), CAST(‘2010-02-01’ AS DATETIME)),
      ABS(a.object_id % 10),
      CAST(ABS(a.object_id) AS VARCHAR)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
Union All
SELECT TOP 100000
      NULL,
      ABS(a.object_id % 10),
      CAST(ABS(a.object_id) AS VARCHAR)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b
GO
–run the following query with execution plan together and see the results in execution plan
–you can see execution plan with the following steps
–first select both of the below given query
–Press Ctrl+M
–press F5
SELECT * fromSQLHubFilteredIndex1 where OrderDate is not null
CREATE NONCLUSTERED INDEXidx_SQLHubFilteredIndex1 ONSQLHubFilteredIndex1(OrderDate)
WHERE OrderDate is not null
SELECT * fromSQLHubFilteredIndex1 where OrderDate is not null
GO
–if you wish, you can uncomment below code and delete SQLHub database
—-use master
—-go
—-drop database sqlhub
You can see in above screen shot that the same query ran faster after creating index.

if you want to refer all other articles related to index, click here.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah

Included Column Index with non clustered index in SQL Server 2005/2008/Denali

My previous article was based on “Non clustered Index” and this article focuses on “Included Column” feature of Index which was introduced in SQL Server 2005 and still there in newer version.
As long as architecture of “Included Column” concerns, whatever columns you have defined under “Include” clause under the index, those will be stored on the Leaf pages, it won’t get stored on the Root page or Intermediate page of the index.
Now, let us talk little bit about the benefit we are going to get out of this feature.
The main feature is that, the columns you have under “Include” clause of “Create Index” statement would not affect the size of the index. Index has limitation that you can have it on maximum of 16 column / 900 bytes. So no matter how big columns you are going to use in your “Include”, you will get benefit for sure.
You should keep those columns in “Include” clause which generally comes under “SELECT” clause and not being used much in “WHERE”, “GROUP BY” or “ON” clause of “JOIN”.
We are going to create one table with big column size and will try to create Non Clustered Index on that.
–create dummy table and see whether we are able to create index on that or not.
create table NonClustCheck
(
Col1 char(500),
col2 char(400),
col3 char(10)
)
GO
–if you will try following idex, you will greeted with error message as follows:
–Msg 1944, Level 16, State 1, Line 1
–Index ‘idx_NonClustCheck’ was not created. This index has a key length of at least 910 bytes.
–The maximum permissible key length is 900 bytes.
create nonclustered indexidx_NonClustCheck on NonClustCheck(col1,col2,col3)
go
–you can create following index on the same table
create nonclustered indexidx_NonClustCheck1 on NonClustCheck(col1)
INCLUDE(col2,col3)
go
we finally we have created non clustered index with one key column (Col1) and two included column (Col2 and Col3). Let us check whether this index gets scan or seek or optimizer decides not to use this.
–well there is no data in this table yet,
–even just wanted to see whether Non clustered index is having any effect or not.
–run following query with execution plan and you can see Index Seek
— Ctrl + M and than F5 to run query with execution plan.
select * from NonClustCheck where Col1=‘SQLHub.Com’
GO
–you can see Col2 is not as the Index Key,
–even you can see that our non clustered index is getting SCAN.
select * from NonClustCheck where Col2=‘hello’
GO
Hope this will be helpful to you. Do drop comments; it will encourage me for sure.

if you want to refer all other articles related to index, click here.
Reference: Ritesh Shah
http://www.sqlhub.com
Note: Microsoft Books online is a default reference of all articles but examples and explanations prepared by Ritesh Shah, founder of
http://www.SQLHub.com
Ask me any SQL Server related question at my “ASK Profile
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah