Comparing UNION and UNION ALL operator in SQL Server

Comparing UNION and UNION ALL operator in SQL Server

It happens many-time that we need a single result set by the combination of two or more result set in SQL Server. UNION operator comes as a handy tool in this situation.

I have often seen that people tend to use UNION though there UNION ALL can work in that situation. Do you think twice before making a choice between UNION and UNION ALL? Or use UNION always, just to save some (three “ALL”) key stroke???

UNION and UNION ALL has different purpose and that is why those have to be used wisely because it affect performance of the overall query.

Before we go further, let us see what UNION and UNION ALL can do for us.

UNION: This operator can combine two or more resultset into one single resultset and remove duplicate records to show distinct result.

UNION ALL: This operator can combine two or more resultset into one single resultset but don’t remove duplicate records to show distinct result and hence if there is any duplication in result sets, you will get more number rows.

There are some rules to follow in order to use either of these operator.

1.) COMPUTE and ORDER BY clause can be applied for over all resultset, not for individual SELECT statement part of UNION operator

2.) GROUP BY and HAVING clause can work with individual SELECT statement, not with overall resultset

3.) Each SELECT statement participated in UNION (ALL) operator suppose to have same number of column with compatible data types.

4.) Column name or alias given in the first SELECT statement would be used as a column heading in final resultset.

Well, after having an idea of UNION and UNION ALL, let us now look at the TSQL script to know how we can practically use it.

Create two sample tables with following script:

[sourcecode language=”sql”]
USE AdventureWorks2012
GO

CREATE TABLE UnionTest1
(
ID INT IDENTITY(1,1)
,Name VARCHAR(25)
)
GO

CREATE TABLE UnionTest2
(
ID INT IDENTITY(1,1)
,Name VARCHAR(25)
)
GO

INSERT INTO UnionTest1 VALUES
(‘Ritesh Shah’),
(‘Rajan Shah’)
GO

INSERT INTO UnionTest2 VALUES
(‘Ritesh Shah’),
(‘Bihag Thakar’),
(‘Bhushan Shah’)
GO
[/sourcecode]

Now execute following two SELECT query (keep your execution plan on query) out of which first query is using UNION and second query is using UNION ALL. Please note that we have total 4 unique name in both table.

[sourcecode language=”sql”]
USE AdventureWorks2012
GO

SELECT Name FROM UnionTest1
UNION
SELECT Name FROM UnionTest2
GO

SELECT Name FROM UnionTest1
UNION ALL
SELECT Name FROM UnionTest2
GO
[/sourcecode]

Here is the resulset I have received by executing above query:

Union ResultSet

UNION did the DISTINCT internally and hence return only 4 unique name in first resultset whereas UNION ALL return all 5 names.

Now, let us have a look at execution plan to find query cost.

UnionExecutionPlan

If you look at the query cost, UNION has taken 73% cost of total batch execution whereas UNION ALL has taken remaining 27% only.

Thus it is proved that UNION is adding overhead and taking more time to execute. UNION should be used as a last resort. If database is properly normalized, you don’t need UNION. If there is a situation that your result set returning duplicate record, you should find a way to eliminate it, if possible, rather then taking a easy solution of using UNION.

If you like this article, do like “Extreme-Advicepage in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Logical Query Processing in SQL Server

processingLogical Query Processing in SQL Server

I am active online community member and try to help more and more people via my blog and some other forums like Experts-Exchange.com (EE), Asp.net, MSDN etc. in my leisure time. I have recently seen one question in EE where one guy was asking that column alias is not working in his WHERE clause. Obviously, column alias of SELECT clause won’t work in WHERE clause as well as in JOIN clause due to execution order. SQL Server has its own way to execute your SELECT query and if you don’t know how it works, you may fall into this kind of issue. After seeing that question in EE, I decided to write this post.

You SELECT query might consists of SELECT, WHERE, JOIN, GROUP BY etc. clauses in it but SQL Server follows the following order to execute your query.

[sourcecode language=”sql”]
FROM
ON
OUTER
WHERE
GROUP BY
ROLLUP | CUBE
HAVING
SELECT
DISTINCT
ORDER BY
TOP
[/sourcecode]

If you can see “WHERE” clause in in above list then you can find it at 4th position whereas SQL Server know about the column alias you have used in 4th from last position. This is the reason that you can’t get column alias in WHERE clause but you do get it in ORDER BY clause. Let us see it practically by executing simple SELECT query in AdventureWorks2012 database.

[sourcecode language=”sql”]

USE AdventureWorks2012

GO

SELECT

SalesOrderID AS SOID

,SalesOrderDetailID AS SODID

,CarrierTrackingNumber

,OrderQty

FROM

sales.SalesOrderDetail

WHERE

SOID=43659

–you will be greeted with error something like this:
–Msg 207, Level 16, State 1, Line 2
–Invalid column name ‘SOID’.
[/sourcecode]

Following query would work as we have proper column name in WHERE clause whereas alias in ORDER BY clause.

[sourcecode language=”sql”]
USE AdventureWorks2012
GO

SELECT
SalesOrderID AS SOID
,SalesOrderDetailID AS SODID
,CarrierTrackingNumber
,OrderQty
FROM
sales.SalesOrderDetail
WHERE
SalesOrderID=43659
ORDER BY SOID
[/sourcecode]

If you like this article, do like “Extreme-Advicepage in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

CXPACKET wait stats in SQL Server

WaitStatCXPACKET wait stats in SQL Server

“Wait” is one of the very familiar word in human’s life. Nobody likes to wait though it is not always bad to wait for a while for something good, at the same time long unnecessary waiting period resulted into frustration.

When I am about to finish my day in office, I always gets eager to meet my 2 year old son “Teerth”. I enjoy this wait period, if I would be with him for day, I wouldn’t get pleasure of meeting him at the end of day, though I enjoy his company whole day.

I don’t like to wait for upload/download something due to slow internet speed, it is unproductive and unnecessary wait time. Who likes to wait for 1 minute video on YouTube.com gets 5 minutes to load (streaming)?

This is how I evaluate wait period and SQL Server apply almost same kind of rule. It is up-to you to know wait stat, identify whether it is good (necessary) or bad and resolve it if possible.

In simple words, you would like to see result of your request immediately. You expect, you execute query in SQL Server and it finishes operation immediately. In normal scenario SQL Server even do it but there are many cases and real time scenario which wouldn’t let operation go smooth.

There are so many reasons which makes your query waiting, whether it is memory issue, CPU pressure, blocking, locking, dead locking to name a few among so many.

There is one DMV which you can use to monitor wait stats. “sys.dm_os_wait_stats” is your weapon to deal with wait stats.

You can execute simple SELECT query on “sys.dm_os_wait_stats” to see all wait stats along with the value in your system.

Yesterday I got an email from one of the reader this blog, he was asking that he is facing big values in CXPACKET wait stats. Is it good or bad? If bad, how can I can resolve it? This email from the reader insist me to write this blog post.

As per MSDN, CXPACKET occurs when trying to synchronize the query processor exchange iterator. You may consider lowering the degree of parallelism if contention on this wait type becomes a problem. To find detail about different wait stats, click here.

Generally parallelism is being used to execute your process so for example, if you have executed SELECT query and four processor is executing your SELECT query now three processor has finished their part of job but those processors can’t show you the result until last one processor return the result because request result of SELECT query can only be completed if all four processors return result and clubbed together. Since your last processor was still working, three processors had to wait which can raise CXPACKET wait stat.

To overcome this problem, you can decrease value of MAXDOP or Max degree of parallelism. I have written article on this topic earlier which you can refer by following link:

““Max degree of parallelism” or “MAXDOP”, good guy or bad guy?

Note: This is just a generic advice to handle CXPACKET wait type. For more accurate implementation, need to study the environment in detail. Generally in Online Transactional Pr0cessing (OLTP) should have low number of value (near to zero) for MAXDOP as it used to have quick and short transaction and Online Analytical Processing (OLAP) should have higher number of MAXDOP.

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Backup and Restore Filegroup in SQL Server

Backup and Restore Filegroup in SQL Server

Backup and Restore of database is one of the primary task of any DBA. Taking up full backup of database at regular interval and restore it whenever and wherever it is needed is very popular but it may not be possible for every database. Some production databases are too big and we can’t take full backup of database frequently and in this scenario, file / filegroup backup comes to rescue.

If you are operating database with size of few thousand GBs, It becomes impractical to take full database backup and restore full database backup as it might take few hours to complete this operation. Fortunately SQL Server 2005 and later Edition provides an answer to this challenge through piecemeal restore.

I will be using AdventureWorks2012 database with FULL recovery mode in my testing server. Adventureworks2012 database comes up with one default filegroup which is “Primary” but I will add one more File with new filegroup to demonstrate this exercise.

[sourcecode language=”sql”]
USE [master]
GO
–Add New Filegroup
ALTER DATABASE [AdventureWorks2012] ADD FILEGROUP [NewFileGroup]
GO

USE [master]
GO
–Add new data file with .NDF extension
ALTER DATABASE [AdventureWorks2012] ADD FILE
( NAME = N’AdvNewFile’, FILENAME = N’E:\SQL2012DataFiles\AdvNewFile.ndf’
, SIZE = 5120KB , FILEGROWTH = 1024KB ) TO FILEGROUP [NewFileGroup]
GO

USE [AdventureWorks2012]
GO
–create new table in newly created filegroup
CREATE TABLE [dbo].[EmpOnNewFileGroup](
[ID] [int] IDENTITY(1,1) NOT NULL,
[empName] [varchar](100) NULL
) ON NewFileGroup
GO
[/sourcecode]

Now, let us confirm the total number of database files / filegroup along with its path and size with simple TSQL script which was given in my earlier article too.

[sourcecode language=”sql”]
USE [AdventureWorks2012]
GO
SELECT
dbfile.name AS DatabaseFileName,
dbfile.size/128 AS FileSizeInMB,
sysFG.name AS FileGroupName,
dbfile.physical_name AS DatabaseFilePath
FROM
sys.database_files AS dbfile
INNER JOIN
sys.filegroups AS sysFG
ON
dbfile.data_space_id = sysFG.data_space_id
GO
[/sourcecode]

We should get two filegroup for Adventureworks2012.

Now, take backup of “Primary” filegroup with following command.

[sourcecode language=”sql”]
BACKUP DATABASE AdventureWorks2012
FILEGROUP = ‘Primary’
TO DISK = ‘E:\Temp_Testing\Adv2012P.bak’;
GO
[/sourcecode]

Inserting one record in newly created table.

[sourcecode language=”sql”]
INSERT INTO [dbo].[EmpOnNewFileGroup]
SELECT ‘Ritesh Shah’
GO
[/sourcecode]

Now, we are going to take backup of new filegroup we have created along with log backup.

[sourcecode language=”sql”]
–Back up the files in newFileGroup we have just created.
BACKUP DATABASE AdventureWorks2012
FILEGROUP = ‘NewFileGroup’
TO DISK = ‘E:\Temp_Testing\adv2012N1.bak’;
GO

BACKUP LOG AdventureWorks2012
TO DISK =’E:\Temp_testing\adv2012Log.bak’
GO
[/sourcecode]

We have crossed 50% benchmark now and we are on the half way. We are done with backup stuff but we have to make sure that we are able to restore backed up database whenever we want so I will be creating new database “Adventureworks2012New”, one the same SQL Server instance , by using the backup we have taken. Do remember that I don’t have any full backup at the moment.

[sourcecode language=”sql”]
USE [master]
GO

–restoring Primary filegroup backup with PARTIAL and NORECOVERY (offline restore)
RESTORE DATABASE [AdventureWorks2012New]
FILEGROUP= ‘Primary’
FROM DISK= ‘E:\Temp_Testing\Adv2012P.bak’
WITH PARTIAL, NORECOVERY,
MOVE N’AdventureWorks2012_Data’ TO N’E:\SQL2012DataFiles\Temp\AdventureWorks2012New_Data.mdf’,
MOVE N’AdventureWorks2012_Log’ TO N’E:\SQL2012DataFiles\Temp\AdventureWorks2012New_Log.mdf’
GO

–retoring new filegroup backup with NORECOVERY
RESTORE DATABASE [AdventureWorks2012New]
FILE = N’AdvNewFile’ FROM DISK = N’E:\Temp_Testing\adv2012N1.bak’
WITH NORECOVERY,
MOVE N’AdvNewFile’ TO N’E:\SQL2012DataFiles\Temp\AdvNewFileNew.ndf’
GO

–LOG restore with recovery
RESTORE LOG [AdventureWorks2012New]
FROM DISK =’E:\Temp_testing\adv2012Log.bak’
WITH RECOVERY
[/sourcecode]

Now, let us check whether one record in the newly created table is restored or not? I am executing SELECT query on that table from both databases, I have following result:

Filegroup

So, this is how we can restore filegroup.

If you like this article, do like “Extreme-Advicepage in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Get information about filegroup, data file name and path along with size in SQL Server

Get information about filegroup, data file name and path along with size in SQL Server

As a DBA, it happens quite a few times that I have to find total number of file groups available for my database along with the file that particular file group is associated with. I may also need to know the physical path of that database file along with the size of data file.

I can query “Sys.Database_files” and “sys.FileGroups” system table to find this important information about my database.

Here is small yet useful query I used to execute in order to get said information.

[sourcecode language=”sql”]
SELECT
dbfile.name AS DatabaseFileName,
dbfile.size/128 AS FileSizeInMB,
sysFG.name AS FileGroupName,
dbfile.physical_name AS DatabaseFilePath
FROM
sys.database_files AS dbfile
INNER JOIN
sys.filegroups AS sysFG
ON
dbfile.data_space_id = sysFG.data_space_id
[/sourcecode]

Here is the output of above query in my laptop:

FileGroup

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Search SQL Server Job based on keyword provided

MonitoringSearch SQL Server Job based on keyword provided

We have hundreds of system as well as customize job setup in SQL Server Agent for our production databases. It is hard to find one particular job out of this long list. We tend to maintain document for each job we set but I always try to find solution right from the SSMS if it is a question of SQL Server.

It happens many time that I got SP name and have to find SQL Server Job(s) which is calling that particular stored procedure. It is not at all feasible for me to go through each and every SQL Server Job, open it’s step and find SP/Table is being called from it.

Following small SELECT,  based on SysJobs and SysJobSteps system view , script always come as a rescue in this kind of situation.

[sourcecode language=”sql”]

USE MSDB
GO

SELECT
JobStep.Database_Name,
SysJob.name AS Job_Name,
JobStep.command,
JobStep.step_id,
JobStep.Last_Run_Date,
JobStep.Last_Run_Time,
JobStep.Last_Run_Duration
FROM
sysjobs AS SysJob
INNER JOIN
sysjobsteps JobStep
ON
SysJob.job_id = JobStep.job_id
WHERE
JobStep.command like ‘%You_SP_or_Table_Name%’
GO
[/sourcecode]

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Mirror backup to take full backup at multiple location in SQL Server

Mirror backup to take full backup at multiple location in SQL Server

Sometime back, one company environmental invited me to investigate some performance related issues there were facing. While investigating issues, I seen their backup strategy. They were taking full backup everyday as they were having database near to 150GB and after taking full backup, they were transferring full backup to external drive as well as in network path by using xcopy command in batch file. This incident forced me to write this article.

Recently we have talked about splitting full database backup in multiple file in one of my previous article. Today I want to show one more useful command “Mirror” for database full backup which helps DBA to copy database full backup file to different location.

Generally we tend to take full backup at regular interval and copy it somewhere else, may be in network location or may be in external drive attached to server. If you are copying backup manually to another location, “Mirror” command is helpful to you, if you have SQL Server 2005 or later version.

All you need to manage is permission. SQL Server service account/login suppose to have proper permission on the drive/network path to write something there.

I am talking full database backup of my AdventureWorks2012 database into local “E” drive as well as at network path. My SQL Server service account has full permission for write in my network path.

[sourcecode language=”sql”]

USE [master]

GO

BACKUP DATABASE AdventureWorks2012

TO DISK = ‘E:\Temp_Testing\AdventureWorks2012.bak’

MIRROR TO DISK = ‘\\WinServ2008R2\DBBackupTesting\AdventureWorks2012.bak’

WITH FORMAT

GO

[/sourcecode]

After executing the above command, I come across following message:

MIrror

Keep updating yourself with technology enhancement will always makes you more productive and efficient.

Knowledge is power!!!!

some more reading about backup & restore:

Find database backup history in SQL Server (Click Here)
Find database restore date in SQL Server (Click Here)
Backup SQL Server database with encrypted data column/table and restore it somewhere else (Click Here)
Split SQL Server Full backup in multiple files for faster backup (Click Here)

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Split SQL Server Full backup in multiple files for faster backup

Split SQL Server Full backup in multiple files for faster backup

I had a situation yesterday to download one full database backup from the data-center to our local network. I have taken compressed backup of our database which was around 90 GB and download 90 GB over internet is still a big thing. I had no other choice but to download only.

We have our own FTP between our data-center and our office and I had to use it only. There is a chance that after downloading few GBs If I get a single packet loss in internet, I would have to start from zero all over again. I was not in situation to install DAP or this kind of software which provides “resume” service if downloading gets stopped due to any reason.

So, I have decided to split backup files in 500MB chunk so at any give time, If I lost internet connection, I loss maximum 500MB data and I can afford to start it again. Think If I start downloading of 90 GB and internet lost for few seconds after I completed downloading of 85GB, I would not be in position to spend that much time again.

Splitting backup in multiple files is not only convenient for storing and transferring but it is faster as well. Let us see how we can take SQL Server backup in multiple files.

Herewith, I am going to take full backup of Adventureworks2012 database in one .BAK file only.

[sourcecode language=”sql”]
SET STATISTICS TIME ON
BACKUP DATABASE AdventureWorks2012
TO DISK = ‘E:\SQL2012DataFiles\Adventureworks2012.bak’
SET STATISTICS TIME OFF
GO
[/sourcecode]

Let us see how much time SQL Server has taken to get this backup in one file.

SingleBackup

Now let us take backup in multiple files:

[sourcecode language=”sql”]
SET STATISTICS TIME ON

BACKUP DATABASE AdventureWorks2012
TO
DISK = ‘E:\SQL2012DataFiles\Adventureworks2012_1.bak’
,DISK = ‘E:\SQL2012DataFiles\Adventureworks2012_2.bak’
,DISK = ‘E:\SQL2012DataFiles\Adventureworks2012_3.bak’

SET STATISTICS TIME OFF
GO
[/sourcecode]

I have taken backup of Adventureworks2012 database in three files, here is the statistics.

MultipleBackup

You can see backup in multiple file is faster then previous attempt. However, you can’t see major difference as this database is small, you can see major difference if you are having database with few GBs.

Now, it would be interesting to see, how we can restore database from multiple files. Here is the script for the same:

[sourcecode language=”sql”]
USE [master]
GO

RESTORE DATABASE [AdventureWorks2012]
FROM
DISK = N’E:\SQL2012DataFiles\AdventureWorks2012_1.bak’
, DISK = N’E:\SQL2012DataFiles\AdventureWorks2012_2.bak’
, DISK = N’E:\SQL2012DataFiles\AdventureWorks2012_3.bak’
WITH FILE = 2, NOUNLOAD, STATS = 5
GO
[/sourcecode]

some more reading about backup & restore:

Find database backup history in SQL Server (Click Here)
Find database restore date in SQL Server (Click Here)
Backup SQL Server database with encrypted data column/table and restore it somewhere else (Click Here)

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Search objects like Stored Procedure, View, Table in SQL Server

SearchSearch objects like Stored Procedure, View, Table in SQL Server

It happens many time that we don’t know the exact name of stored procedure or table or view etc. and we have to either look at object explorer and scroll all the way down for object we are looking for or we can use “Filter” facility of SSMS but I prefer one small function I have written very long back because I am a script buddy and try to avoid mouse as long as possible.

Sometime, we may want to find out whether function, table or view we are looking for, is used in any other SP or View to check dependency, I used the same function. BTW, you can use “SP_Depends” or “Information_Schema.routines” to find dependency of the object which is more accurate way for find dependency.

Here is the table valued user defined function which I have developed a long back and still using it. In fact, I am so used to it now.

[sourcecode language=”sql”]

CREATE FUNCTION ObjectSearch

(

@SearchString VARCHAR(100)

)

RETURNS TABLE

RETURN

SELECT

DISTINCT scm.name AS SchemaName,

obj.name AS ObjectName,

obj.type_desc AS ObjectType

FROM sys.objects obj

INNER JOIN syscomments sysCom ON sysCom.Id = obj.object_id

INNER JOIN sys.schemas scm ON scm.schema_id = obj.schema_id

WHERE

is_ms_shipped = 0 AND sysCom.text like ‘%’ + @SearchString + ‘%’

GO

[/sourcecode]

Okay, not we have “ObjectSearch” function in our database. I have created this function in my AdventureWorks2012 database and now I am going to find where “Department” table is used so I would execute following simple SELECT statement with “ObjectSearch” function.

[sourcecode language=”sql”]

SELECT * FROM ObjectSearch(‘Department’)

GO

[/sourcecode]

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.

Create table with Filestream column in SQL Server 2012

Create table with Filestream column in SQL Server 2012

Earlier I have written what Filestream is and how it is useful to us. I have enabled filestream in SQL Server instance and created one database with filestream filegroup. I am going to use same “ExtremeAdviceFileStream” to demonstrate this article. If you don’t have database with filestream filegroup, have a look at my earlier article and get that very short script.

Here is the table which will create one table with column having Filestream attribute.

[sourcecode language=”sql”]
USE ExtremeAdviceFileStream
GO

CREATE TABLE Customers
(
ID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE,
Name varchar(25),
CustPhoto VARBINARY(MAX) FILESTREAM
)
GO
[/sourcecode]

Now let us insert one record in above created table with following script

[sourcecode language=”sql”]
–I have already kept “Ritesh-Teerth.JPG” file in my D drive,
–you have to provide your own path and file name here.
INSERT INTO Customers
VALUES
(NEWID(),’Ritesh Shah’,(select * FROM OPENROWSET(BULK ‘D:\Ritesh-Teerth.JPG’, SINGLE_BLOB) AS img))
GO
[/sourcecode]

Now, if I execute simple SELECT statement in Customers table, it would looks like this screen capture.

1SELECTQuery

Please note that you may get different value in ID as well as in CustPhoto fields.

Since we have created our database “ExtremeAdviceFileStream” in D drive, let us see what we can find in Filestream folder.

2FilestreamFolder

I have one file in my Filestream folder. I have inserted “Ritesh-Teerth.JPG” file in as a customer image in “Customers” table. I you want to confirm, you can open this file in Paintbrush but this is find just confirm this in development or test environment. This type of manual editing of Filestream datafile may resulted in severe damage in production database so I would like you not to open any file manually in your production server.

Well, this is how you can enable filestream. I will cover few exercises for filestream in coming articles very soon.

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Note: Microsoft Books online is a default reference of all articles.