Use of RANK() and Partition by clause in SQL-Server 2005

Use of RANK() and Partition by clause in SQL-Server 2005
We will start by creating one table for demonstration and enter some records in it.

[sourcecode language=”sql”]
–Create one table
CREATE TABLE BlogCount
(
BloggerName VARCHAR(10),
Topic VARCHAR(15),
[Year] INT,
Total INT
)
–Insert records in above table
INSERT INTO BlogCount VALUES(‘Ritesh’,‘SQL’,2005,10)
INSERT INTO BlogCount VALUES(‘Ritesh’,‘SQL’,2006,17)
INSERT INTO BlogCount VALUES(‘Ritesh’,‘SQL’,2007,124)
INSERT INTO BlogCount VALUES(‘Ritesh’,‘SQL’,2008,124)
INSERT INTO BlogCount VALUES(‘Ritesh’,‘.NET’,2008,24)
INSERT INTO BlogCount VALUES(‘Alka’,‘SQL’,2007,14)
INSERT INTO BlogCount VALUES(‘Alka’,‘.NET’,2007,18)
INSERT INTO BlogCount VALUES(‘Alka’,‘SQL’,2008,14)
[/sourcecode]

Once, you are done with above task, let’s think of usability of RANK() and Partition by clause in our real world. Suppose you wish to get record set of all the blogger with their highest total of blog. What will you do? You may use sub-query or/and GROUP BY clause. Right???? Doesn’t it seems tedious and bit difficult? Here is easy solution for the same.
First we will use Rank() and Partition By clause and then we will filter our record set. Partition is like Group by, we want list sorted based on “BloggerName” so we will be putting it in Partition by clause and Order by clause in below query will give Ranking 1 to highest “Total”.

[sourcecode language=”sql”]
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC) as ‘Ranking’
FROM
BlogCount
[/sourcecode]

After running above query you will get result set like:
BloggerName Topic Year Total Ranking
———– ————— ———– ———– ——————–
Alka SQL 2007 18 1
Alka .NET 2007 18 1
Alka SQL 2008 14 3
Ritesh SQL 2007 124 1
Ritesh SQL 2008 124 1
Ritesh .NET 2008 24 3
Ritesh SQL 2006 17 4
Ritesh SQL 2005 10 5
(8 row(s) affected)
Now, we may like to see only highest total article by bloggers, we can’t put direct “where condition” in above query so we will wrap it up like:
[sourcecode language=”sql”]
SELECT *
FROM
(
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC) as ‘Ranking’
FROM
BlogCount
) Testing
where Ranking<2
[/sourcecode]

Here is the output:
BloggerName Topic Year Total Ranking
———– ————— ———– ———– ——————–
Alka SQL 2007 18 1
Alka .NET 2007 18 1
Ritesh SQL 2007 124 1
Ritesh SQL 2008 124 1
(4 row(s) affected)
See above result set, we got only 4 records out of 9, which have the highest “Total”. See “Ritesh’s” record # 3 and #4. Both belongs to SQL and has same total so in that case, I may wish to see only one records of year 2008. In this case, year is a tie breaker. Use below given query for breaking a tie.
[sourcecode language=”sql”]
SELECT *
FROM
(
SELECT BloggerName,Topic,[Year],Total,
Rank() OVER (Partition by BloggerName Order by Total DESC, [year] DESC) as ‘Ranking’
FROM
BlogCount
) Testing
where Ranking<2
[/sourcecode]

Now, you will get following output.
BloggerName Topic Year Total Ranking
———– ————— ———– ———– ——————–
Alka SQL 2007 18 1
Alka .NET 2007 18 1
Ritesh SQL 2008 124 1
 (3 row(s) affected)
How easy is this???? Much much better than using sub-query and group by.

If you like this article, do like “Extreme-Advice” page in Facebook.

Reference: Ritesh Shah

http://Extreme-Advice.com

http://www.sqlhub.com

Sharing IT tips at “Quick Learn

Note: Microsoft Books online is a default reference of all articles.

Delete-Select-Duplicate-Records-by-CTE-in-SQL-SERVER 2005

Delete Duplicate records from table with CTE. As I explain in my previous article that selecting and deleting duplicate records, could be done by several ways so this is another way to do the same.
You can refer earlier article by following link.
http://blog.extreme-advice.com/2009/02/28/select-delete-duplicate-records-sql-server/
As this is going to be done by CTE, You can get more information about CTE by following link.
Once you create table given in above link, you can use the below given query for selecting or deleting records.
WITH DeleteDuplicateRecords AS
(
select Fname,id,Row_number() over (partition by Fname,Lname order by Fname,Lname) as RowNum
from SelectDuplicate
)
Select * from DeleteDuplicateRecords where RowNum>1
Or
WITH DeleteDuplicateRecords AS
(
select Fname,id,Row_number() over (partition by Fname,Lname order by Fname,Lname) as RowNum
from SelectDuplicate
)
DELETE from DeleteDuplicateRecords where RowNum>1
Reference: Ritesh Shah
Microsoft SQL Server Blog. Fight the fear of SQL with SQLHub.com. Founder is Ritesh Shah