September 06, 2008

SELECT TOP N Rows Per Group/Category

If you would like to select top 'n' rows for each group or category, then here is a query to do so :

-- Create Sample Table
DECLARE @TT table
( ID int,
CategoryGroupID int,
CategoryGroupName varchar(10)
)

-- Create Sample Data
INSERT INTO @TT VALUES ( 1, 1, 'AA')
INSERT INTO @TT VALUES ( 2, 1, 'AA');
INSERT INTO @TT VALUES ( 3, 1, 'AA');
INSERT INTO @TT VALUES ( 4, 2, 'BB');
INSERT INTO @TT VALUES ( 5, 2, 'BB');
INSERT INTO @TT VALUES ( 6, 2, 'BB');
INSERT INTO @TT VALUES ( 7, 2, 'BB');
INSERT INTO @TT VALUES ( 8, 3, 'CC');
INSERT INTO @TT VALUES ( 9, 3, 'CC');
INSERT INTO @TT VALUES ( 10, 3, 'CC');
INSERT INTO @TT VALUES ( 11, 3, 'CC');
INSERT INTO @TT VALUES ( 12, 3, 'CC');
INSERT INTO @TT VALUES ( 13, 4, 'DD');

-- Query to retrieve top 3 items per category
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY CategoryGroupID ORDER BY ID) AS 'RowNo',
ID, CategoryGroupID, CategoryGroupName
FROM @TT
)
SELECT ID,CategoryGroupID,CategoryGroupName
FROM CTE
WHERE RowNo <= 3

-- Expected Output
1 1 AA
2 1 AA
3 1 AA
4 2 BB
5 2 BB
6 2 BB
8 3 CC
9 3 CC
10 3 CC
13 4 DD

To apply this on a real table, we will select the top 3 products by quantity for each category in the Northwind database

-- TOP 3 Products By Quantity for each Category
USE NORTHWIND;

WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY CategoryName ORDER BY UnitsInStock DESC) AS 'RowNo',
CategoryName, ProductName, UnitsInStock FROM Categories c
INNER JOIN Products p ON c.CategoryID = p.CategoryID
)
SELECT CategoryName, ProductName, UnitsInStock FROM CTE
WHERE RowNo <= 3


About The Author

Suprotim Agarwal, ASP.NET Architecture MVP works as an Architect Consultant and provides consultancy on how to design and develop Web applications.

Suprotim is also the founder and primary contributor to DevCurry, DotNetCurry and SQLServerCurry. He has also written an EBook 51 Recipes using jQuery with ASP.NET Controls.

Follow him on twitter @suprotimagarwal

19 comments:

Anonymous said...

PARTITION BY was the key I was looking for. Thanks!

Anonymous said...

This was the most lucid, clear explanation of this process. Worked like a charm, and save me hours (and probably a few torn-out hairs).

Thank.s

Suprotim Agarwal said...

I am Glad it helped you out!

phaedrus said...

This is EXACTLY the kind of technique I have been looking for. Thank you very much for sharing it!

Anonymous said...

Thanks
Its so nice, I dindt know about all this,
good one

Anonymous said...

very useful! thanx!

Chris said...

Thank you so much! I'm working on a project for school and this was so helpful. Great explanation as I just could not wrap my head around some of the other examples I had found!

Anonymous said...

Good job with this.

Madhivanan said...

Here are some other methods
http://beyondrelational.com/blogs/madhivanan/archive/2008/09/12/return-top-n-rows.aspx

Madhivanan

Anonymous said...

Great post, thanks. I added a variation that allows you to sort the CTE query. Here, I sorted by "Priority"

WITH CTE AS
(
SELECT top 250000 ROW_NUMBER() OVER (PARTITION BY CompanyName ORDER BY ID) AS 'RowNo',
ID, CompanyName, Email, Priority
FROM ABCTABLE
order by priority desc
)

Delete from ABCTABLE where email in (
SELECT Email
FROM CTE
WHERE RowNo > 5)


... This let me take a company where we have multiple contacts and narrow down to the top 5, by priority.

Note: The TOP 250000 - to do the ORDER BY you need a TOP parameter. I picked 250,000 because the table had 175,000 records.

rax said...

if i want to get the results post the query and place into a #tempTable, how would i append the script?

Anonymous said...

realised neeeded to place a select into the last part...

WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY locid ORDER BY loccvgid) AS 'RowNo',
loccvgid, locid
FROM loccvg
)
SELECT
LOCCVGID, locid
into #t
FROM CTE
WHERE RowNo <= 1

Anonymous said...

GENIUS! exactly what I needed and worked perfectly! THANK YOU!!

Anonymous said...

Fantastic, thank you...JUST what I needed.

odanyboy said...

Cheers for that ... I got just what I needed after looking for ages.

Anonymous said...

This is a great, clear, and to the point post. Thanks!

Anonymous said...

After many hours searching and trying 'solutions', I finally managed to find something that actually works and I can easily understand.

Thanks! :)

GirishChhatani said...

is it possible without using Row number function the reason i am looking to implement some other logic bcse Row_number() query which i am firing on a History table which contains lakhs of records... so query takes long time to execute.

So pls help me out...
Thanks & Regards,
Girish Chhatani

GirishChhatani said...

I have a History Table tbl_history which contains crores of Records.

Table Sample Data:-
id Volume DeletedTime
1 C 11am
1 C 12am
1 C 1pm
1 C 2pm
1 C 3pm
1 D 11am
1 D 12am
1 D 1pm
1 D 2pm
1 D 3pm
2 C 11am
2 C 12am
2 C 1pm
2 C 2pm
2 C 3pm
2 D 11am
2 D 12am
2 D 1pm
2 D 2pm
2 D 3pm

Now what i want is to list out last 3 Records per ID and per Category i.e in our Case Volume, without using Row number function the reason i am looking to implement some other logic other than Row_number() because query takes long time to execute.