SELECT TOP N Rows Per Group/Category

If you would like to select top 'n' rows for each group or category, then here is a query to do so :

-- Create Sample Table
DECLARE @TT table
( ID int,
CategoryGroupID int,
CategoryGroupName varchar(10)
)

-- Create Sample Data
INSERT INTO @TT VALUES ( 1, 1, 'AA')
INSERT INTO @TT VALUES ( 2, 1, 'AA');
INSERT INTO @TT VALUES ( 3, 1, 'AA');
INSERT INTO @TT VALUES ( 4, 2, 'BB');
INSERT INTO @TT VALUES ( 5, 2, 'BB');
INSERT INTO @TT VALUES ( 6, 2, 'BB');
INSERT INTO @TT VALUES ( 7, 2, 'BB');
INSERT INTO @TT VALUES ( 8, 3, 'CC');
INSERT INTO @TT VALUES ( 9, 3, 'CC');
INSERT INTO @TT VALUES ( 10, 3, 'CC');
INSERT INTO @TT VALUES ( 11, 3, 'CC');
INSERT INTO @TT VALUES ( 12, 3, 'CC');
INSERT INTO @TT VALUES ( 13, 4, 'DD');

-- Query to retrieve top 3 items per category
WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY CategoryGroupID ORDER BY ID) AS 'RowNo',
ID, CategoryGroupID, CategoryGroupName
FROM @TT
)
SELECT ID,CategoryGroupID,CategoryGroupName
FROM CTE
WHERE RowNo <= 3

-- Expected Output
1 1 AA
2 1 AA
3 1 AA
4 2 BB
5 2 BB
6 2 BB
8 3 CC
9 3 CC
10 3 CC
13 4 DD

To apply this on a real table, we will select the top 3 products by quantity for each category in the Northwind database

-- TOP 3 Products By Quantity for each Category
USE NORTHWIND;

WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY CategoryName ORDER BY UnitsInStock DESC) AS 'RowNo',
CategoryName, ProductName, UnitsInStock FROM Categories c
INNER JOIN Products p ON c.CategoryID = p.CategoryID
)
SELECT CategoryName, ProductName, UnitsInStock FROM CTE
WHERE RowNo <= 3


About The Author

Suprotim Agarwal, MCSD, MCAD, MCDBA, MCSE, is the founder of DotNetCurry, DNC Magazine for Developers, SQLServerCurry and DevCurry. He has also authored a couple of books 51 Recipes using jQuery with ASP.NET Controls and a new one recently at The Absolutely Awesome jQuery CookBook.

Suprotim has received the prestigous Microsoft MVP award for nine times in a row now. In a professional capacity, he is the CEO of A2Z Knowledge Visuals Pvt Ltd, a digital group that represents premium web sites and digital publications comprising of Professional web, windows, mobile and cloud developers, technical managers, and architects.

Get in touch with him on Twitter @suprotimagarwal, LinkedIn or befriend him on Facebook

19 comments:

Anonymous said...

PARTITION BY was the key I was looking for. Thanks!

Anonymous said...

This was the most lucid, clear explanation of this process. Worked like a charm, and save me hours (and probably a few torn-out hairs).

Thank.s

Suprotim Agarwal said...

I am Glad it helped you out!

phaedrus said...

This is EXACTLY the kind of technique I have been looking for. Thank you very much for sharing it!

Anonymous said...

Thanks
Its so nice, I dindt know about all this,
good one

Anonymous said...

very useful! thanx!

Unknown said...

Thank you so much! I'm working on a project for school and this was so helpful. Great explanation as I just could not wrap my head around some of the other examples I had found!

Anonymous said...

Good job with this.

Unknown said...

Here are some other methods
http://beyondrelational.com/blogs/madhivanan/archive/2008/09/12/return-top-n-rows.aspx

Madhivanan

Anonymous said...

Great post, thanks. I added a variation that allows you to sort the CTE query. Here, I sorted by "Priority"

WITH CTE AS
(
SELECT top 250000 ROW_NUMBER() OVER (PARTITION BY CompanyName ORDER BY ID) AS 'RowNo',
ID, CompanyName, Email, Priority
FROM ABCTABLE
order by priority desc
)

Delete from ABCTABLE where email in (
SELECT Email
FROM CTE
WHERE RowNo > 5)


... This let me take a company where we have multiple contacts and narrow down to the top 5, by priority.

Note: The TOP 250000 - to do the ORDER BY you need a TOP parameter. I picked 250,000 because the table had 175,000 records.

rax said...

if i want to get the results post the query and place into a #tempTable, how would i append the script?

Anonymous said...

realised neeeded to place a select into the last part...

WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY locid ORDER BY loccvgid) AS 'RowNo',
loccvgid, locid
FROM loccvg
)
SELECT
LOCCVGID, locid
into #t
FROM CTE
WHERE RowNo <= 1

Anonymous said...

GENIUS! exactly what I needed and worked perfectly! THANK YOU!!

Anonymous said...

Fantastic, thank you...JUST what I needed.

odanyboy said...

Cheers for that ... I got just what I needed after looking for ages.

Anonymous said...

This is a great, clear, and to the point post. Thanks!

Anonymous said...

After many hours searching and trying 'solutions', I finally managed to find something that actually works and I can easily understand.

Thanks! :)

GirishChhatani said...

is it possible without using Row number function the reason i am looking to implement some other logic bcse Row_number() query which i am firing on a History table which contains lakhs of records... so query takes long time to execute.

So pls help me out...
Thanks & Regards,
Girish Chhatani

GirishChhatani said...

I have a History Table tbl_history which contains crores of Records.

Table Sample Data:-
id Volume DeletedTime
1 C 11am
1 C 12am
1 C 1pm
1 C 2pm
1 C 3pm
1 D 11am
1 D 12am
1 D 1pm
1 D 2pm
1 D 3pm
2 C 11am
2 C 12am
2 C 1pm
2 C 2pm
2 C 3pm
2 D 11am
2 D 12am
2 D 1pm
2 D 2pm
2 D 3pm

Now what i want is to list out last 3 Records per ID and per Category i.e in our Case Volume, without using Row number function the reason i am looking to implement some other logic other than Row_number() because query takes long time to execute.