May 03, 2009

Select Random Records From a Table using TableSample in SQL Server 2005/2008




In my previous post, I had explained how to use the NEWID() to Delete Random Records From a Table Using SQL Server 2005/2008

In this post, I will show how to select random records using the TABLESAMPLE clause and use it in different ways. The TABLESAMPLE clause takes a parameter that can be a percent or a sample number representing rows. This clause samples a percentage of
pages randomly. You will get a different result set each time you run a query with the TABLESAMPLE clause. That means that even if you specify to return 50 rows, you will only get an approximate number of rows, that can be less or more than 50 due to the random sampling logic.

Select X% of Records

Returns different set of rows each time the query is executed


USE Northwind


GO


SELECT * FROM Customers


TABLESAMPLE (50 PERCENT);




Select X% of Records with same random sampling multiple times

Specify REPEATABLE with seed. REPEATABLE clause returns same set of rows every time that it is executed


USE Northwind


GO


SELECT * FROM Customers


TABLESAMPLE (50 PERCENT)


REPEATABLE (100)




Specify Number of Records instead of %(Percentage)


USE Northwind


GO


SELECT * FROM Customers


TABLESAMPLE (50 ROWS)




Important Points about TABLESAMPLE:

1. If you have upgraded from SQL 2000 to SQL Server 2005, make sure that the compatibility level of the database is set to at least 90.

2. TABLESAMPLE cannot be used on derived tables, OPENXML, Inline Table-Valued functions, linked server tables or a View.


Read more on TABLESAMPLE over here http://msdn.microsoft.com/en-us/library/ms189108(SQL.90).aspx


Did you like this post?
kick it on DotNetKicks.com
subscribe via rss subscribe via e-mail
print this post follow me on twitter


About The Author

Suprotim Agarwal, ASP.NET Architecture MVP works as an Architect Consultant and provides consultancy on how to design and develop Web applications.

Suprotim is also the founder and primary contributor to DevCurry, DotNetCurry and SQLServerCurry. He has also written an EBook 51 Recipes using jQuery with ASP.NET Controls.

Follow him on twitter @suprotimagarwal

 
  Feedback:

comments

3 Responses to "Select Random Records From a Table using TableSample in SQL Server 2005/2008"
  1. IGotYourDotnet said...
    April 27, 2010 at 8:13 AM

    how can you pull random records out but not duplicate any? Meaning,I have records that can have the same name but different dates or states such as
    John Smith 1/2/2010 DE
    John Smith 5/14/2008 NJ

    how can I only show one name within my random so my users don't get confused?

  2. Madhivanan said...
    April 28, 2010 at 8:25 AM

    IGotYourDotnet,

    One possible methods is

    select name,min(date) as date from table
    tablesample (50 percent)
    group by name

    Also note that tablesample doesn't work well for small set of data
    If all data of table stored in a single page, either it returns all rows or no rows

    Run this to understand

    SELECT
    city,min(customerid) FROM Northwind..Customers
    group by city It returns 69 rows

    Now

    SELECT city,min(customerid) FROM Northwind..Customers
    TABLESAMPLE (50 percent)
    group by city

    will return all 69 rows or no rows

    So better approach is to use top operator with newid()

    SELECT
    top 50 percent city,min(customerid) FROM Northwind..Customers
    group by city
    order by newid()

  3. Libo said...
    May 1, 2010 at 8:39 PM

    Excellent tip. I vote for the top operator with newid()

 

Copyright © 2009-2014 All Rights Reserved for SQLServerCurry.com by Suprotim Agarwal | Terms and Conditions