Add an Identity to an Existing Column in large SQL Server tables

Suppose you have a table with a large amount of data without an identity column and you want to add an IDENTITY property to the existing column. It is important to note that in SQL Server you cannot alter a column to have an identity property. So how do we implement this?

There are three options.

Consider the following dataset:

create table testing (id int not null, value decimal(12,2))

insert into testing(id,value)
select 1,rand()*10000 union all
select 2,rand()*10000 union all
select 3,rand()*10000 union all
select 4,rand()*10000 union all
select 5,rand()*10000

Method 1: Add a new column with IDENTITY property:

alter table testing add id_new int identity(1,1);

Now the table testing will have data with identity values and you can just use this identity column for any references.

Method 2: Create a new table with identity column and move the data over there

create table testing1 (id int identity(1,1), value decimal(12,2))

Insert into testing1(value)
select value from testing;

Now the table testing1 will have data with identity values and just like with Method 1, you can use this identity column for any reference.

Method 3: Use ALTER TABLE SWITCH option

Create a new table with the same schema as that of the table testing but with IDENTITY property

create table testing2 (id int identity(1,1), value decimal(12,2))

Now use the SWITCH option to move data to this table

alter table testing switch to testing2;

Now all the data is available in testing2 which has an identity column.

Now just drop testing table and rename testing2 to testing

drop table testing;
EXEC sp_rename 'testing2','testing'

Now the table testing has an IDENTITY column with all the data.

Now the question is "which method is the best to use?"

Well, Methods 1 and 2 may be time consuming if the source table has millions of rows. Method 3 is very fast as it switches the partition from one table to another very quickly. So this method will take only few seconds compared to the other two methods.

Caveats: You may need to take care of primary key - foreign key relations before using any of these methods.


About The Author

Madhivanan,an MSc computer Science graduate from Chennai-India, works as a works as a Lead Subject Matter Expert at a company that simplifies BIG data. He started his career as a developer working with Visual Basic 6.0, SQL Server 2000 and Crystal Report 8. As years went by, he started working more on writing queries in SQL Server. He now has good level of knowledge in SQLServer, Oracle, MySQL and PostgreSQL as well. He is also one of the leading posters at www.sqlteam.com and a moderator at www.sql-server-performance.com. His T-sql blog is at http://beyondrelational.com/blogs/madhivanan

3 comments:

Unknown said...

I would add reseeding to make sure new identity values will not start from 1

DBCC CHECKIDENT (testing, reseed, 6)

Unknown said...

Inserting new records after this will bring identity values starting 1. To prevent it I would add reseeding identity value.

DBCC CHECKIDENT (testing, reseed, 6)

Unknown said...

You are absolutely correct viktor tischenko. We need to reset the identity number for the last method