May 15, 2016

Nested Case Statement in SQL Server

This article is a continuation of SQL Server CASE Statement and CASE WHEN Examples . Today we will learn about Nested Case Statement in SQL Server.

We can nest CASE statements similar to nested ifs that we find in most programming languages.

Let us see an example.

select ename, job, sal, case  -- Outer Case
    when ename like 'A%' then
    case when sal >= 1500 then 'A' -- Nested Case
    end
    when ename like 'J%' then
    case when sal >= 2900 then 'J' -- Nested Case
    end
end as "Name-Grade"            
From Emp

7-nested-case
Image 7-Nested-Case

Limit of nesting a CASE function is up to 10 levels only.

In the following example, the limit of 10 is completely utilized.

Declare @x int
set @x = 1
Select 
case when @x <= 100 then  -- Level 1
 case when @x <= 90 then -- Level 2
    case when @x <= 80 then -- Level 3
        case when @x <= 70 then -- Level 4
            case when @x <= 60 then -- Level 5
                case when @x <= 50 then -- Level 6
                    case when @x <= 40 then -- Level 7
                       case when @x <= 30 then --Level 8
                         case when @x <= 20 then--Level 9
                           case when @x<= 10 then--Level 10
                                100
End End End End End End End End End 
End as "Nested Case" 
--Ending all levels!

8-nested-case-upto-10-levels
Image 8-Nested-Case-Upto-10-Levels

If nesting is exceeding 10 levels, then SQL Server throws an error.
Declare @x int
set @x = 1
Select 
case when @x <= 100 then  -- Level 1
case when @x <= 90 then -- Level 2
case when @x <= 80 then -- Level 3
    case when @x <= 70 then -- Level 4
        case when @x <= 60 then -- Level 5
            case when @x <= 50 then -- Level 6
                case when @x <= 40 then -- Level 7
                   case when @x <= 30 then --Level 8
                     case when @x <= 20 then--Level 9
                       case when @x<= 10 then--Level 10
                          case when @x <= 11 then -- 11
                             100
End End End End End End End End End End 
End as "Nested Case"

Msg 125, Level 15, State 4, Line 14
Case expressions may only be nested to level 10.

April 22, 2016

SQL Server CASE Statement and CASE WHEN Examples

Many a times there’s a need to create a derived column in an output, based on some condition. The condition is similar to the typical if construct we use if many programming languages.

In SQL scripting, we can use Case expressions or Case Statements as you may call them, to create a derived column based on a condition.

Also read about Nested Case Statements in SQL Server.

Let us first execute a sample database creation script that we will be using in our examples:

Use Master
go
Create Database Case_Demo
go
Use Case_Demo
go
CREATE TABLE EMP
       (EMPNO Integer,
        ENAME Varchar(15),
        JOB Varchar(9),
        MGR Integer,
        HIREDATE datetime,
        SAL Integer,
        COMM Integer,
        DEPTNO Integer)
go
INSERT INTO EMP VALUES
        (7369, 'SMITH', 'CLERK', 7902, '12/17/80', 800, NULL, 20)
INSERT INTO EMP VALUES
        (7499, 'ALLEN', 'SALESMAN', 7698, '5/20/81', 1600, 300, 30)
INSERT INTO EMP VALUES
        (7521, 'WARD', 'SALESMAN', 7698, '5/22/81', 1250, 500, 30)
INSERT INTO EMP VALUES
        (7566, 'JONES', 'MANAGER',   7839, '4/2/81', 2975, NULL, 20)
INSERT INTO EMP VALUES
        (7654, 'MARTIN', 'SALESMAN', 7698, '9/28/81', 1250, 1400, 30)
INSERT INTO EMP VALUES
        (7698, 'BLAKE', 'MANAGER',   7839, '5/1/81', 2850, NULL, 30)
INSERT INTO EMP VALUES
        (7782, 'CLARK', 'MANAGER',   7839, '6/9/81', 2450, NULL, 10)
INSERT INTO EMP VALUES
        (7788, 'SCOTT',  'ANALYST',  7566, '12/9/82', 3000, NULL, 20)
INSERT INTO EMP VALUES
        (7839, 'KING',   'PRESIDENT', NULL, '11/17/81', 5000, NULL, 10)
INSERT INTO EMP VALUES
        (7844, 'TURNER', 'SALESMAN', 7698, '9/8/1981',1500,    0, 30)
INSERT INTO EMP VALUES
        (7876, 'ADAMS', 'CLERK', 7788, '1/12/83', 1100, NULL, 20)
INSERT INTO EMP VALUES
        (7900, 'JAMES', 'CLERK', 7698,  '12/3/81', 950, NULL, 30)
INSERT INTO EMP VALUES
        (7902, 'FORD',   'ANALYST', 7566, '12/3/81', 3000, NULL, 20)
INSERT INTO EMP VALUES
        (7934, 'MILLER', 'CLERK',  7782, '1/23/82', 1300, NULL, 10)
go

There are two types of Case expressions:

1. Case acting as a Switch Case construct.
2. Case acting like an If…Else If….Else construct.

Switch Case Construct

Let us see an example of Case acting as a Switch Case construct.

The syntax is as follows:

Case Expression or Column Name
    When Value1 Then Statement1 or Expression 1
    When Value2 Then Statement2 or Expression 2
    When Value3 Then Statement3 or Expression 3
    .......................................................
    .......................................................
    When Value m Then Statement m or Expression m    
    Else
        Statement n or Expression n
End

Note: The Value above will be the actual value from a column, i.e. string or numeric or date value.

The Statement can be a string , numeric or date value. Expression can be either a derived value such as [Column Name] * 12 or using a T-SQL function Lower([Column Name])

Let us see an example to make things clearer.

Let us display the employee name, job and job_points for each record. Now job_points will be a derived column based on a condition that says - If job is of Analyst then job_points will be 100, if it is clerk then 200, else for any other job type it will be 300.

select ename, job,  case job    
    when 'Analyst' then 100
    when 'Clerk' then 200
    else 300
         end as "Job Points"
from emp

1-case-as-switch-case
Image 1: Case-As-Switch-Case


Else part is optional. If it is not mentioned, then NULL will be returned for the default values.
Let us see the same

select ename, job, case job    
    when 'Analyst' then 100
    when 'Clerk' then 200
    end as "Job Points"
from emp

2-case-without else
Image 2: Case-Without Else

All the statements mentioned in the then clause should be of the same data type.

select ename, job, case job    
    when 'Analyst' then 100
    when 'Clerk' then 'Two Hundred'
    end as "Job Points"
from emp

The above query throws an error since at first the when clause is 100 i.e. a numeric value, but the next when clause has 'Two Hundred' as a value of string type.

The error message is as follows:

Msg 245, Level 16, State 1, Line 1 Conversion failed when converting the varchar value 'Two Hundred' to data type int.

As I mentioned earlier, All the statements mentioned in the then clause should be of the same data type. So to fix this error, we need to enclose 100 in single quotes to make it a Varchar value.

select ename, job, case job    
    when 'Analyst' then '100'
    when 'Clerk' then 'Two Hundred'
    end as "Job Points"
from emp

Now we will get the desired output -

3-case-after-rectification
Image 3: Case-After-Rectification


IF ELSE Construct


Let us now see how to use the Case expression as an If…Else If….Else construct.
The syntax will be as follows:

Case 
    When Boolean Condition 1 Then Statement1 or Expression 1
    When Boolean Condition 2 Then Statement2 or Expression 2
    When Boolean Condition 3 Then Statement3 or Expression 3
    ........................................................
    ........................................................
    When Boolean Condition m Then Statement m    or Expression m
    Else
        Statement n or Expression n
End


Note: In this syntax, after the CASE keyword there is no Expression or Column Name. We directly have a When clause. The When clause will have a Boolean condition in which the column name or expression will be included.

Let us see an example of this.

Let us create a query which will show employee name, salary and salary class. If the salary is greater than or equal to 5000 then salary class will be A, if it is between 2500 and 4999 then it will be B and for remaining salaries, values will be C.

select ename, sal, case 
    when sal >= 5000 then 'A'
    when sal >= 2500 then 'B'
    else
    'C'
           end as "Salary Class"
From Emp

4-case-if-else
Image 4: Case-If-Else

Note: The sequence of Boolean conditions will matter a lot. Whichever condition is true, its statement or expression gets executed and it comes out of the CASE expression for that record.

So if we swap the first two conditions in the above example, the output will go haywire.

Let us see an example

select ename, sal, case 
    when sal >= 3000 then 'B'
    when sal >= 5000 then 'A'
    else
    'C'
           end as "Salary Class"
From Emp

5-case-logically-wrong
Image 5: Case-Logically-Wrong

Note: To avoid such logical errors, it will be better to avoid relational operators like >,>=. Instead use the between operator, mention the exact range and after doing this, if the conditions are given in any sequence, the output will be logically correct. Alternatively if you are going to continue with >,>= operators, then strictly all the conditions should be descending or all should be ascending.

We can mention multiple conditions in each WHEN clause using logical and/or operators.

The following example will depict this point.

select ename, job, sal, case 
    when job = 'clerk' and sal < 1000 then '1'
    when job = 'clerk' and sal > 1000 then '2'
    when job = 'manager' and sal > 2900 then '3'
               end as "Bonus Grade"
From Emp

6-case-multiple-conditions-in-when
Image 6-Case-Multiple-Conditions-In-When

We explored the SQL Server CASE statement and also saw the CASE WHEN example. In the next article, we will explore Nested Case Statements in SQL Server. and use CASE in Having, Order By and UPDATE statements.

April 12, 2016

PIVOT and UNPIVOT in SQL Server

PIVOT and UNPIVOT are aggregate operators which are available from SQL Server version 2005 onwards.

PIVOT can be used to transform rows into columns while UNPIVOT can be used to transform columns into rows. We will see how to use PIVOT and UNPIVOT in the following examples

Let us create the following table which has sales details of products for each country.

create table #sales (country_name varchar(100),product_name varchar(100), 
sales_date datetime, sales_amount decimal(12,2))

insert into #sales (country_name,product_name,sales_date,sales_amount)
select 'India','Television','2012-01-10',35000 union all
select 'India','Mobile','2012-12-19',22000 union all
select 'India','Laptop','2012-04-11',62500 union all
select 'India','Laptop','2013-06-23',45000 union all
select 'India','Television','2012-03-20',45000 union all
select 'India','Television','2013-05-30',56000 union all
select 'India','Mobile','2013-02-22',71200 union all
select 'USA','Television','2012-02-20',3500 union all
select 'USA','Mobile','2012-11-01',2700 union all

select 'USA','Laptop','2012-08-19',6500 union all
select 'USA','Laptop','2013-06-23',5000 union all
select 'USA','Television','2012-02-12',4560 union all
select 'USA','Television','2013-06-30',5100 union all
select 'USA','Mobile','2013-006-06',2200 

SQL Server PIVOT table Example

Suppose you want to find out the total sales by each product and summarised for each year (ie columns are years). You can use the following code

SELECT product_name,[2012],[2013] from
(select year(sales_date) as sales_year,product_name,sales_amount FROM #sales)
as t 
PIVOT(SUM(sales_amount) 
      FOR sales_year IN ([2012],[2013])) AS pivot_table
pivot-product

If you want to group it by multiple columns, you can simply include that column in the query. For example the following query will do pivot group by country_name and product_name

SELECT country_name,product_name,[2012],[2013] from
(select year(sales_date) as sales_year, country_name, product_name, 
sales_amount FROM #sales) 
as t 
PIVOT(SUM(sales_amount) 
      FOR sales_year IN ([2012],[2013])) AS pivot_table
pivot-country-product

SQL Server UnPivot Table Example

To see how unpivot works, let us create the following table

create table #marks(student_name varchar(100), English smallint, 
Mathematics smallint, Tamil smallint, Science smallint)
insert into #marks(student_name , English, Mathematics, Tamil, Science)
select 'Sankar', 78,91,79,60 union all
select 'Nilesh', 81,90,66,89 union all
select 'Murugan', 94,88,72,90 
Suppose you want to transform the columns into rows i.e. subjects become rows, use the following query:

select student_name, subject_name, mark
from #marks s
unpivot
(
  mark
  for subject_name in (English,Mathematics, Tamil,Science)
) t;
sqlserver-unpivot

Note : PIVOT in SQL Server basically produces a denormalised dataset whereas UNPIVOT produces normalised dataset.

March 29, 2016

SQL Server Inner Join - Concepts and Best practices

In a database such as SQL Server, we create tables which are related to each other through a common column. For example, if we have to store customers’ related information, then the customer’s personal information gets stored in one table and the same customer’s transaction information, gets stored in another table. Both the tables have CustID as a common column.

We can implement RDBMS (Relational Database Management System) concepts through any database, such as SQL Server. So the same entity’s information flows down in different tables. This is part of the Normalization process within RDBMS framework.

However as part of the requirement, we may need to show a report containing customer information from multiple tables, on a matching column condition. Here arises the need of using the technique of Joins in SQL scripting.

Example - Let us create an Employee specific set of 2 tables in a new database.

Execute the following script in your instance of SQL Server.

Use Master
go
Create Database Inner_Join
go
Use Inner_Join
go
CREATE TABLE EMP
       (EMPNO Integer,
        ENAME Varchar(15),
        JOB Varchar(9),
        MGR Integer,
        HIREDATE datetime,
        SAL Integer,
        COMM Integer,
        DEPTNO Integer)
go
INSERT INTO EMP VALUES
        (7369, 'SMITH', 'CLERK', 7902, '12/17/80', 800, NULL, 20)
INSERT INTO EMP VALUES
        (7499, 'ALLEN', 'SALESMAN', 7698, '5/20/81', 1600, 300, 30)
INSERT INTO EMP VALUES
        (7521, 'WARD', 'SALESMAN', 7698, '5/22/81', 1250, 500, 30)
INSERT INTO EMP VALUES
        (7566, 'JONES', 'MANAGER',   7839, '4/2/81', 2975, NULL, 20)
INSERT INTO EMP VALUES
        (7654, 'MARTIN', 'SALESMAN', 7698, '9/28/81', 1250, 1400, 30)
INSERT INTO EMP VALUES
        (7698, 'BLAKE', 'MANAGER',   7839, '5/1/81', 2850, NULL, 30)
INSERT INTO EMP VALUES
        (7782, 'CLARK', 'MANAGER',   7839, '6/9/81', 2450, NULL, 10)
INSERT INTO EMP VALUES
        (7788, 'SCOTT',  'ANALYST',  7566, '12/9/82', 3000, NULL, 20)
INSERT INTO EMP VALUES
        (7839, 'KING',   'PRESIDENT', NULL, '11/17/81', 5000, NULL, 10)
INSERT INTO EMP VALUES
        (7844, 'TURNER', 'SALESMAN', 7698, '9/8/1981',1500,    0, 30)
INSERT INTO EMP VALUES
        (7876, 'ADAMS', 'CLERK', 7788, '1/12/83', 1100, NULL, 20)
INSERT INTO EMP VALUES
        (7900, 'JAMES', 'CLERK', 7698,  '12/3/81', 950, NULL, 30)
INSERT INTO EMP VALUES
        (7902, 'FORD',   'ANALYST', 7566, '12/3/81', 3000, NULL, 20)
INSERT INTO EMP VALUES
        (7934, 'MILLER', 'CLERK',  7782, '1/23/82', 1300, NULL, 10)
go
CREATE TABLE DEPT
       (DEPTNO INTEGER,
        DNAME VARCHAR(14),
        LOC VARCHAR(13) )
go
INSERT INTO DEPT VALUES (10,'ACCOUNTING','NEW YORK')
INSERT INTO DEPT VALUES (20,'RESEARCH','DALLAS')
INSERT INTO DEPT VALUES (30,'SALES','CHICAGO')
INSERT INTO DEPT VALUES (40,'OPERATIONS','BOSTON')
go

Let us see the tables to understand the relationship.

use Inner_Join
go
select * from dept
go
select * from emp
go

dept-no

You can observe that Deptno is the common field in both the tables, Dept & Emp. Department details are in Dept table and Employee table contains a DEPTNO column which represents the Department Number an Employee belongs to. Let us see two records of Emp table to get a clarity. Consider the record of SMITH in Emp table. Smith’s Deptno is 20, which means Smith’s Department Name (DNAME) is RESEARCH. Similarly consider ALLEN’s record. His Deptno is 30, which means Allen’s DNAME is Sales. So for the first two records, we have mapped the common column Deptno!
Now we want to display the ENAME and DNAME for the matching Deptno. This is possible by a technique within SQL scripting that is called as Joins.

There are multiple types of joins, but in this article we are going to focus on Inner Join or also known as Equi-Join.

Editorial Note: Check this artilce to understand Types of JOIN in SQL Server - Inner, Self, Outer and Cross JOIN

So what is Inner Join? Inner join is displaying data from multiple tables on matching values of common field (s), within the associated tables.

Inner Join Syntax

There are two types of syntaxes for Inner join:

1) Non-ANSI
2) ANSI

Let us see the Non-ANSI syntax first:

select ename, dname
from emp, dept
where emp.deptno = dept.deptno

non-ansi-sql

One thing to note is that in the from clause of SQL statement, we have two tables separated by a comma. The mapping of common column is done in the where clause.

How does it work internally? Whenever a JOIN command is given, then each record of the first table is evaluated with each record of the second table. That means a Cartesian product happens and then as per the matching values of where clause, those records are shown. As per the Emp and Dept tables, 14 records of Emp are cross evaluated with 4 records of Dept table. So overall 14 * 4 = 56 evaluations will happen and matching value records are shown.

Now, let us remove the where clause and see what happens.

select ename, dname
from emp, dept

where-clause
rows-affected

Oops!! The Cartesian product result is displayed now! Due to the absence of where clause, each evaluation has been displayed as well. So this output is logically going wrong! Actually we have accidentally landed in another type of join that is Cross Join.

Now, let us have the common field for display purpose. We will include the Deptno column along with Ename and Dname columns in the column list.

select ename, dname, deptno
from emp, dept
where emp.deptno = dept.deptno

It has thrown the following error –

ambiguous-column

The common field’s name is same in both the tables. So a confusion arose in understanding that deptno from which table should be considered. The error would not have come if the common column had different name. In such a case where name is same, it is mandatory to mention the table name and . (dot) separator as prefix .

Let us rectify the above query.

select ename, dname, emp.deptno
from emp, dept
where emp.deptno = dept.deptno

inner-join-result

Ideally table name prefix should be mentioned for each column irrespective of whether it is common or uncommon. Advantage of this approach is that it will give clarity to any other user who is not habitual with the tables & its columns.

So the same query will be written as follows:

select emp.ename, dept.dname, emp.deptno
from emp, dept
where emp.deptno = dept.deptno

But in projects, table names many a times are lengthy. So mentioning a lengthy table name repeatedly the will be cumbersome. To avoid that, introduce a table alias in the from clause. Use that alias in the column list and where clause.

JOIN – Best practice

So now I am coming to the best practice of writing a JOIN statement.

select e.ename, d.dname, e.deptno
from emp e, dept d
where e.deptno = d.deptno

Internally, from clause is getting executed first, then the where clause and finally the select clause as per the above statement. 

INNER JOIN – ANSI Syntax

So far we have seen the NON-ANSI syntax of joining tables. Now let us see the ANSI syntax. As per ANSI Joins, there are two changes:

1. The comma within from clause gets replaced by Join type specification
2. The JOIN condition has to be given using ON clause instead of where clause.

The select statement will look like this:

select e.ename, d.dname, e.deptno
from emp e Inner Join dept d
On e.deptno = d.deptno

Amongst the multiple types of joins, the default type of join is an Inner Join. So Inner is an optional keyword. The query can have only Join as the keyword which will be as follows: 

select e.ename, d.dname, e.deptno
from emp e Join dept d
On e.deptno = d.deptno 

In Non-ANSI style of join, we have observed that if the where clause is skipped, then it shows a Cartesian product output. On the same lines, let us try to omit the On clause in ANSI and let’s see what happens.

select e.ename, d.dname, e.deptno
from emp e Join dept d

It is throwing an error.

incorrect-syntax

It means that On clause is mandatory in ANSI type of join. The advantage of ANSI join is that it is a dedicated inner join and it prevents an accidental Cartesian product.

About one-to-many relationship for Inner join

The inner or equi join will be logically correct if we have one-to-many relationship between the values of common field. In the Emp and Dept tables, there is such relationship, so the output is correct, i.e a matching deptno is once in the dept table and it is multiple times in the emp table.

What if there is a many-to-many relationship within the tables? Right now the deptno column is not having a primary constraint in the Dept table. So we can add one more record of the same deptno 10.

Insert into Dept Values(10, 'Testing', 'CALIFORNIA')
Go

So right now there are 2 records of the same deptno 10 in dept table.

Let us see that –

select * from dept

duplicate-records

Now we will execute the same ANSI join statement again.

select e.ename, d.dname, e.deptno
from emp e Join dept d
On e.deptno = d.deptno 

The output has logically gone wrong!

wrong-output

CLARK, KING and MILLER are displayed in both Accounting & Testing departments.

An Inner join will give correct result only when there is one-to-many relationship. Therefore normally this problem gets eliminated when we have Primary Key and Foreign Key relationship within the common field.

Let us delete this additional record of deptno 10 so that everything goes back to normal.

Delete from Dept
where DName = 'Testing'
go

Important features of Inner Join

1. Common field name may be same or different.
2. The data type and size of common fields should be ideally the same.
3. If the data type is not same, then using type casting functions the data types should be made same in the On clause.
4. There should be one-to-many relationship within the common field values.
I hope through this article, your Inner Join principle now is as sound as a dollar!

March 20, 2016

Shrink a Live SQL Server Database and Logs - Caveats and Best Practices

If you have deleted a lot of data in the database and truncated some tables, probably you may want to shrink database files to recover the unused space. Basically shrinking will move the data pages at the end of the file into unoccupied space, available in the front of the file.

You can use the DBCC SHRINKDATABASE command. For example, this command

DBCC SHRINKDATABASE (your_database,10) 

will decrease database size and allow for 10 percent free space. 

You can also shrink log file alone using DBCC SHRINKFILE command Take full back up of the database and run the following :

ALTER DATABASE your_database SET RECOVERY SIMPLE; 

DBCC SHRINKFILE (your_database_Log, 10); 

ALTER DATABASE your_database SET RECOVERY FULL; 

It will set the log file size to 10 MB 

Caveats of shrinking database and logs:

1. SHRINK commands should not be done frequently as it will increase the index fragmentation.

2. It is a fully logged operation which will increase the log size and slows down the performance

3. Shrinking is not possible while the current database is being backed up up or restored

4. Shrinking is not possible when columnstore index is enabled

Best practices

1. Use SHRINK only if a lot of operations such as TRUNCATE and DROP has lead to unused space

2. Take full backup before shrinking database files. If you shrink log file, set the Recovery mode into SIMPLE before shrinking and make it FULL after shrinking

3. Never turn ON AUTO SHRINK option

4. Always rebuild indexes after SHRINKING the database files to avoid fragmentation

March 10, 2016

Add an Identity to an Existing Column in large SQL Server tables

Suppose you have a table with a large amount of data without an identity column and you want to add an IDENTITY property to the existing column. It is important to note that in SQL Server you cannot alter a column to have an identity property. So how do we implement this?

There are three options.

Consider the following dataset:

create table testing (id int not null, value decimal(12,2))

insert into testing(id,value)
select 1,rand()*10000 union all
select 2,rand()*10000 union all
select 3,rand()*10000 union all
select 4,rand()*10000 union all
select 5,rand()*10000

Method 1: Add a new column with IDENTITY property:

alter table testing add id_new int identity(1,1);

Now the table testing will have data with identity values and you can just use this identity column for any references.

Method 2: Create a new table with identity column and move the data over there

create table testing1 (id int identity(1,1), value decimal(12,2))

Insert into testing1(value)
select value from testing;

Now the table testing1 will have data with identity values and just like with Method 1, you can use this identity column for any reference.

Method 3: Use ALTER TABLE SWITCH option

Create a new table with the same schema as that of the table testing but with IDENTITY property

create table testing2 (id int identity(1,1), value decimal(12,2))

Now use the SWITCH option to move data to this table

alter table testing switch to testing2;

Now all the data is available in testing2 which has an identity column.

Now just drop testing table and rename testing2 to testing

drop table testing;
EXEC sp_rename 'testing2','testing'

Now the table testing has an IDENTITY column with all the data.

Now the question is "which method is the best to use?"

Well, Methods 1 and 2 may be time consuming if the source table has millions of rows. Method 3 is very fast as it switches the partition from one table to another very quickly. So this method will take only few seconds compared to the other two methods.

Caveats: You may need to take care of primary key - foreign key relations before using any of these methods.

February 25, 2016

CUBE and ROLLUP with GROUPBY in SQL Server

CUBE and ROLLUP are SQL Server operators which are always used along with a GROUP BY clause. These operators perform multi level aggregations at each column specified in the GROUP BY Clause. ROLLUP will do aggregation in the following manner:

Operation : ROLLUP(col1,col2)
Groupings : col1,col2
            col1, ALL(NULL)
            ALL (NULL)

CUBE will do aggregation in the following manner:

Operation : CUBE(col1,col2)
Groupings : col1,col2
            col1
        ALL(NULL), col2
            ALL(NULL)

Let us create the following dataset and explore the resultset:

create table #sales_data(region varchar(30), sales_date datetime, 
sales_amount decimal(12,2))

truncate table #sales_data
insert into #sales_data(region,sales_Date,sales_amount)
select 'South Asia', '2014-01-01',30000 union all
select 'South Asia', '2014-05-01',72000 union all
select 'South Asia', '2015-05-01',6700 union all
select 'North America', '2014-03-01',12500 union all
select 'North America', '2015-05-01',80000 union all
select 'North America', '2015-05-01',9000 union all
select 'Australia', '2014-02-01',88000 union all
select 'Australia', '2015-01-01',144000 union all
select 'Australia', '2015-05-01',178000 

Using ROLLUP

select region, year(sales_date) as sales_year, 
sum(sales_amount) as total_spent  from #sales_data
group by region, year(sales_date)
with ROLLUP

The result of the above statement is as follows:

region                         sales_year  total_spent
------------------------------ ----------- ---------------
Australia                      2014        88000.00
Australia                      2015        322000.00
Australia                      NULL        410000.00
North America                  2014        12500.00
North America                  2015        89000.00
North America                  NULL        101500.00
South Asia                     2014        102000.00
South Asia                     2015        6700.00
South Asia                     NULL        108700.00
NULL                           NULL        620200.00

As you can see, Grouping is done in three cases.

1. Grouping by the columns region and sales_year
2. Grouping by the columns region only
3. Grouping by no columns

Using CUBE

select region, year(sales_date) as sales_year, 
sum(sales_amount) as total_spent  from #sales_data
group by region, year(sales_date)
with CUBE

The result of the above statement is as follows:

region                         sales_year  total_spent
------------------------------ ----------- -----------------
Australia                      2014        88000.00
North America                  2014        12500.00
South Asia                     2014        102000.00
NULL                           2014        202500.00
Australia                      2015        322000.00
North America                  2015        89000.00
South Asia                     2015        6700.00
NULL                           2015        417700.00
NULL                           NULL        620200.00
Australia                      NULL        410000.00
North America                  NULL        101500.00
South Asia                     NULL        108700.00

As you can see, Grouping is done in four cases.

1. Grouping by the columns region and sales_year
2. Grouping by the column sales_year only
3. Grouping by the column region
4. Grouping by no columns

You can use both of these operators to achieve aggregation at multiple levels. CUBE performs more number of aggregations with multiple combination of the grouping columns.

If you just want to perform multiple groupings not excluding the first column, you can use ROLLUP, otherwise use CUBE.

Note: Usage of ROLLUP and CUBE in SQL Server are time-consuming options as they do groupings at multiple level. If you use a reporting tool to display the result, you should avoid doing it in SQL and do it in the reporting tool itself .

February 17, 2016

SQL Server - Time difference in HH hours, MM minutes and SS seconds format‏

Suppose you have two datetime values and want to express the difference in the format HH Hours, MM minutes and SS seconds. You can use CONVERT function with style 108 as shown below:

declare @start_date datetime, @end_date datetime
select @start_date ='20121210 11:19:33', @end_date='20121210 19:28:12'      


select stuff(stuff(convert(varchar(10),@end_date-@start_date,108),6,1,' Minutes and '),3,1,' Hours, ')+' Seconds'


The variables @start_date and @end_date have values for start and end times. We can directly substract @start_date from @end_date and the result is still in datetime. CONVERT with style 108 will only extract time part in the format HH:MM:SS.

Now replace : after HH with Hours, replace : after MM with minutes and , append ' Seconds' at the end of time and the result is at the format HH Hours, MM minutes and SS seconds.

The result of the above code is

08 Hours, 08 Minutes and 39 Seconds

January 31, 2016

SQL Server: Return Newly Inserted Row

Whenever you insert a new row in a SQL Server table that contains default values, you may want to return the newly inserted row to the user, along with the default value generated. However while doing a plain INSERT statement, all you get is the total number of rows affected.

Here’s an example:

insert-row

Here we are using the NEWID() to generate a unique customer number in each row. However as you can see, the user does not get to see the default ID that got generated for the newly inserted rows. In order to get back the row values that were inserted, use the OUTPUT clause of the INSERT statement as shown here:

INSERT INTO #TT (Name, AreaCode)
OUTPUT INSERTED.ID, INSERTED.Name, INSERTED.AreaCode
SELECT 'Suprotim', 2355 UNION ALL
SELECT 'Anush', 2388
 
and this time you get to see the newly inserted rows:

sql-server-output-insert

As you can see, we have added the OUTPUT clause right after the INSERT statement. The rows inserted into the table are captured in the virtual table INSERTED and returned back as a result set.

In case you are wondering, yes it is possible to capture the result set in a table or table variable. Assuming there is a Table variable called @TempTbl, just use the following:

INSERT INTO #TT (Name, AreaCode)
OUTPUT INSERTED.ID, INSERTED.Name, INSERTED.AreaCode
INTO @TempTbl
SELECT 'Suprotim', 2355 UNION ALL
SELECT 'Anush', 2388
 
and now you can do further processing on this data using @TempTbl.

January 25, 2016

SQL Server 2012 Query Pagination

This article was authored by Praveen Dabade.

In SQL Server 2012, Microsoft introduced a couple of T-SQL Enhancements. One of them is Query Pagination which we will explore today.

For this demonstration, I am using the Northwind database to demonstrate this new feature. Now in most of the applications, a common requirement is how to fetch the data from the database servers, page wise.

In earlier versions of SQL Server like SQL Server 2005/2008/R2, we can implement Pagination by using different techniques. For example, we implement pagination using ROW_NUMBER() function or CTE - Common Table Expression.

In SQL Server 2012, Microsoft has introduced Pagination as a part of Select query in a Order By clause. Now you will have to use OFFSET and FETCH NEXT with the order by clause.

Let's take a look at a few examples. I am using Northwind database for this demonstration. I have created a Stored Procedure which takes two parameters. First parameter takes the page number and the second parameter ask you to fetch the no. of records for that page. The stored procedure code is as below -

query-pagination

If you execute the above stored procedure

EXEC FetchPagedRecords 2,10

you will get the following results -

image

The OFFSET specifies the number of rows to skip before it starts returning the rows and FETCH NEXT specifies the number of rows to be returned.

Microsoft has introduced an easy way of implementing Data Paging in SQL Server 2012 by adding OFFSET and FETCH NEXT in an Order By clause. I hope you will use it in your applications.