Home »

Pivot function in Oracle

Start here

Advertisements

PIVOT is a SQL operation, introduced in Oracle 11g, that lets you write cross tabulation (also called transposed, crosstab and matrix) queries that rotate rows into columns while aggregating data in the rotation process.

With Oracle 11g, we can do as following :

SELECT *
  FROM (SELECT job, deptno, sum(sal) sal FROM emp GROUP BY job, deptno)
         PIVOT ( sum(sal) FOR deptno IN (10, 20, 30, 40) );

JOB               10         20         30         40
--------- ---------- ---------- ---------- ----------
CLERK           1300       1900        950
SALESMAN                              5600
PRESIDENT       5000
MANAGER         2450       2975       2850
ANALYST                    6000

For Oracle versions prior to 11g, the same results can be obtained with this (somewhat bulkier) query:

SELECT  *
  FROM  (SELECT job,
                sum(decode(deptno,10,sal)) DEPT10,
                sum(decode(deptno,20,sal)) DEPT20,
                sum(decode(deptno,30,sal)) DEPT30,
                sum(decode(deptno,40,sal)) DEPT40
           FROM scott.emp
       GROUP BY job)
ORDER BY 1;
Some pivot option Sample

Our data shows summaries of retail sales events. There are some points of interest to note.

1) These are summary rows by terminal, by day, hence there is a total amount, a total discount and a total tax.
2) This summary contains the same data twice, because we have physically stored it for two different currencies. In this example, the sale was made in US dollars (Original Currency), and a second row was created for Canadian currency (Base Currency) (our company is maybe Headquartered in Canada?).

SQL> desc retail_sale
 Name                                Null?    Type
 ----------------------------------- -------- --------------
 RETAIL_SALE_ID                      NOT NULL NUMBER
 TERMINAL_ID                         NOT NULL NUMBER
 SALE_DATE                           NOT NULL DATE
 CURRENCY_CATEGORY_CODE                       VARCHAR2(8)
 TTL_AMT                                      NUMBER
 TTL_DIS                                      NUMBER
 TTL_TAX                                      NUMBER

This shows the data as two rows. Consider this the original format we are given.

SQL> select *
  2  from retail_sale
  3  /

                                     Currency
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category TTL_AMT TTL_DIS TTL_TAX
-------------- ----------- --------- -------- ------- ------- -------
             1        1.00 14-JUN-07 ORIGINAL  100.00   20.00    4.80
             2        1.00 14-JUN-07 BASE      117.00   23.40    5.62

2 rows selected.

Using the proper Pivot(s), we could change the two rows into one row with multiple columns, letting column names distinguish what the data is.

TERMINAL_ID SALE_DATE BASE_TTL_AMT BASE_TTL_DIS BASE_TTL_TAX ORIGINAL_TTL_AMT ORIGINAL_TTL_DIS ORIGINAL_TTL_TAX
----------- --------- ------------ ------------ ------------ ---------------- ---------------- ----------------
       1.00 14-JUN-07       117.00        23.40         5.62           100.00            20.00             4.80

1 row selected.

Or using different Pivot(s), we could change the two rows into six rows, by employing a generic amount field, and adding a rowtype discriminator (Dollar Name in this example).

                                     Currency Dollar
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category Name         AMT
-------------- ----------- --------- -------- -------- -------
             1        1.00 14-JUN-07 ORIGINAL TTL_AMT   100.00
             1        1.00 14-JUN-07 ORIGINAL TTL_DIS    20.00
             1        1.00 14-JUN-07 ORIGINAL TTL_TAX     4.80
             2        1.00 14-JUN-07 BASE     TTL_AMT   117.00
             2        1.00 14-JUN-07 BASE     TTL_DIS    23.40
             2        1.00 14-JUN-07 BASE     TTL_TAX     5.62

6 rows selected.

Or using yet again different Pivot(s), we could turn our two rows into three rows. This one is most interesting as it involves two pivot operations. Hence we are turning rows into columns and then columns into rows (or was it columns into rows and then rows into columns? (or does the order of pivots even matter?)).
                      Dollar
TERMINAL_ID SALE_DATE Name     BASE_AMT ORIGINAL_AMT
----------- --------- -------- -------- ------------
       1.00 14-JUN-07 TTL_AMT    117.00       100.00
       1.00 14-JUN-07 TTL_DIS     23.40        20.00
       1.00 14-JUN-07 TTL_TAX      5.62         4.80

3 rows selected.

Here is the SQL for building the example:

1) turn rows into columns
2) turn columns into rows

drop table retail_sale
/

create table retail_sale
(
 retail_sale_id number not null
,terminal_id number not null
,sale_date date not null
,currency_category_code varchar2(8) -- base, original
,ttl_amt number
,ttl_dis number
,ttl_tax number
)
/

create unique index retail_sale_pk on retail_sale (retail_sale_id)
/
create unique index retail_sale_uk1 on retail_sale (terminal_id,sale_date,currency_category_code)
/

alter table retail_sale
   add constraint retail_sale_pk primary key (retail_sale_id)
   add constraint retail_sale_uk1 unique (terminal_id,sale_date,currency_category_code)
/

insert into retail_sale values (1,1,trunc(sysdate),'ORIGINAL',100,20,4.80);
insert into retail_sale values (2,1,trunc(sysdate),'BASE',100*1.17,20*1.17,round(4.8*1.17,2));

commit
/
set numformat 990.99
col retail_sale_id format 990
col currency_categorY_code format a8 head 'Currency|Category'
col dollar_name format a8 head 'Dollar|Name'

select *
from retail_sale
/

select
       terminal_id
      ,sale_date
      ,case when currency_category_code = 'BASE'     then ttl_amt end base_ttl_amt
      ,case when currency_category_code = 'BASE'     then ttl_dis end base_ttl_dis
      ,case when currency_category_code = 'BASE'     then ttl_tax end base_ttl_tax
      ,case when currency_category_code = 'ORIGINAL' then ttl_amt end original_ttl_amt
      ,case when currency_category_code = 'ORIGINAL' then ttl_dis end original_ttl_dis
      ,case when currency_category_code = 'ORIGINAL' then ttl_tax end original_ttl_tax
from retail_sale
/

select
       terminal_id
      ,sale_date
      ,sum(case when currency_category_code = 'BASE'     then ttl_amt end) base_ttl_amt
      ,sum(case when currency_category_code = 'BASE'     then ttl_dis end) base_ttl_dis
      ,sum(case when currency_category_code = 'BASE'     then ttl_tax end) base_ttl_tax
      ,sum(case when currency_category_code = 'ORIGINAL' then ttl_amt end) original_ttl_amt
      ,sum(case when currency_category_code = 'ORIGINAL' then ttl_dis end) original_ttl_dis
      ,sum(case when currency_category_code = 'ORIGINAL' then ttl_tax end) original_ttl_tax
from retail_sale
group by
       terminal_id
      ,sale_date
/


select retail_sale_id
      ,terminal_id
      ,sale_date
      ,currency_category_code
      ,case
           when dollar_type.rowno = 1 then 'TTL_AMT'
           when dollar_type.rowno = 2 then 'TTL_DIS'
           when dollar_type.rowno = 3 then 'TTL_TAX'
       end dollar_name
      ,case
           when dollar_type.rowno = 1 then ttl_amt
           when dollar_type.rowno = 2 then ttl_dis
           when dollar_type.rowno = 3 then ttl_tax
       end amt
from retail_sale
    ,(
      select rownum rowno
      from dual
      connect by level <= 3
     ) dollar_type
order by 1,2,3,5
/


select terminal_id
      ,sale_date
      ,case
           when dollar_type.rowno = 1 then 'TTL_AMT'
           when dollar_type.rowno = 2 then 'TTL_DIS'
           when dollar_type.rowno = 3 then 'TTL_TAX'
       end dollar_name
      ,case
           when dollar_type.rowno = 1 then base_ttl_amt
           when dollar_type.rowno = 2 then base_ttl_dis
           when dollar_type.rowno = 3 then base_ttl_tax
       end base_amt
      ,case
           when dollar_type.rowno = 1 then original_ttl_amt
           when dollar_type.rowno = 2 then original_ttl_dis
           when dollar_type.rowno = 3 then original_ttl_tax
       end usd_amt
from (
      select
             terminal_id
            ,sale_date
            ,sum(case when currency_category_code = 'BASE'     then ttl_amt end) base_ttl_amt
            ,sum(case when currency_category_code = 'BASE'     then ttl_dis end) base_ttl_dis
            ,sum(case when currency_category_code = 'BASE'     then ttl_tax end) base_ttl_tax
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_amt end) original_ttl_amt
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_dis end) original_ttl_dis
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_tax end) original_ttl_tax
      from retail_sale
      group by
             terminal_id
            ,sale_date
     )
    ,(
      select rownum rowno
      from dual
      connect by level <= 3
     ) dollar_type
order by 1,2,3
/

So lets walk the pivots shall we…

From Two rows to One row (a row to column pivot):

Here we convert our data from two rows with three columns and a row type discriminator (Currency Category), to one row with six columns where each column name distinguishes the identity of the data. Note that the Currency Category column is no longer necessary after the pivot and in deed does not even fit our data after the pivot, so we drop it. Note also that we are forced to drop the primary key of our original table because it too no longer maps to our data after the pivot since two rows have been combined into one.

Ah but there is a little hitch. You see our first pivot query generated two rows, with six columns not one row with six columns. Each of the rows had only a portion of the result columns filled in (the columns that came from the original row that generated this one). What we want is a single row that contains values filled in for all columns.

The most straight forward way to get a single row with filled in columns is to group by the correct keys, which is essentially everything else in the table but the pivoted data. In the case of a non-numeric field you can use max or min instead of sum.

SQL> select *
  2  from retail_sale
  3  /

                                     Currency
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category TTL_AMT TTL_DIS TTL_TAX
-------------- ----------- --------- -------- ------- ------- -------
             1        1.00 14-JUN-07 ORIGINAL  100.00   20.00    4.80
             2        1.00 14-JUN-07 BASE      117.00   23.40    5.62

2 rows selected.

SQL> select
  2         terminal_id
  3        ,sale_date
  4        ,case when currency_category_code = 'BASE'     then ttl_amt end base_ttl_amt
  5        ,case when currency_category_code = 'BASE'     then ttl_dis end base_ttl_dis
  6        ,case when currency_category_code = 'BASE'     then ttl_tax end base_ttl_tax
  7        ,case when currency_category_code = 'ORIGINAL' then ttl_amt end original_ttl_amt
  8        ,case when currency_category_code = 'ORIGINAL' then ttl_dis end original_ttl_dis
  9        ,case when currency_category_code = 'ORIGINAL' then ttl_tax end original_ttl_tax
 10  from retail_sale
 11  /

TERMINAL_ID SALE_DATE BASE_TTL_AMT BASE_TTL_DIS BASE_TTL_TAX ORIGINAL_TTL_AMT ORIGINAL_TTL_DIS ORIGINAL_TTL_TAX
----------- --------- ------------ ------------ ------------ ---------------- ---------------- ----------------
       1.00 14-JUN-07                                                  100.00            20.00             4.80
       1.00 14-JUN-07       117.00        23.40         5.62

2 rows selected.

Oops, that looks like two rows, lets try again.

SQL> select
  2         terminal_id
  3        ,sale_date
  4        ,sum(case when currency_category_code = 'BASE'     then ttl_amt end) base_ttl_amt
  5        ,sum(case when currency_category_code = 'BASE'     then ttl_dis end) base_ttl_dis
  6        ,sum(case when currency_category_code = 'BASE'     then ttl_tax end) base_ttl_tax
  7        ,sum(case when currency_category_code = 'ORIGINAL' then ttl_amt end) original_ttl_amt
  8        ,sum(case when currency_category_code = 'ORIGINAL' then ttl_dis end) original_ttl_dis
  9        ,sum(case when currency_category_code = 'ORIGINAL' then ttl_tax end) original_ttl_tax
 10  from retail_sale
 11  group by
 12         terminal_id
 13        ,sale_date
 14  /

TERMINAL_ID SALE_DATE BASE_TTL_AMT BASE_TTL_DIS BASE_TTL_TAX ORIGINAL_TTL_AMT ORIGINAL_TTL_DIS ORIGINAL_TTL_TAX
----------- --------- ------------ ------------ ------------ ---------------- ---------------- ----------------
       1.00 14-JUN-07       117.00        23.40         5.62           100.00            20.00             4.80

1 row selected.

SQL>

From Two rows to Six rows (a column to row pivot):

Here we do a pivot in the opposite direction as before. We take multiple columns with unique names and stuff them into a single column of a more generic nature and then add a row type discriminator to tell use where each row came from (Dollar Name in this case). I used the simplistic method of making the row type discriminator values be the original column names. If you are following the modeling sub-thread then you recongnize that this row type discriminator become part of our unique key to the results rowset.

You will notice please that we used the generic row generator technique of connect by level against dual (very handy). This I believe is only good for 9i and above. We generate three rows this way, three because we have three columns to pivot and each column will turn into a row after our pivot. Basically this means we copy each row from our source data three times, and then depending upon which version of the row we have (1 or 2 or 3), we pick off a specific set of columns to put into our generic column. In our case we mapped 1 to TTL_AMT, 2 to TTL_DIS and 3 to TTL_TAX. You should always double check your work to make sure the row type discriminator you use is the correct label for the data value you have picked off.

SQL> select *
  2  from retail_sale
  3  /

                                     Currency
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category TTL_AMT TTL_DIS TTL_TAX
-------------- ----------- --------- -------- ------- ------- -------
             1        1.00 14-JUN-07 ORIGINAL  100.00   20.00    4.80
             2        1.00 14-JUN-07 BASE      117.00   23.40    5.62

2 rows selected.

SQL> select retail_sale_id
  2        ,terminal_id
  3        ,sale_date
  4        ,currency_category_code
  5        ,case
  6             when dollar_type.rowno = 1 then 'TTL_AMT'
  7             when dollar_type.rowno = 2 then 'TTL_DIS'
  8             when dollar_type.rowno = 3 then 'TTL_TAX'
  9         end dollar_name
 10        ,case
 11             when dollar_type.rowno = 1 then ttl_amt
 12             when dollar_type.rowno = 2 then ttl_dis
 13             when dollar_type.rowno = 3 then ttl_tax
 14         end amt
 15  from retail_sale
 16      ,(
 17        select rownum rowno
 18        from dual
 19        connect by level <= 3
 20       ) dollar_type
 21  order by 1,2,3,5
 22  /

                                     Currency Dollar
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category Name         AMT
-------------- ----------- --------- -------- -------- -------
             1        1.00 14-JUN-07 ORIGINAL TTL_AMT   100.00
             1        1.00 14-JUN-07 ORIGINAL TTL_DIS    20.00
             1        1.00 14-JUN-07 ORIGINAL TTL_TAX     4.80
             2        1.00 14-JUN-07 BASE     TTL_AMT   117.00
             2        1.00 14-JUN-07 BASE     TTL_DIS    23.40
             2        1.00 14-JUN-07 BASE     TTL_TAX     5.62

6 rows selected.

Lastly Two rows into Three rows (a double pivot example):

You will notice that there is really nothing new here. We just did both pivot operations. First we coded the sql for one pivot, and then we embedded it into the sql for then next pivot. So, the first pivot turns two rows into one row removing a row type discriminator naturally present in the data. Then the second pivot turns one row into three rows, adding back a new row type discriminator to distinguish rows with. Can you tell me what the unique key to our data is after these pivots?

Some might ask if the order of operations is important. For example, can we do the second pivot shown here first and the first second? Answer is sure you can. Makes no difference. The end result will be the same. I would suggest to you that you write this code as an exercise. It will ensure you understand what you have just read. Depending upon how smart Oracle is though, there might be a performance difference. Maybe someone could look into that for us and give us a test case to work with.

SQL> select *
  2  from retail_sale
  3  /

                                    Currency
RETAIL_SALE_ID TERMINAL_ID SALE_DATE Category TTL_AMT TTL_DIS TTL_TAX
-------------- ----------- --------- -------- ------- ------- -------
             1        1.00 14-JUN-07 ORIGINAL  100.00   20.00    4.80
             2        1.00 14-JUN-07 BASE      117.00   23.40    5.62

2 rows selected.

select terminal_id
      ,sale_date
      ,case
           when dollar_type.rowno = 1 then 'TTL_AMT'
           when dollar_type.rowno = 2 then 'TTL_DIS'
           when dollar_type.rowno = 3 then 'TTL_TAX'
       end dollar_name
      ,case
           when dollar_type.rowno = 1 then base_ttl_amt
           when dollar_type.rowno = 2 then base_ttl_dis
           when dollar_type.rowno = 3 then base_ttl_tax
       end base_amt
      ,case
           when dollar_type.rowno = 1 then original_ttl_amt
           when dollar_type.rowno = 2 then original_ttl_dis
           when dollar_type.rowno = 3 then original_ttl_tax
       end original_amt
from (
      select
             terminal_id
            ,sale_date
            ,sum(case when currency_category_code = 'BASE'     then ttl_amt end) base_ttl_amt
            ,sum(case when currency_category_code = 'BASE'     then ttl_dis end) base_ttl_dis
            ,sum(case when currency_category_code = 'BASE'     then ttl_tax end) base_ttl_tax
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_amt end) original_ttl_amt
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_dis end) original_ttl_dis
            ,sum(case when currency_category_code = 'ORIGINAL' then ttl_tax end) original_ttl_tax
      from retail_sale
      group by
             terminal_id
            ,sale_date
     )
    ,(
      select rownum rowno
      from dual
      connect by level <= 3
     ) dollar_type
order by 1,2,3
/

                      Dollar
TERMINAL_ID SALE_DATE Name     BASE_AMT ORIGINAL_AMT
----------- --------- -------- -------- ------------
       1.00 14-JUN-07 TTL_AMT    117.00       100.00
       1.00 14-JUN-07 TTL_DIS     23.40        20.00
       1.00 14-JUN-07 TTL_TAX      5.62         4.80

3 rows selected.


 
There you have it. This is the fastest way I have found to do these kinds of pivots inside Oracle. No combination of PL/SQL (slow by slow as Tom of ASKTOM Home fame would say), nor objects is any better, at least not that I have seen. I thought I read somewhere of analytics maybe being another way to do pivots but I have not figured that one out yet. If anyone knows this, please post an example.

So these techniques are pretty generic in nature. You use CASE and GROUP BY to pivot from rows to columns. You use CASE and ROW DUPLICATION to pivot from columns to rows. You subtract or add a rowtype discriminator in the process as needed, and also alter your primary and/or unique keys at the same time.

Check out these links for more information

http://www.orafaq.com/node/1871

http://tkyte.blogspot.com/2007/03/stringing-them-up.html

http://www.akadia.com/services/ora_analytic_functions.html

http://technology.amis.nl/blog/?p=1207

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s

%d bloggers like this: