Wildlink's Technology Blog

An occasionally updated list of informative articles.
Pulled from our internal wiki.

Oracle - Simple Pivot Query

Article Information

Article Content

Problem

You need to convert a table of rows into a table of columns. Usually associated with a join.

Example

Say you have two tables:

event_table

row_id event_type
1 Example 1
2 Example 2
3 Example 3

answer_table

row_id question_id answer
1 1 Answer 1 - 1
1 2 Answer 1 - 2
2 1 Answer 2 - 1
3 3 Answer 3 - 3

Desired result

row_id event_type answer_1 answer_2 answer_3
1 example_1 Answer 1-1 Answer 1-2
2 example_2 Answer 2-1
3 example_3 Answer 3-3

Solution

Oracle offers a very useful pivot() function, but it has some problems:

  • It deals better with aggregate numbers (sum(), max(), etc). For example counting employees in a department or count of reports generated by section.
  • It can have some weird issues if you only want to pull a subset of the results (for example you only want questions 1 and 2).
  • If you dont use it often, it is a pain to have to re-learn how best to use it everytime (especially trying to figure out how to get it to return named columns rather than 1,2 in the example above).

A very simple way around the problem for relatively simple queries like the above is to join the two tables then use list_agg to "compress" the answer table.

    SELECT  ET.row_id, ET.event_type,
            LIST_AGG(CASE WHEN ANS.question_id = 1 THEN ANS.answer END) AS answer_1,
            LIST_AGG(CASE WHEN ANS.question_id = 2 THEN ANS.answer END) AS answer_2,
            LIST_AGG(CASE WHEN ANS.question_id = 3 THEN ANS.answer END) AS answer_3
      FROM  event_table ET
      LEFT JOIN answer_table ANS ON ET.row_id = ANS.row_id
     GROUP BY ET.row_id, ET.event_type

This method breaks down when the list of pivot columns gets big or if the tables get too large, but works well for smaller tables (thousands of rows) or when the pivot involves relatively few rows to columns.

It has the advantage that the developer can have total control over which questions are returned (in a recent example we needed only 10 rows per event out of 100 possible answers) and the names of the columns can be controlled so the rest of the code knows what to look for.