Problem
You need to convert a table of rows into a table of columns. Usually associated with a join.
Example
Say you have two tables:
event_table
| row_id | event_type |
|---|---|
| 1 | Example 1 |
| 2 | Example 2 |
| 3 | Example 3 |
answer_table
| row_id | question_id | answer |
|---|---|---|
| 1 | 1 | Answer 1 - 1 |
| 1 | 2 | Answer 1 - 2 |
| 2 | 1 | Answer 2 - 1 |
| 3 | 3 | Answer 3 - 3 |
Desired result
| row_id | event_type | answer_1 | answer_2 | answer_3 |
|---|---|---|---|---|
| 1 | example_1 | Answer 1-1 | Answer 1-2 | |
| 2 | example_2 | Answer 2-1 | ||
| 3 | example_3 | Answer 3-3 |
Solution
Oracle offers a very useful pivot() function, but it has some problems:
- It deals better with aggregate numbers (sum(), max(), etc). For example counting employees in a department or count of reports generated by section.
- It can have some weird issues if you only want to pull a subset of the results (for example you only want questions 1 and 2).
- If you dont use it often, it is a pain to have to re-learn how best to use it everytime (especially trying to figure out how to get it to return named columns rather than 1,2 in the example above).
A very simple way around the problem for relatively simple queries like the above is to join the two tables then use list_agg to "compress" the answer table.
SELECT ET.row_id, ET.event_type,
LIST_AGG(CASE WHEN ANS.question_id = 1 THEN ANS.answer END) AS answer_1,
LIST_AGG(CASE WHEN ANS.question_id = 2 THEN ANS.answer END) AS answer_2,
LIST_AGG(CASE WHEN ANS.question_id = 3 THEN ANS.answer END) AS answer_3
FROM event_table ET
LEFT JOIN answer_table ANS ON ET.row_id = ANS.row_id
GROUP BY ET.row_id, ET.event_type
This method breaks down when the list of pivot columns gets big or if the tables get too large, but works well for smaller tables (thousands of rows) or when the pivot involves relatively few rows to columns.
It has the advantage that the developer can have total control over which questions are returned (in a recent example we needed only 10 rows per event out of 100 possible answers) and the names of the columns can be controlled so the rest of the code knows what to look for.