A Real-world Example

I first encountered the power of DECODE a number of years ago whilst working on a government project alongside Oracle consultants. We were implementing an Application Form Processing system that would be used to maintain a record of every form sent out and received in one very large EVENTS table with 20-25 million rows (it seemed big at the time). The design was based on a legacy of an earlier system that we had to co-exist with. One of the key functions of the system was the production of management information reports, used to measure performance against key indicators. (e.g. 'How many forms did we send out last week?', 'How long does it take us to send a reply after we have received form S937?', 'Just what are all these forms for?'.)

We were using a very early release of Reports 2 to help us to combine various queries to generate high quality report output with a standard format template. This was initially very successful against our few thousand test records, as these things usually are, until we increased the volume of our test data and realised that our reports would take hours and days to run. In the shadow of a looming deadline, we consulted Oracle for advice and were encouraged to reduce the number of individual queries in each Reports2 Data Model so we spent a couple of desperate days re-thinking our approach. The DECODE function came to our rescue, reducing the execution time of our reports from hours to a few minutes and days to a few hours.

The EVENTS table in the database contained a row for each event during the application handling process. For the purposes of our example we'll use the EVENTS table defined below, which uses a subset of the columns from the original table.

EVENTS Table EVENT_TYPE VARCHAR2(2) EVENT_DATE DATE Which contains the following Receipt Events AX, AH, AB and Send Events TY, LO, LL and Other Events QC, DL, MA, IC, JF

Receipt events were generated whenever a form was received back from the applicant, Send events when the organisation sent a form to the applicant and a small number of other events covered a variety of activities. The report needed to show the number of receipt and send events in the last year and also in the last 2 days. For example,

Total Receipts Total Sends Recent Receipts Recent Sends 2,354,456 1,857,374 23,950 21,349

Our first attempt used one discrete query for each of the values shown in the report, using Reports2 to combine the results. For example, the Total Receipts value was generated using the SQL shown in example 7a (but remember that there were 4 of these queries used in the report, each performing a full scan of our big events table).

Example 7a

SELECT COUNT(1) FROM events WHERE event_type IN ('AX', 'AH', 'AB') AND event_date > ADD_MONTHS(sysdate, -12);

Our first thought about how to improve the report was that, given that we were interested in all the receipt and send events during the past year, why not have one query retrieve both sets of events in one pass of the EVENTS table and then use DECODE to generate two different SUMs for the different events? Once we'd achieved that it occurred to us that, since we were retrieving the event data for the last year, we should be able to use DECODE against the same scan of the table to extract the figures for events in the last 2 days separately. The end result is shown in example 7b.

Example 7b

SELECT SUM(DECODE(event_type, 'AX', 1, 'AH', 1, 'AB', 1)) total_receipts, SUM(DECODE(event_type, 'TY', 1, 'LO', 1, 'LL', 1)) total_sends, SUM(DECODE(GREATEST(event_date, TRUNC(SYSDATE - 2)), event_date, DECODE(event_type, 'AX', 1, 'AH', 1, 'AB', 1) )) recent_receipts, SUM(DECODE(GREATEST(event_date, TRUNC(SYSDATE - 2)), event_date, DECODE(event_type, 'TY', 1, 'LO', 1, 'LL', 1) )) recent_sends FROM events WHERE event_type IN ('AX', 'AH', 'AB', 'TY', 'LO', 'LL') AND event_date > ADD_MONTHS(SYSDATE, -12)

There are several important things to note about this statement. First, notice the way that we use a check against event_type in the WHERE clause to reduce the number of events retrieved before applying the DECODE logic in the SELECT clause. Likewise with event_date, because we only want this year's events. The resulting WHERE clause will return all send and receive events from the past year.

Then the SELECT clause does the real work. If we look at the first column definition, we will see that we are returning the sum of 1 if the event_type is AX, AH or AB (i.e. a receipt event) or NULL (the default) if it is any other type of event. This has the effect of returning the total number of receipt events. The second column definition is almost identical, but for send events. The third column uses a nested DECODE to first check the event_date to see if it is within the past 2 days and, if it is, an inner DECODE to test whether it's a receipt event. If both are true, 1 is added to the SUM, if either is not true, nothing (NULL) is added to the SUM, giving us the total number of receipt events in the past 2 days.

The changes resulted in the report running almost 4 times more quickly, because we were able to produce the results of 4 separate queries using only one query which scanned through the same data and then sub-divided it to produce the 4 separate totals. Remember that this report only contains 4 values, whereas some of the reports contained 25, 50 or 100! This use of SUM, DECODE and return values of 1 or NULL is an excellent way of using one SQL statement to perform large numbers of apparently unrelated COUNTs against one data set, instead of performing numerous SELECT COUNT(1) queries with different WHERE clauses. This technique is particularly useful for MIS reporting against data warehouses.


The Power of Decode