Performance Considerations

Although size reduction is the main (and only) reason to switch on compression, it makes sense to look at the various option and how they affect the load times.

Size Reduction

Below is table with the size reduction compared to a simple uncompressed table. These numbers have been compiled from results published by Uwe Hesse (A), Yannick Jaquier (B), Arup Nanda (C), and Talip Hakan Öztürk (D). The values marked with an asterisk have not been included in the mean size reduction.

Option Size (%) Size A (%) Size B (%) Size C (%) Size D (%)
BASIC 48 N/A 65.63 47.93 29.41
OLTP 59 N/A 78.68 56.26 41.17
QUERY LOW 19 14.70 22.47 19.79 N/A
QUERY HIGH 16 8.82 13.44 0.52 41.17
ARCHIVE LOW 10 6.62 12.81 0.52* N/A
ARCHIVE HIGH 8 4.41 11.13 0.33* 29.41*

The numbers shown depend heavily on the data, so always be sure to try the different options with your own data! The percentages should only be read as an indication of the potential storage savings, not the gospel of Saint Oracle. You have been warned.

It’s clear that the least compression is offered by OLTP, which is not surprising. Similarly, we see that in most cases the hybrid column compression shaves off most disk space.

CPU Overhead

What about data modification operations?

Talip Hakan Öztürk and Uwe Hesse have done extensive tests with the various options. Again, below are the summarized insert times compared to the BASIC option (not shown).

Option Insert time (%) Time A (%) Time D (%)
OLTP 316 289.4 343.1
QUERY LOW 69 70.35 67.73
QUERY HIGH 117 147.9 85.40
ARCHIVE LOW 145 201.2 88.29
ARCHIVE HIGH 710 783.5 636.0

The insert for the OLTP option is a conventional insert because in an OLTP environment a direct-path insert does not make sense. For the other compression methods the inserts are direct-path inserts. Note that conventional inserts generate a lot more undo and redo, which in itself is a cause for concern, especially with regard to performance.

Although there is quite a bit of variation in the insert times compared to the basic compression option, it seems that QUERY LOW achieves both a substantial size and time reduction. The most variation is visible for the hybrid columnar compression methods.

Yannick Jaquier reports that CPU time increases by 6.5 percentage points for the insert into a table with either ARCHIVE HIGH or QUERY HIGH compression as compared to the uncompressed case. However, the number of consistent gets for a full table scan are 88% and 90%, and the physical reads are 30% and 39% of that of an uncompressed table, respectively.

Uwe Hesse has similar numbers. A full table scan takes 86.5% for BASIC, 87.0% for OLTP, 41.5% for QUERY LOW, 61.3% for QUERY HIGH, 54.6% for ARCHIVE LOW, and 141.5% for ARCHIVE HIGH of the baseline’s time, which is for an uncompressed table.

So, it’s true that decompression does not affect read operations. With the exception of ARCHIVE HIGH most compression options do not affect query and insert performance too badly. It is possible (and likely) that the reduced I/O makes queries faster. Nice!

Nevertheless, when you’re running an OLTP database, it’s probably best that you stay away from compression altogether.