Performance Considerations¶
Although size reduction is the main (and only) reason to switch on compression, it makes sense to look at the various option and how they affect the load times.
Size Reduction¶
Below is table with the size reduction compared to a simple uncompressed table. These numbers have been compiled from results published by Uwe Hesse (A), Yannick Jaquier (B), Arup Nanda (C), and Talip Hakan Öztürk (D). The values marked with an asterisk have not been included in the mean size reduction.
Option |
Size (%) |
Size A (%) |
Size B (%) |
Size C (%) |
Size D (%) |
---|---|---|---|---|---|
|
48 |
N/A |
65.63 |
47.93 |
29.41 |
|
59 |
N/A |
78.68 |
56.26 |
41.17 |
|
19 |
14.70 |
22.47 |
19.79 |
N/A |
|
16 |
8.82 |
13.44 |
0.52 |
41.17 |
|
10 |
6.62 |
12.81 |
0.52* |
N/A |
|
8 |
4.41 |
11.13 |
0.33* |
29.41* |
The numbers shown depend heavily on the data, so always be sure to try the different options with your own data! The percentages should only be read as an indication of the potential storage savings, not the gospel of Saint Oracle. You have been warned.
It’s clear that the least compression is offered by OLTP
, which is not surprising.
Similarly, we see that in most cases the hybrid column compression shaves off most disk space.
CPU Overhead¶
What about data modification operations?
Talip Hakan Öztürk and Uwe Hesse have done extensive tests with the various options.
Again, below are the summarized insert times compared to the BASIC
option (not shown).
Option |
Insert time (%) |
Time A (%) |
Time D (%) |
---|---|---|---|
|
316 |
289.4 |
343.1 |
|
69 |
70.35 |
67.73 |
|
117 |
147.9 |
85.40 |
|
145 |
201.2 |
88.29 |
|
710 |
783.5 |
636.0 |
The insert for the OLTP
option is a conventional insert because in an OLTP environment a direct-path insert does not make sense.
For the other compression methods the inserts are direct-path inserts.
Note that conventional inserts generate a lot more undo and redo, which in itself is a cause for concern, especially with regard to performance.
Although there is quite a bit of variation in the insert times compared to the basic compression option, it seems that QUERY LOW
achieves both a substantial size and time reduction.
The most variation is visible for the hybrid columnar compression methods.
Yannick Jaquier reports that CPU time increases by 6.5 percentage points for the insert into a table with either ARCHIVE HIGH
or QUERY HIGH
compression as compared to the uncompressed case.
However, the number of consistent gets for a full table scan are 88% and 90%, and the physical reads are 30% and 39% of that of an uncompressed table, respectively.
Uwe Hesse has similar numbers.
A full table scan takes 86.5% for BASIC
, 87.0% for OLTP
, 41.5% for QUERY LOW
, 61.3% for QUERY HIGH
, 54.6% for ARCHIVE LOW
, and 141.5% for ARCHIVE HIGH
of the baseline’s time, which is for an uncompressed table.
So, it’s true that decompression does not affect read operations.
With the exception of ARCHIVE HIGH
most compression options do not affect query and insert performance too badly.
It is possible (and likely) that the reduced I/O makes queries faster.
Nice!
Nevertheless, when you’re running an OLTP database, it’s probably best that you stay away from compression altogether.