Compression is an efficient way to reduce storage costs and slow their growth, but it may also impact performance. To lower that impact, two questions matter: when is the data compressed, and which compression method is used.
When to compress
Storage arrays may compress incoming data either before or after the write operation is acknowledged to the host. When the data is compressed before the acknowledgement, the storage array benefits from low cache usage, at the expense of the following:
- Performance – compressing the data takes time, so the host experiences higher latency and cannot take full advantage of the storage array's performance capabilities
- Increased cost of cache hits – data that was just written has a good chance of being read shortly afterwards (and often read several times). This yields the much-desired low-latency cache hit. However, if the data is compressed, it has to be decompressed anew for each cache hit. The repeated decompression incurs CPU overhead that may undo the performance gains that could have resulted from the low-latency cache hits.
To avoid these two considerable issues, InfiniBox leverages a very large write cache that allows data to be compressed only during its destage to disk.
As a result, InfiniBox not only reduces write latency, but in many cases avoids compression altogether (for data that is overwritten while still in cache), further avoiding undesirable performance impact.
Compression used to be a CPU-intensive task, but improvements in CPU architecture and compression algorithms have reduced the CPU cycles required to compress data.
InfiniBox uses LZ4, the most common algorithm for compressing data. However, the system was designed to allow adding more compression types later, as more efficient compression algorithms become available or new CPU capabilities further reduce the CPU overhead of existing ones.
Size of compressed data
The InfiniBox case for not compressing data until it is destaged is backed by the nature of compression algorithms.
Simply put, compression algorithms try to find strings of data that repeat as many times as possible within the data and replace them with shorter strings, all the while keeping a dictionary of these replacements.
During the decompression, each string that appears in the dictionary is replaced with the original (longer) string, thus the original data is restored.
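The replace-and-record idea above can be sketched in a few lines of Python. This is a deliberately naive toy, not LZ4 (which uses sliding-window matches rather than an explicit dictionary); the function names and the single-token scheme are hypothetical, chosen only to make the replace/restore round trip visible:

```python
from collections import Counter

def toy_compress(data: str) -> tuple[str, dict[str, str]]:
    """Replace the most frequent 8-char substring with a 1-char token."""
    counts = Counter(data[i:i + 8] for i in range(len(data) - 7))
    target, _ = counts.most_common(1)[0]
    token = "\x00"                      # assumes this byte never appears in the input
    dictionary = {token: target}
    return data.replace(target, token), dictionary

def toy_decompress(compressed: str, dictionary: dict[str, str]) -> str:
    # Replace each token with its original (longer) string, restoring the data.
    for token, original_string in dictionary.items():
        compressed = compressed.replace(token, original_string)
    return compressed

original = "abcdefgh" * 100             # highly repetitive input, 800 chars
packed, d = toy_compress(original)

assert toy_decompress(packed, d) == original   # lossless round trip
assert len(packed) < len(original)             # repetitive data shrinks
```

Real algorithms repeat this substitution for many strings at once and encode the dictionary compactly, but the principle is the same: repetition is what makes data compressible.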
The more frequently a data string repeats, the more effective the compression. Comparing the compression of one string of 1 million identical characters to 1,000 strings of 1,000 identical characters each, contemporary algorithms keep one dictionary in the first case, compared to 1,000 dictionaries in the latter. An improved compression algorithm may have to compare longer strings of characters, possibly increasing the compression ratio at the expense of performance. In addition, compression algorithms tend to reach a plateau in their ability to improve both performance and compression ratio.
As InfiniBox compresses 64KB of data at a time (compared to the commonly used 4KB-8KB), it reaches a higher compression ratio with a lower performance impact for the same source data.
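The block-size effect is easy to demonstrate. The sketch below uses Python's stdlib zlib purely as a stand-in (InfiniBox uses LZ4, which is not in the standard library), compressing the same 64 KiB of log-like data once as a whole and once as sixteen independent 4 KiB units:

```python
import zlib

# Synthetic log-like data: repetitive structure, typical of storage payloads.
data = b"".join(b"2024-05-01 host-01 GET /api/item/%06d 200 OK\n" % i
                for i in range(1500))
data = data[:64 * 1024]                 # trim to exactly 64 KiB

# One 64 KiB compression unit vs. sixteen independent 4 KiB units.
whole = len(zlib.compress(data))
chunked = sum(len(zlib.compress(data[i:i + 4096]))
              for i in range(0, len(data), 4096))

# Each small unit pays its own header and "cold start" cost, so the
# larger unit compresses the same bytes into less space.
assert whole < chunked
```

The exact sizes depend on the algorithm, but the direction of the result holds generally: larger compression units amortize per-unit overhead and give the matcher more history to work with.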
Compressing data creates a need to show the capacity savings achieved by compression and zero reclaim. InfiniBox 3.0 adds a new capacity counter for volumes, filesystems and pools called "Data reduction".
Data reduction is displayed as a ratio between the capacity that would have been consumed by the entity, and the capacity that is actually consumed (including both capacity associated with the entity itself and its snapshots).
Accounting for the savings
While compression works on both thick and thin entities, its behavior is different in each case:
- Thin entities: all data written is compressed, and the savings from data reduction remain available in the pool's physical capacity
- Thick entities: data is written compressed, but the savings remain allocated to the entity (and cannot be reused). Only the savings on data in the entity's snapshots (which are thin) return to the pool's physical capacity.
In the following example, "vol1", a 1TB thin volume, is mapped to a host. The host writes 800GB of data (assumed to be 2:1 compressible) and fills 100GB of the remaining 200GB with zeros:
Host reported capacity:
- Zeros are automatically reclaimed, so the 100GB of zeros does not consume space.
- The 800GB is compressed by a ratio of 2:1.
- Of the 900GB the host wrote, only 400GB is consumed on disk, yielding a 2.25:1 data reduction ratio
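The arithmetic behind the example can be spelled out explicitly (values in GB; the variable names are just for illustration):

```python
# "vol1" example: 1TB thin volume, 800GB of 2:1-compressible data,
# plus 100GB of zeros that are reclaimed entirely.
written = 800                  # compressible data written by the host
zeros = 100                    # zero-filled region, fully reclaimed
compression_ratio = 2.0        # assumed 2:1 compressibility

on_disk = written / compression_ratio    # 400 GB actually consumed
logical = written + zeros                # 900 GB the host wrote
data_reduction = logical / on_disk       # 900 / 400 = 2.25

assert on_disk == 400
assert data_reduction == 2.25
```

Note that the zeros raise the data reduction ratio above the raw 2:1 compression ratio, because they count as written data but consume no disk space at all.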
Using capacity freed from data reduction
Entities that enjoy data reduction effectively free capacity that can be reused. The freed capacity returns to the pool, and it is up to the storage admin to decide whether to:
- Leave the freed capacity in the pool, allowing any of the pool admins to use it
- Reduce the pool size, allowing the storage admin to reuse that capacity elsewhere.
Data reduction ratios vary, depending on the dataset.
For example, virtual machines and databases tend to include many zeros, increasing the data reduction, while data encrypted before writing it to the storage does not compress well.
The maximum available reduction ratio is 16:1.
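Putting the ratio and its documented ceiling together, a reporting calculation might look like the following sketch (the function name and the treatment of fully reclaimed data are assumptions, not the actual InfiniBox implementation):

```python
MAX_RATIO = 16.0   # documented maximum data reduction ratio (16:1)

def data_reduction_ratio(logical_gb: float, physical_gb: float) -> float:
    """Ratio of capacity the host wrote to capacity actually consumed."""
    if physical_gb == 0:
        return MAX_RATIO                 # fully reclaimed data hits the cap
    return min(logical_gb / physical_gb, MAX_RATIO)

assert data_reduction_ratio(900, 400) == 2.25    # the "vol1" example
assert data_reduction_ratio(1000, 10) == 16.0    # capped at 16:1
```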
Each entity inherits its compression setting from its pool at creation time, and the setting can be changed at any time. By default, all pools are set to "compression=on" after upgrading to 3.0, so all new entities are created with compression (unless manually changed).
Existing entities in the pool (created before 3.0) can be manually set to compress new data.
When a new pool is created, it inherits its compression behavior from a system-wide default (set to "on", and can be overridden).
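The inheritance chain described above (system default, then pool, then entity, each overridable) can be sketched as follows; the class and function names are hypothetical, not the InfiniBox API:

```python
from dataclasses import dataclass

SYSTEM_DEFAULT_COMPRESSION = True      # system-wide default: "on"

@dataclass
class Pool:
    name: str
    # New pools inherit the system default, but it can be overridden.
    compression: bool = SYSTEM_DEFAULT_COMPRESSION

@dataclass
class Volume:
    name: str
    compression: bool                  # copied from the pool at creation time

def create_volume(pool: Pool, name: str) -> Volume:
    # The setting is inherited once, at creation, not linked afterwards.
    return Volume(name=name, compression=pool.compression)

pool = Pool("pool1")
vol = create_volume(pool, "vol1")
assert vol.compression is True         # inherited "on" from the pool

vol.compression = False                # changed at any time, per entity
assert pool.compression is True        # the pool's setting is unaffected
```

The key design point is that inheritance happens at creation time only: changing a pool's setting later affects new entities, not existing ones.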
InfiniBox employs multiple mechanisms to make sure performance is not degraded as a result of compression:
Any write to the storage is first accepted in memory to guarantee minimal latency, and only compressed as data is destaged from RAM to persistent media.
Since many writes are overwritten while still in the cache, InfiniBox avoids a great deal of unnecessary compression.
Once data is destaged, it is compressed in sections (sets of consecutive 64KB increments), creating a dataset large enough for effective compression while still keeping each section separate. This allows the system to retrieve any section without having to retrieve the entire stripe.
All reads from RAM and SSD don't require decompression as data is not compressed while in cache. This maintains low latency.
Data retrieved from disk requires decompression when it is loaded into the cache.
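The per-section scheme can be sketched as follows. Python's stdlib zlib again stands in for LZ4, and the function names are illustrative only; the point is that each 64 KiB section is an independent compression unit, so a single section can be read back without touching the rest of the stripe:

```python
import zlib

SECTION = 64 * 1024    # 64 KiB compression unit, as in the text

def destage(stripe: bytes) -> list[bytes]:
    """Compress each 64 KiB section independently during destage."""
    return [zlib.compress(stripe[i:i + SECTION])
            for i in range(0, len(stripe), SECTION)]

def read_section(sections: list[bytes], index: int) -> bytes:
    # Only the requested section is decompressed, not the whole stripe.
    return zlib.decompress(sections[index])

stripe = bytes(range(256)) * 1024            # 256 KiB -> 4 sections
sections = destage(stripe)

assert len(sections) == 4
assert read_section(sections, 2) == stripe[2 * SECTION: 3 * SECTION]
```

Compressing sections independently costs a little compression ratio compared to one monolithic unit, but buys random access: a read of one section never forces decompression of its neighbors.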