Essential guide to hybrid flash arrays
Solid-state drive (SSD) storage systems are sexy, blazingly fast and miserly with power. They run cooler, are hotter in perception and cost less than most IT storage pros expect. But the question often arises as to which is the better buy: a hybrid SSD-HDD storage system or an all-SSD storage system. Both types of systems have unique benefits, so picking one over the other will come down to the organization's requirements and priorities for IOPS, throughput, capacity, functionality and, of course, cost.
Hybrid SSD and HDD storage systems
A hybrid storage system typically combines the performance of SSDs with the capacity of HDDs. SSDs are, on average, 1,000 times faster than HDDs. Hybrid systems are designed to deliver both high performance for applications requiring high IOPS or high throughput and low-cost capacity for data that doesn't need the performance of SSDs. This predictably produces a very low cost per gigabyte, per IOPS and per GBps of throughput. Delivering a quality, capable and scalable hybrid system, however, is non-trivial.
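The cost argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not a pricing model: the function name, the prices and the IOPS figures are all hypothetical placeholders, and simply adding raw device IOPS ignores the controller limits discussed later.

```python
def blended_costs(ssd_gb, hdd_gb, ssd_price_per_gb, hdd_price_per_gb,
                  ssd_iops, hdd_iops):
    """Return total cost, cost per GB and cost per IOPS for a hybrid mix.

    All inputs are hypothetical figures supplied by the caller. Summing
    device IOPS is a simplifying assumption; real systems are capped by
    their storage controllers.
    """
    total_cost = ssd_gb * ssd_price_per_gb + hdd_gb * hdd_price_per_gb
    total_gb = ssd_gb + hdd_gb
    total_iops = ssd_iops + hdd_iops
    return {
        "total_cost": total_cost,
        "cost_per_gb": total_cost / total_gb,
        "cost_per_iops": total_cost / total_iops,
    }


# Example with made-up numbers: a small SSD tier in front of a large HDD tier.
hybrid = blended_costs(ssd_gb=10, hdd_gb=90,
                       ssd_price_per_gb=2.0, hdd_price_per_gb=0.5,
                       ssd_iops=9000, hdd_iops=1000)
```

Plugging in real quotes for each tier shows how a small SSD tier supplies most of the IOPS while the HDD tier keeps the blended cost per gigabyte low.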
Remember, first and foremost, that the hybrid storage system is a storage system. It must offer the features and functionality to meet the organization's requirements, such as thin provisioning, data reduction (dedupe and/or compression), snapshots, snapshot replication, mirroring, RAID, wide-stripe RAID and some form of automated data migration. Most of these features are storage processor-intensive, so they cause performance bottlenecks that reduce SSD IOPS and throughput. The most egregious of them are thin provisioning, deduplication, compression and RAID drive rebuilds.
Hybrid systems put their SSDs to work through caching, auto tiering or both. Caching is usually write-through, with a smart algorithm tracking reads and placing hot data into the SSD cache for 20x faster reads. In some cases, data can be pinned in the cache so it can't be automatically moved out. With write-back cache, writes are captured and acknowledged from the SSD cache before they're written to the HDDs. Write-back cache is much less common than write-through cache because all writes land on the SSDs, increasing the write-erase cycles and shortening the viable wear life of the SSDs. It also requires more SSD capacity.
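The difference between the two caching modes is easier to see in a toy model. The sketch below is illustrative only: the class and method names are hypothetical, eviction is plain least-recently-used, and every read miss is promoted to SSD, which is far cruder than the hot-data tracking a real array would use.

```python
from collections import OrderedDict


class HybridCache:
    """Toy SSD cache in front of an HDD backing store (illustrative only)."""

    def __init__(self, capacity, write_back=False):
        self.capacity = capacity      # number of blocks the SSD cache holds
        self.write_back = write_back  # False means write-through
        self.ssd = OrderedDict()      # block id -> data, oldest first (LRU)
        self.hdd = {}                 # backing store
        self.dirty = set()            # cached blocks not yet written to HDD
        self.pinned = set()           # blocks that are never evicted
        self.ssd_writes = 0           # rough proxy for flash wear

    def pin(self, block):
        self.pinned.add(block)

    def _insert(self, block, data):
        self.ssd[block] = data
        self.ssd.move_to_end(block)
        self.ssd_writes += 1
        while len(self.ssd) > self.capacity:
            victim = next((b for b in self.ssd if b not in self.pinned), None)
            if victim is None:
                break                 # everything is pinned; toy model overflows
            if victim in self.dirty:  # flush before dropping a dirty block
                self.hdd[victim] = self.ssd[victim]
                self.dirty.discard(victim)
            del self.ssd[victim]

    def read(self, block):
        if block in self.ssd:
            self.ssd.move_to_end(block)
            return self.ssd[block], "ssd"
        data = self.hdd[block]
        self._insert(block, data)     # promote on read miss (an assumption)
        return data, "hdd"

    def write(self, block, data):
        if self.write_back:
            # Acknowledge from SSD; the HDD copy is updated on eviction.
            self.dirty.add(block)
            self._insert(block, data)
        else:
            # Write-through: the write goes to HDD; SSD is only refreshed
            # if the block is already cached.
            self.hdd[block] = data
            if block in self.ssd:
                self.ssd[block] = data
                self.ssd_writes += 1
```

Comparing `ssd_writes` for the same workload in both modes shows why write-back puts more wear on the flash: every write lands on SSD, not just the reads that get promoted.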
Some hybrid storage systems include extensive DRAM in the mix and, in some cases, even use NVRAM. These combinations provide multiple levels of caching and even higher levels of performance. The gain comes from DRAM and NVRAM being approximately 20 times faster than SSDs, albeit at a significantly higher cost.
Auto tiering moves data between SSD and HDD storage tiers automatically. Movement between tiers is based on user policies covering the age of the data, frequency of access, the value the organization places on response time for that data and more. Data movement can be based on LUNs, sub-LUNs, data stores, files, objects, partial files or partial objects. The smaller the atomic unit of movement (chunks, chunklets, slices, blocks, etc.), the greater the storage efficiency. Auto tiering must be capable of moving data either up or down on demand, with no administrator intervention.
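A policy of this kind can be sketched in a few lines. The example below is a simplified assumption, not any vendor's algorithm: the `Chunk` fields, the `retier` function and the access threshold are hypothetical, and the only policy input is access frequency.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    size_gb: float
    accesses_last_day: int   # hypothetical "frequency of access" metric
    tier: str = "hdd"


def retier(chunks, ssd_capacity_gb, hot_threshold=100):
    """Place the hottest chunks on SSD until it is full; demote the rest.

    Returns a list of (chunk_id, from_tier, to_tier) moves and updates
    each chunk's tier in place.
    """
    moves = []
    free_gb = ssd_capacity_gb
    # Hottest first, so the busiest data wins the limited SSD space.
    for chunk in sorted(chunks, key=lambda c: c.accesses_last_day, reverse=True):
        is_hot = chunk.accesses_last_day >= hot_threshold
        fits = chunk.size_gb <= free_gb
        target = "ssd" if (is_hot and fits) else "hdd"
        if target == "ssd":
            free_gb -= chunk.size_gb
        if target != chunk.tier:
            moves.append((chunk.chunk_id, chunk.tier, target))
            chunk.tier = target
    return moves
```

Because the unit of movement here is a chunk rather than a whole LUN, only the hot portions of a data set consume SSD capacity, which is the efficiency gain described above.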
Pros: Low latency to SSD cache or storage tier. Provides exceptional scalability into the petabytes. Very low cost per GB, IOPS and throughput. Delivers much greater IOPS and throughput than traditional HDD systems at a comparable price, in addition to a lower TCO. Lower TCO comes from reduced power and cooling of the SSDs plus fewer and fatter HDDs.
Cons: Storage controllers are not optimized for SSD IOPS and throughput, so there are hard limits to the effective number of SSDs per system. This limitation means fewer cache hits as data sets grow. Fewer cache hits mean more redirects to the much slower HDDs, higher latency, longer response times and unhappier users. For auto tiering, it means more extensive and frequent movement between tiers and always playing catch-up.
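The effect of a falling hit rate on response time is a weighted average, which the sketch below works through. The latency defaults are placeholder values, not measurements, and the uniform-access hit-ratio helper is a deliberately crude assumption; real workloads are skewed and usually cache better than this.

```python
def effective_latency_ms(hit_ratio, ssd_ms=0.1, hdd_ms=5.0):
    """Average read latency for a given SSD cache hit ratio.

    ssd_ms and hdd_ms are hypothetical per-read latencies.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * hdd_ms


def uniform_hit_ratio(cache_gb, working_set_gb):
    """Hit ratio if every block in the working set is equally likely to be
    read (a crude assumption used only to show the trend)."""
    return min(1.0, cache_gb / working_set_gb)


# With a fixed SSD cache, latency climbs as the working set outgrows it.
for working_set_gb in (10, 20, 40, 80):
    ratio = uniform_hit_ratio(cache_gb=10, working_set_gb=working_set_gb)
    print(working_set_gb, round(effective_latency_ms(ratio), 3))
```

Even a modest miss rate is dominated by the HDD term, which is why a cache that stops growing with the data set quickly loses its benefit.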
100% SSD storage systems
Just like the hybrid storage system, the 100% SSD storage system is, first and foremost, a storage system. Features and functionality are important, and the same issue applies: storage processor bottlenecks reduce SSD performance. The beauty of 100% SSD storage systems is that there is no need for SSD caching or auto tiering software. At least, there isn't for now. Expect that to change in the not-too-distant future, as some vendors will tier SLC, eMLC, MLC and TLC SSDs to provide a solid-state variation of auto tiering with different levels of SSD performance, capacity and cost.
Pros: Lowest latencies, lowest response times, highest IOPS and throughput. High performance is consistent across the system. It's simple high-performance storage. Delivers excellent cost per IOPS and per GBps as well as a good TCO, although costs are not as low as for the hybrid storage system.
Cons: Cost per GB (capacity) is much higher than for other types of storage systems. The good news is that it's rapidly declining as NAND die sizes shrink and densities increase. And even though upfront purchase costs are still higher, TCO is rapidly closing the gap with both hybrid and HDD storage systems. A big part of that comes from not having any moving parts that consume power and require cooling.
System features such as data reduction technologies can reduce system performance by as much as 50%, sometimes even more. Capacity scalability is limited on many systems to dozens or hundreds of TBs. Some can scale into the low PBs. Expect capacities to continually increase following the NAND flash development curve.
Choosing between hybrid and 100% SSD systems comes down to organizational priorities. The required performance, capacity and functionality, along with cost, will all play important roles in the decision.