Interest in solid-state storage is high, and with a variety of solid-state implementations available and newer technologies emerging, it's time to take a serious look at how solid state could enhance your storage environment.
Data storage professionals have myriad solid-state storage architectures to consider, including systems that use solid-state drives (SSDs) in various form factors, caching implementations and appliances. If that weren't enough to ponder, those planning to implement these systems need to decide whether to use a product that mixes solid-state storage and traditional disk drives, or to use SSD-only storage subsystems.
But perhaps more important than choosing the hardware, enterprises need to decide what data to put on solid-state storage, or whether to use some form of software automation to move data onto it, to make the most efficient use of what is still a somewhat expensive resource. Deciding what data to place on solid-state storage and how to put it there makes choosing a solid-state storage option more complex, but your selections will have a long-term impact.
Solid-state-only shops: Not so soon
In a few decades, some form of solid-state storage may be the dominant and possibly only form of enterprise data storage. But given the present state of matters, that day is (at best) on the distant horizon. We might dream of replacing all of our electro-mechanical disk drives with solid-state storage if cost weren't a factor, but there's nowhere near enough semiconductor fabrication production capacity available today to satisfy the total storage capacity that's deployed in IT shops.
But there are some promising signs. Enterprise solid-state storage prices are dropping relative to enterprise hard disk drives (HDDs). Not that long ago, enterprise solid-state storage cost as much as 40 times the price of an equivalent capacity of enterprise hard disk drive storage. Today that ratio is roughly 25% to 50% of what it was (about 10 to 20 times the HDD price), depending on the specific solid-state storage product.
As a result of this pricing and capacity disparity, data storage managers and administrators are finding that solid-state storage complements existing traditional forms of storage. They've deployed, or are planning to deploy, solid-state storage where high performance, low latency or energy savings are needed.
There are two basic ways to implement solid-state storage technology:
- Use solid-state storage directly as a primary store
- Use solid-state storage as a cache in front of spinning disks
Each of these implementations has its advantages and disadvantages, and implementations vary among storage vendors. And some vendors offer one implementation now while planning to offer the other in the next six to 12 months.
Form factors and interfaces
Solid-state storage comes in a variety of form factors, including nearly all the disk-drive form factors, as internal modules within a storage system or as a PCI Express bus card. The PCI Express bus form factor provides the potential for very high bandwidth storage access within a server or workstation.
Enterprise solid-state drives are available in 2.5-inch and 3.5-inch drive form factors that are compatible with today's servers and storage systems. The primary interfaces for these are SATA, SAS and Fibre Channel (FC). The SATA interface is available for many solid-state drives, especially for the consumer and desktop market. Fibre Channel has a long future as a SAN interface, but is approaching end-of-life as a disk drive interface. Disk drive suppliers and solid-state storage suppliers are moving away from Fibre Channel as a drive interface in favor of 6 Gbps SAS as an enterprise drive interface. We expect the Fibre Channel interface on 3.5-inch drives to stick around for a while to maintain spare parts on the relatively large number of 3.5-inch FC drives in enterprise disk subsystems. And we also anticipate that relatively few 2.5-inch enterprise drives will have a Fibre Channel interface.
Using solid state for primary storage
Many vendors that implement solid-state storage directly as a primary data store use the standard disk-drive form factor. This approach is simple to understand and is compatible with current subsystem designs and configuration processes. The one downside is that many of today's controllers and subsystems weren't designed for drives that perform an order of magnitude faster, so vendors typically don't support a large system completely full of solid-state drives. But this is changing as vendors design and build improved controllers that can handle many more solid-state drives. The good news is that significant performance gains can be achieved with a relatively small number of SSDs, often only one full or partial drive shelf. Some users report performance gains of five to eight times for some workloads with a relatively small amount of solid-state storage.
We're also seeing an increasing number of solid-state-only storage products available today and planned for release over the next several months. These systems are designed to use solid-state storage as the primary store, with capacities in the single- or double-digit terabytes today and larger capacities coming soon.
For users who have implemented solid-state storage as a primary store, the big question focuses on what data to put on the solid-state storage. There are some obvious candidates, such as database indexes, heavily accessed database tables or temporary scratch areas, log files or any other hot spot. However, this is often not a static solution. Some data that's hot today may not be hot tomorrow. So storage administrators, database administrators or other IT technicians may have to continually monitor data usage patterns and be prepared to make adjustments on a fairly regular basis. In some cases, this increased management burden may be too much work and operational expense to be worth the tradeoff for increased I/O performance.
The answer is to provide an automated way for the storage system to identify the hot data and move it onto the solid-state storage, then move it to slower storage when it no longer requires solid-state performance. Many vendors provide forms of tiering software that do exactly that. This software observes the I/O patterns over a period of time and then moves the data in a way that's transparent to the host applications. Many of these automated solutions allow the administrator to determine what activity level defines "hot" data, set the time period over which the observations are made, and then set a separate parameter that controls the frequency of data movement (anywhere from hourly to weekly). Some of this software can make recommendations about data tiering based on the observations it has made, such as recommending a 10%/90% mix of solid state vs. spinning disk. Today, many of these automated data movers perform the data movement at the LUN level; sub-LUN-level data movement is expected from several vendors within the next six to 12 months.
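To make the tiering decision concrete, here is a minimal sketch of one observation window at the LUN level. It is an illustration only, not any vendor's algorithm: the function name `plan_tiering`, the I/O-count threshold and the sample LUN names are all hypothetical, and the sketch ignores how much solid-state capacity is actually free.

```python
from collections import Counter

def plan_tiering(io_log, hot_threshold, ssd_luns):
    """Return (promote, demote) LUN sets for one observation window.

    io_log        -- iterable of LUN ids, one entry per observed I/O
    hot_threshold -- I/O count at or above which a LUN counts as "hot"
                     (the administrator-defined activity level)
    ssd_luns      -- set of LUN ids currently on solid-state storage
    """
    counts = Counter(io_log)
    hot = {lun for lun, n in counts.items() if n >= hot_threshold}
    promote = hot - ssd_luns   # hot, but still on spinning disk
    demote = ssd_luns - hot    # on solid state, but no longer hot
    return promote, demote

# One observation window of made-up I/O activity (hypothetical LUN names).
window = ["lun0"] * 500 + ["lun1"] * 20 + ["lun2"] * 900 + ["lun3"] * 5
promote, demote = plan_tiering(window, hot_threshold=100,
                               ssd_luns={"lun1", "lun2"})
print("promote:", sorted(promote))
print("demote:", sorted(demote))
```

In this sketch, the length of `io_log` corresponds to the observation period, and how often the function is called corresponds to the separate data-movement frequency setting. A real product would also have to weigh available solid-state capacity and the cost of the move itself.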
The solid-state-only storage products eliminate the need to move data from faster to slower storage because all of the data is on fast storage. These systems appeal to customers who want to put an entire application and its data on solid-state storage. At today's price points, these solutions tend to be deployed for critical applications only. The decision (and budget) to acquire them tends to come from line-of-business owners or architects rather than from the IT department.
Caching with solid state
The other basic implementation is to use solid-state storage as a cache in front of spinning disks. This method has the advantage of always accelerating the hot data in real-time, since only the hot data is likely to be in cache. And because the solid-state storage is acting as a cache, there's no need for an administrator to decide what data should be placed on it. The basic questions here are what size cache is appropriate and which workloads should be directed toward the cache to make the best use of the solid-state device.
Some solid-state caching solutions are built into existing storage systems, while others are delivered as external appliances. Adding flash memory as a cache inside a storage subsystem in effect provides a "level 2" cache not unlike the L2 cache found on many processors inside today's computers. This added cache capacity improves performance for most if not all operations. In addition, because flash memory is non-volatile, this cache provides some extra protection in the event of power loss. But issues such as cache coherency, and whether the cache is DRAM based or flash memory based, remain. Generally, a cache is tied to one processor or controller, and there are various cache management functions that can be applied to allow caches to work properly with multiple processors or controllers. In addition, storage systems that use caching can add special features to their internal OSes that are aware of the cache and can provide additional flexibility, such as the ability to assign different I/O priorities for I/O going to different volumes on the storage system.
The caching appliances add the benefits of cache without requiring changes to any existing servers or storage systems. These appliances fit easily into the storage network and can accelerate all I/O going through them, even sending data to different storage subsystems at the same time. Many of the appliances can be set to write-back, write-through or pass-through for any given volume they accelerate. Some of the caching appliances are constructed in such a way as to allow their memory modules to be hot-plugged, so maintenance or growth can occur without taking down the entire appliance.
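The three per-volume modes differ in when a write reaches the disks behind the cache. The sketch below models that difference with plain dictionaries. The class `CachedVolume` and its methods are hypothetical names for illustration and don't reflect how any particular appliance is built.

```python
class CachedVolume:
    """Toy model of one volume behind a caching appliance."""

    MODES = ("write-back", "write-through", "pass-through")

    def __init__(self, mode):
        if mode not in self.MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.cache = {}     # stands in for the solid-state cache
        self.backend = {}   # stands in for the spinning disks
        self.dirty = set()  # cached blocks not yet written to the backend

    def write(self, block, data):
        if self.mode == "pass-through":
            self.backend[block] = data   # cache is bypassed entirely
        elif self.mode == "write-through":
            self.cache[block] = data     # update the cache...
            self.backend[block] = data   # ...and the disks before acknowledging
        else:  # write-back
            self.cache[block] = data     # acknowledge once the cache has it
            self.dirty.add(block)        # the disks are updated later

    def read(self, block):
        if self.mode != "pass-through" and block in self.cache:
            return self.cache[block]     # cache hit
        data = self.backend[block]       # cache miss or pass-through
        if self.mode != "pass-through":
            self.cache[block] = data
        return data

    def flush(self):
        """Write all dirty blocks to the backend (write-back mode only)."""
        for block in sorted(self.dirty):
            self.backend[block] = self.cache[block]
        self.dirty.clear()

vol = CachedVolume("write-back")
vol.write(7, "payload")
print("on disk before flush:", 7 in vol.backend)
vol.flush()
print("on disk after flush:", 7 in vol.backend)
```

Write-back gives the fastest acknowledgment but leaves data only in the cache until it is flushed, which is why the non-volatility of the cache matters in that mode.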
The big question for a caching implementation is how much cache is enough. For many workloads and applications, a relatively small amount of cache (5% to 20%) relative to the total storage allocated to that application is enough to provide significant performance improvements. For other workloads, the cache needs to be large enough to hold the entire volume to achieve appreciable performance gains.
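One way to explore the sizing question is to replay an access pattern against a simulated cache and measure the hit rate. The sketch below assumes a least-recently-used (LRU) replacement policy and two synthetic workloads, one skewed (90% of accesses to 10% of the blocks) and one uniform. The policy, the workload shapes and all the numbers are assumptions chosen for illustration, not measurements.

```python
import random
from collections import OrderedDict

def lru_hit_rate(accesses, cache_blocks):
    """Fraction of accesses served from an LRU cache of cache_blocks blocks."""
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(accesses)

def make_workload(rng, n_accesses, n_blocks, hot_fraction, hot_share):
    """hot_share of the accesses go to the first hot_fraction of the blocks."""
    n_hot = max(1, int(n_blocks * hot_fraction))
    out = []
    for _ in range(n_accesses):
        if rng.random() < hot_share:
            out.append(rng.randrange(n_hot))
        else:
            out.append(rng.randrange(n_hot, n_blocks))
    return out

rng = random.Random(42)
n_blocks = 1000
skewed = make_workload(rng, 20000, n_blocks, hot_fraction=0.1, hot_share=0.9)
uniform = make_workload(rng, 20000, n_blocks, hot_fraction=0.1, hot_share=0.1)

for pct in (5, 20, 100):
    size = n_blocks * pct // 100
    print(f"cache {pct:3d}% of volume: "
          f"skewed {lru_hit_rate(skewed, size):.2f}, "
          f"uniform {lru_hit_rate(uniform, size):.2f}")
```

With a skewed pattern, a cache that covers the hot set captures most accesses. With a uniform pattern, the hit rate only approaches 100% as the cache approaches the size of the whole volume, which is the second case described above.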
It's all about performance
Solid-state storage, however it's deployed, offers the promise of significant performance gains. In our lab testing, we've seen overall performance gains of seven to nine times for various real-world applications (email, database, etc.) when the storage is configured optimally for the application.
With performance gains of that magnitude possible, what's not to like? Certainly, pricing is a factor. However, consider some of the current methods that are used to increase performance for spinning disk drives, such as "short stroking" spinning disk drives. Short stroking spreads data over many disk drives by using only a portion of the capacity of each drive for data, so that as many "spindles" as possible can be applied to improve performance. To achieve desired performance goals, some users short stroke some of their enterprise disk drives using ratios of 7:1, 8:1 or 9:1, which means they're using only 1/7th, 1/8th or 1/9th of the available capacity on each drive. If the price of an enterprise SSD is 10 to 15 times the price of the spinning drives being short stroked, it may make sense to move that application data to enterprise SSDs and get the required performance while using much less power and space.
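The arithmetic behind that comparison can be sketched as follows. It rests on a simplifying assumption that is not from any vendor data: one SSD with the capacity of a single HDD meets the performance target that the whole short-stroked set was bought for. The function name is hypothetical.

```python
def ssd_to_short_stroke_cost(stroke_ratio, ssd_price_multiple):
    """Cost of one SSD relative to the short-stroked HDDs it would replace.

    stroke_ratio       -- N for an N:1 short stroke (N drives bought per
                          drive's worth of usable capacity)
    ssd_price_multiple -- SSD price as a multiple of one HDD of equal capacity
    """
    if stroke_ratio <= 0:
        raise ValueError("stroke_ratio must be positive")
    return ssd_price_multiple / stroke_ratio

for ratio in (7, 8, 9):
    for multiple in (10, 15):
        rel = ssd_to_short_stroke_cost(ratio, multiple)
        print(f"{ratio}:1 short stroke, SSD at {multiple}x: "
              f"SSD costs {rel:.2f}x the HDD set, "
              f"using 1 drive instead of {ratio}")
```

Under these assumptions the SSD's purchase price ranges from roughly 1.1 to 2.1 times that of the short-stroked drive set, so the case for moving rests on the drive count falling from seven, eight or nine to one, with the corresponding savings in power and space.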
Almost all data storage system vendors now offer configurations that use a combination of solid-state storage and enterprise SATA storage instead of arrays full of enterprise spinning disk drives. These new configurations typically offer higher performance, equivalent capacity, lower power consumption, smaller space requirements and lower total hardware costs.
BIO: Dennis Martin has been working in the IT industry since 1980, and is the founder and president of a computer industry analyst organization and testing lab.