All-flash array comparison: The keys to performance, capacity success

Analyst George Crump provides tips on selecting an all-flash storage array that meets your environment's performance and capacity needs.

One of the challenges all-flash storage array manufacturers face is differentiating their products from those of their competitors. After all, the storage media is essentially the same. Most all-flash arrays use multi-level cell flash.

As a result, all-flash array (AFA) vendors often end up fighting over what are sometimes minor or irrelevant performance differences in their product offerings. This tip offers key things to look for in an all-flash array comparison.

Core features: Thin provisioning, snapshots, cloning

Most all-flash array vendors now provide the basic core features that data centers have come to expect from their storage systems. These features include thin provisioning, snapshots and clones (writable snapshots). On hard drive-based systems, these features had to be applied judiciously as they could impact application performance. For example, thick provisioning of virtual machines used to be considered a best practice for performance-sensitive situations.
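
To make the concept concrete, here is a minimal Python sketch of the thin-provisioning and snapshot idea. It is purely illustrative and models no particular vendor's implementation: a volume advertises a large logical size but consumes space only as blocks are written, and a snapshot initially shares its parent's blocks rather than copying them.

    class ThinVolume:
        """Toy model: capacity is consumed on write, not at provisioning time."""
        def __init__(self, logical_gb):
            self.logical_gb = logical_gb          # size the host sees
            self.blocks = {}                      # block_id -> data, allocated lazily

        def write(self, block_id, data):
            self.blocks[block_id] = data          # physical space is used only here

        def consumed_gb(self, block_gb=0.001):
            return len(self.blocks) * block_gb

        def snapshot(self):
            snap = ThinVolume(self.logical_gb)
            snap.blocks = dict(self.blocks)       # shares existing data; a real array
            return snap                           # shares pointers and copies on write

    vol = ThinVolume(logical_gb=1024)             # host sees 1 TB immediately
    vol.write(7, b"some data")                    # space is consumed only as data lands
    snap = vol.snapshot()                         # near-instant, no full copy
    print(vol.consumed_gb(), snap.consumed_gb())  # both tiny despite the 1 TB size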

Flash in general, and all-flash storage arrays in particular, changed all that. These features all place extra write loads on the storage system as data is created or updated. Flash responds more quickly to write I/O than hard disk drives, so these features have less of an impact on performance. Combine this with the fact that AFAs provide far more performance than the typical data center needs, and you end up with a feature set that can be deployed with almost no concern about performance impact.

In general, these core features are now commonplace in AFAs and their use is encouraged. There is nothing in the features themselves that should cause concern. However, IT planners should spend time understanding how each all-flash array vendor delivers its features.

Integrated features vs. add-on features

All-flash storage array vendors have used two methods to deliver core features to their customers. Some vendors, such as Pure Storage Inc., SolidFire Inc., EMC XtremIO and even Dell Compellent, have essentially written their storage software from scratch and integrated these features into their offerings. In each of these cases, the vendor runs its software on off-the-shelf hardware. These vendors tend to see themselves more as software vendors than as hardware vendors.

Vendors such as Violin Memory Inc., IBM (Texas Memory Systems) and Tegile Systems Inc. rely on an external source for some or all of their data services. IBM and Violin use an external hardware appliance, while Tegile builds on a ZFS foundation and then adds features such as data deduplication, compression and metadata management on top of it.

The integrated solutions should provide a more seamless look and feel and should do so less expensively. On the flip side, IBM and Violin arrays can easily be set up as raw flash arrays with no additional features. For environments that need extreme performance (500,000-plus IOPS), this can be a very attractive option.

Comparing scale-out and scale-up all-flash arrays

AFA vendors are starkly divided on how all-flash arrays should scale. With a scale-up array like those from Tegile, Violin, IBM and Pure Storage, you are essentially buying all the performance potential of the array upfront. The only component that can typically be added is more drive shelves.

The concern with a scale-up system is that the data center will reach the limits of capacity, performance or both and then be required to buy a whole new system. However, even midrange scale-up all-flash arrays provide hundreds of thousands of IOPS. Also, systems typically include deduplication and/or compression, so assuming some measure of storage efficiency gain, they can scale to dozens if not hundreds of terabytes. It is also important to note that an increasing number of scale-up all-flash arrays allow users to replace old controller heads with new ones so that performance and capacity can be scaled in tandem. Regardless, when choosing a scale-up all-flash storage array, it is important to carefully consider whether the system's capacity and performance potential will meet your needs going forward.
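
As a rough illustration of that capacity math (the figures below are assumptions chosen for the example, not vendor specifications), a quick calculation shows how data reduction stretches raw flash:

    # Hypothetical figures: effective capacity of a scale-up AFA after data reduction.
    raw_tb = 40                    # raw flash across the controller and its shelves
    reduction_ratio = 4.0          # assumed combined dedupe + compression ratio
    effective_tb = raw_tb * reduction_ratio
    print(f"{raw_tb} TB raw at {reduction_ratio}:1 reduction -> ~{effective_tb:.0f} TB effective")
    # 40 TB raw at 4.0:1 reduction -> ~160 TB effective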

Scale-out systems like those from SolidFire, EMC XtremIO and Kaminario Inc. are built from a cluster of server nodes that host the storage software and are aggregated to provide performance and capacity. Most of these systems have to start with three nodes, which may be overkill for the typical data center. The good news is that XtremIO and Kaminario both allow for a blended model: they can start as a single-node, scale-up design and then, if performance and/or capacity limits are reached, shift to a scale-out mode. SolidFire counters the entry-size issue with a very small three-node starter cluster and allows the intermixing of high- and low-density node sizes within that same cluster.
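
A simple sketch of the scale-out model (node sizes here are made up for illustration) shows how capacity and performance aggregate as nodes, including mixed densities, are added to the cluster:

    # Each node contributes capacity and IOPS to the cluster totals.
    nodes = [
        {"density": "low",  "tb": 10, "iops": 50_000},
        {"density": "low",  "tb": 10, "iops": 50_000},
        {"density": "high", "tb": 35, "iops": 75_000},  # mixing densities in one cluster
    ]
    total_tb = sum(n["tb"] for n in nodes)
    total_iops = sum(n["iops"] for n in nodes)
    print(f"{len(nodes)}-node cluster: {total_tb} TB, {total_iops:,} IOPS")
    # 3-node cluster: 55 TB, 175,000 IOPS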

The role of high availability

It stands to reason that an all-flash storage array is more than likely a mission-critical system. Consequently, the availability of the AFA is also a critical design consideration, so most all-flash arrays offer high availability (HA) out of the box. Scale-out systems provide HA through the very nature of their cluster configuration. If one node fails, the others continue serving data while redundancy is rebuilt in the background. The other advantage of this approach is that performance in a failed state only drops by roughly the inverse of the number of nodes (1/N), so in a larger cluster the performance impact should be minimal.
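
A quick worked example of that 1/N effect (cluster sizes chosen arbitrarily):

    # Losing one node in an N-node cluster removes roughly 1/N of the performance,
    # ignoring rebuild overhead, which varies by implementation.
    for n in (3, 8, 16):
        print(f"{n} nodes: ~{100 / n:.1f}% of performance lost when one node fails")
    # 3 nodes: ~33.3%   8 nodes: ~12.5%   16 nodes: ~6.2%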

Scale-up systems take one of two approaches: active-active or active-passive. In an active-active approach, when both storage controllers are operational, both are responsible for serving data. In other words, the performance load is distributed between them. The downside to this approach is that, if one controller fails, the entire I/O load goes through the remaining controller and, in theory, performance could degrade by 50%.

In an active-passive approach, the second controller is essentially idle, waiting for the primary controller to fail. If the primary controller does fail, the workloads all shift to the secondary controller. While this does mean there is idle hardware, it also means performance is very consistent in a failed state.
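
The trade-off between the two controller designs can be summarized with some assumed numbers (illustrative only, not measured figures):

    # Assumed per-controller capability; real figures depend on the array.
    per_controller_iops = 200_000

    # Active-active: both controllers serve I/O, so a failure halves the ceiling.
    aa_normal = 2 * per_controller_iops    # 400,000 IOPS available
    aa_failed = 1 * per_controller_iops    # 200,000 IOPS, a 50% drop

    # Active-passive: the standby simply takes over, so the ceiling is unchanged.
    ap_normal = per_controller_iops        # 200,000 IOPS available
    ap_failed = per_controller_iops        # still 200,000 IOPS, consistent under failure

    print(aa_normal, aa_failed, ap_normal, ap_failed)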

This brings up an additional advantage of scale-out storage. Some workloads could be configured to survive multiple controller (node) failures, whereas in a scale-up system -- while very unlikely -- a failure of the second controller, before the first one is replaced, would result in application downtime.

Data efficiency vs. pure performance

When comparing all-flash arrays, you will find that most provide some level of data efficiency. Many vendors, such as Pure Storage, Kaminario, Violin, Tegile and XtremIO, provide both deduplication and compression, while others, such as Hewlett-Packard and IBM, provide one or the other. Most data centers will see some benefit from one or both of these data efficiency features: virtual environments, both desktop and server, benefit most from deduplication, whereas database environments benefit more from compression.
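
To see why the workload matters, here is a back-of-the-envelope comparison using assumed reduction ratios (real ratios vary widely by data set):

    # Assumed ratios only: VDI-style data typically dedupes well, databases compress well.
    raw_tb = 20
    workloads = {
        "virtual desktops": {"dedupe": 6.0, "compression": 1.5},
        "database":         {"dedupe": 1.2, "compression": 3.0},
    }
    for name, r in workloads.items():
        effective_tb = raw_tb * r["dedupe"] * r["compression"]
        print(f"{name}: ~{effective_tb:.0f} TB effective from {raw_tb} TB raw")
    # virtual desktops: ~180 TB effective; database: ~72 TB effective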

There is also some discussion around the ability to turn these data efficiency features off; vendors such as Kaminario, Violin and IBM can do this. The theory is: why spend time applying data efficiency to a particular data set if there will be no material benefit? Turning the feature off should lead to better performance and lower storage system costs. For some data centers, this capability may make a difference. For others, there will be little difference, since the AFA, even with all these features turned on, still provides more performance than they need. In those instances, the idea of "set it up once and forget it" probably becomes more appealing.

Conducting an all-flash array comparison can be a daunting process, but it does not have to be. The key to selecting the best, most affordable system for your environment is to understand how much performance and capacity you need today, while establishing a range for how much you will need over the next five years or so. Then determine how these systems will scale to meet those requirements.
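
One way to frame that exercise is a simple growth projection. The growth rate and array limits below are placeholders that show the shape of the calculation, not recommendations:

    # Project capacity and performance needs at an assumed annual growth rate and
    # check them against a candidate array's stated limits.
    cap_tb, iops = 30.0, 80_000.0            # today's needs (example figures)
    annual_growth = 0.30                      # assumed 30% growth per year
    array_cap_tb, array_iops = 100, 300_000   # hypothetical AFA limits

    for year in range(1, 6):
        cap_tb *= 1 + annual_growth
        iops *= 1 + annual_growth
        fits = cap_tb <= array_cap_tb and iops <= array_iops
        print(f"year {year}: {cap_tb:.0f} TB, {iops:,.0f} IOPS, still fits: {fits}")
    # In this example the array runs out of effective capacity in year 5.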

About the author:
George Crump is president of Storage Switzerland, an IT analyst firm focused on storage and virtualization.

Next Steps

Avoiding hidden gotchas when buying an all-flash array

Is a hybrid or all-flash array better for your environment?

AFA product path: Buy or retrofit?

This was last published in February 2015

Join the conversation

What do you look for in a storage array?

A storage array basically uses flash memory coupled with hard disk drives to create storage that balances capacity, performance and cost. It's important to consider the business value of the storage array, because the desired business results can be hampered by inadequate storage performance. I would therefore get a hybrid storage array, since the old storage architectures are often difficult to manage and hardly keep up with customer needs.

Price and reliability. Not sure what else there would be.

Very good article. One point I would add, or rather one aspect of performance I would break out: it is very true that IOPS and bandwidth are no longer the deciding factors in choosing an AFA. What is (and we hear it over and over again)? Latency, and specifically the latency of the first 4K read. The parameters that influence it are: 1. the data path (everything the data passes through on its way out of the NAND: decompression, movement inside the NAND, other processing); and 2. the speed of the NAND itself. The latter is a very important parameter that is relatively easy to check in isolation: issue a read request while all other services of the AFA are down, and it will give you a good measurement of how the AFA responds to 4K reads. If the result is good enough, great. If it is not, one way around it is to put a PCIe flash card (Fusion-io, HGST) in front of the storage as a cache, which can reduce the latency tenfold.

The key to understanding the differences in the AFA products is to understand how they perform when running your specific workloads. It's always best to run these workloads with features like dedupe and compression enabled to see how they truly perform. It's also fairly easy to test controller failures under different MPIO assumptions. This is where companies like Load DynamiX come in. They enable unbiased apples-to-apples performance comparisons to help storage planners choose between vendors and determine the lowest cost configuration that will meet the needs of the workloads.

When choosing the best array, I frequently check:

  • Flash type
  • Array capacity
  • Networking technology
  • Warranty/support
  • Storage
  • Management features
