Keep EMC XtremIO performance comparison in perspective, analyst says

Date: May 07, 2014
That comparison drew the attention of George Crump, lead analyst at Storage Switzerland, who termed it "one of the more controversial things" that came up at the conference.
"It's a very interesting graph and it's a real statistic, [but] I question the value of showing it," Crump said. "We don't know the details. We don't know who the customer was. We don't know the load. We don't know how full it was. We don't know any of that.
"But what it's showing is really important," he explained. "It shows you [that] at some point the deduplication engine can become part of the problem. So what's happening is, as you get closer to capacity, you're managing more and more data and doing more and more comparison, so the speed at which the data about the data -- the metadata -- can be traversed and compared so you can get an efficient deduplication rate becomes very important."
Crump offered takeaways for all-flash-array users.
"One of the things, if I was an end user, I would pull from this is [that] you can't treat deduplication like a checkbox," he said. "Just the fact that everybody has it or eventually will have it is nice, but these engines are all a little different and will perform different under load and at capacity. As you scale, this is a database -- essentially a metadata table -- and the larger that database comes, the more data it has to track [and] the more it will impact performance."
According to Crump, "the chance of corruption in a [deduplication] metadata table could be very, very problematic. The real takeaway isn't necessarily who the competitor was, but making sure you do the right testing so you understand what the impact of deduplication will be in your environment."