DataCore Software Chairman Ziya Aral, a founder of the company specializing in storage virtualization software, defines software-defined storage in its simplest terms as a way of separating the hardware layer from the software. Aral claims his company has been doing this for years, long before software-defined storage became such a widely used phrase.
In this interview with SearchSolidStateStorage, Aral describes why this approach is a good fit for solid-state drives (SSDs) and how it can help SSD performance.
You have called SSDs a revolutionary, yet complicated, technology. Why is it so important, and why is it so complicated?
Ziya Aral: SSD is important because it is fast. For most of our time in this industry, storage has meant disks. Disks are mechanical and, as a result, they're slow. There's an entire complicated fabric for making them fast -- caching and all kinds of lore that we apply to the devices. All of a sudden, here come SSDs, and they're transparent. A customer can simply stick one in a box -- and all of a sudden, it's faster. You boot Windows. You boot Linux. It happens in seconds. Whoa! This is great. Fast storage is very important.
Storage is the black sheep of the computer family. It's always the one lagging behind, because it's fundamentally mechanical. It's the one that slows down applications. Once upon a time, a third of all applications were storage-bound. Now it's probably 95%. Everything else got faster. We didn't.
The result is that any innovation that speeds up mass storage is wonderful, and SSDs speed up mass storage in a practical, commercially viable way. That's the good news. The bad news is they don't work like disk drives. They are complicated.
So if you want to read from an SSD, that's wonderful, but if you try to write to an SSD, it's not as fast. Worse, SSDs don't like being written to. I'm sorry to all my friends in the industry, but they don't. You know it. I know it. They burn out, and they burn out pretty quickly by disk standards.
So now, here's the complexity. SSDs aren't perfect for everything. They're still too expensive to carry the bulk of your storage. You've still got to think about what you want to do with them in a larger architecture.
Now there are people in the industry, and we try to partner with them, people like Fusion-io, who I think are brilliant. They focus SSDs on a specific application, databases in this case, and they moderate SSDs with two technologies that actually make SSDs perform. One is software. They run caching software in front of their SSD. The second is DRAM. DRAM has a great advantage in that it doesn't care how many times you write to it.
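To illustrate why putting DRAM and caching software in front of flash eases the write problem, here is a minimal Python sketch of a write-coalescing buffer. It is not Fusion-io's or DataCore's design. The class name `CoalescingWriteCache`, its methods, and the plain dictionary standing in for the SSD are all hypothetical, chosen only to show the idea that repeated writes to the same block can be absorbed in memory before one copy reaches the device.

```python
class CoalescingWriteCache:
    """Hypothetical DRAM-style write buffer in front of a wear-limited device."""

    def __init__(self, backing):
        self.backing = backing      # dict standing in for the SSD (assumption)
        self.dirty = {}             # block number -> latest data held in memory
        self.device_writes = 0      # writes that actually reached the device

    def write(self, block, data):
        # Overwrites of the same block are absorbed in memory.
        self.dirty[block] = data

    def read(self, block):
        # Serve the newest copy: memory first, then the device.
        if block in self.dirty:
            return self.dirty[block]
        return self.backing.get(block)

    def flush(self):
        # Each dirty block costs one device write, however often it was rewritten.
        for block, data in self.dirty.items():
            self.backing[block] = data
            self.device_writes += 1
        self.dirty.clear()


ssd = {}
cache = CoalescingWriteCache(ssd)
for i in range(1000):
    cache.write(7, f"version-{i}")   # 1,000 updates to the same block
cache.flush()
print(cache.device_writes)           # 1
```

In this toy run, a thousand updates to one block cost the simulated SSD a single write. A real cache would also have to bound its memory use and protect unflushed data against power loss, which this sketch ignores.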
Aside from Fusion-io, does anyone else stand out for you?
Aral: Not for me. I mean, [some] vendors stand out. I love those Samsung guys. But they build a commodity system underlying everyone's SSDs. So, from that standpoint, they don't play at the strategic level yet. Maybe someday they will.
And how big of an impact will software-defined storage have on the SSD market and the performance of solid-state drives?
Aral: Huge. It's huge because it's hard enough to make disk drives work [laughs] and keep the application going, and the whole server virtualization market seems to have gotten stuck on tier-one applications and on virtual desktops.
So along comes this great storage technology. It supplements the conventional disk drive industry in a profound way. Integrated with DRAM, it seems to be able to work, but all of it defeats the schemes that we have been working on for years and years and years.
For example, to get the proper advantage of SSDs, you want the SSD to be local to the application. Most SSDs work that way. Now people are trying to build hybrid arrays and arrays of SSDs. That really sort of loses the advantage.
There's a physics to going over a wire … a lot of the advantage of the SSD disappears in that process. So now we are talking about moving a lot of the software infrastructure into the server and into the network. Well, why [move it] into the network? Because you don't want SSDs written to.
You want the longest delays possible, but long write delays mean that the data has to be in at least two locations, which means that the data has to go over a wire. Two ends of the cycle -- direct reads and asynchronous writes -- are happening at two different locations, and the software is infinitely portable. The software that glues those two phases together has to sit on both sides of the wire.
You have to be able to do the writes remotely and do the reads locally. You have to be able to divide the data. You have to be able to divide the devices. The communication between those devices, between network locations and a local location, has to happen across a software bridge. That means that the software lives on both sides of it.
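The arrangement Aral describes, with local reads, delayed writes, and a second copy across the wire, can be sketched roughly as follows. This is an illustrative Python model only, not DataCore's software. `Node`, `MirroredVolume`, and their methods are hypothetical names, and dictionaries stand in for DRAM and flash on each side of the wire.

```python
class Node:
    """One side of the wire: a DRAM write buffer plus an SSD (both dicts here)."""

    def __init__(self, name):
        self.name = name
        self.dram = {}        # pending writes not yet flushed to flash
        self.ssd = {}         # stand-in for the flash device
        self.ssd_writes = 0   # writes that actually reached flash

    def stage(self, block, data):
        self.dram[block] = data

    def flush(self):
        # Deferred write-back: one flash write per dirty block.
        for block, data in self.dram.items():
            self.ssd[block] = data
            self.ssd_writes += 1
        self.dram.clear()


class MirroredVolume:
    """Hypothetical software bridge that lives on both sides of the wire."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote

    def write(self, block, data):
        # Hold the write in memory on both nodes so the flash write can be
        # postponed without leaving the data in only one place.
        self.local.stage(block, data)
        self.remote.stage(block, data)

    def read(self, block):
        # Reads never cross the wire: local DRAM first, then the local SSD.
        if block in self.local.dram:
            return self.local.dram[block]
        return self.local.ssd.get(block)


local, remote = Node("app-server"), Node("peer")
vol = MirroredVolume(local, remote)
vol.write(1, "a")
vol.write(1, "b")
pending_read = vol.read(1)            # served from local DRAM
writes_before_flush = local.ssd_writes
local.flush()                         # the delayed write finally reaches flash
```

Here the second copy on `remote` is what makes the long write delay tolerable: until `flush()` runs, the only durable protection for block 1 is that it exists on two nodes. A real system would also need ordering, acknowledgment, and failure handling across the wire, none of which is modeled.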
Software that sits on your storage engine is worthless. It has to live on the Windows PC. It has to live on the front end of that environment, as well as the back end. It also has to organize that environment so that, in the front end, most of the data is coming from DRAM, because it's the fastest. It's a hell of a lot faster than SSD.