Roughly 10 years ago, two major storage infrastructure companies merged. The smaller of the two, a storage manufacturer and a registered CLEC, acknowledged that one of the most painful customer complaints was the recurring cost of bandwidth. They recognized that most customers wanted the ability to easily move data between sites, but couldn’t afford the fat pipes required to complete the task quickly. They capitalized on this by coupling the bandwidth with the infrastructure.
Fast forward to now. The bandwidth available to customers and its cost have remained relatively fixed. However, the amount of business and user data to be transferred has grown at staggering rates, and overall storage capacity has grown by orders of magnitude. Just as it was 10 years ago, it is still cost-prohibitive for many customers to effectively protect their data sets in multiple geographies. We at SimpliVity realize that this is a fundamental tool that most IT organizations would like to have in their bag.
In order to cost-effectively move data between multiple sites, you have to shrink the data footprint on the wire. To do that completely, you have to solve two distinct problems. The first problem, and the focus of this post, is decoupling the data on disk from the volume itself. The second problem is how to stop sending data from one site to another that is already at the remote site. We will address the second problem in “Replication Part 2.”
We live in a virtualized world now. While not all customers are 100 percent virtualized, nearly 100 percent of the customers I meet with have a “virtualize first” policy. It’s only a matter of time until we become fully virtualized. One of our key design elements was to focus on the VM and how it interacts with the infrastructure. When looking closely at the VM, it becomes very clear that most storage systems available today are relatively ignorant of the virtual machine. They only care about a volume or a LUN. From the storage perspective, the world looks like LUN -> Datastore -> VM. But the hypervisor sees VM -> Datastore -> LUN. Likewise, the administrator wants to protect VMs and applications, not LUNs or volumes.
When we apply this ignorance to replication, we get an explosion of data that has no relevance to the application or VM that we want to protect. To better explain this phenomenon we can use an online retailer as an example. If the retailer functioned like a traditional storage vendor, each time you placed an order a truck would pull up in front of your house loaded with the entire warehouse. Clearly, this is both inefficient and ignorant of the actual task, which is delivering the product you actually ordered. The efficient way to satisfy the order is to ship only the product you ordered, not the ones you didn’t.
This is what OmniStack does. OmniStack allows us to apply data protection policies to the VM directly. By the way, this capability was one of the reasons SimpliVity won “Best of Show” at VMworld 2013. No longer do you have to manage the VM’s location based on the constraints and parameters enforced by traditional storage arrays. This improved model eliminates all of the extra data transfer associated with other VMs in the same datastore (see illustration below). We have immediately cut down the data on the wire by a significant portion. SimpliVity OmniCube, powered by OmniStack, is truly unique with regard to VM-centric management coupled with data-efficient and bandwidth-efficient replication for DR.
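To make the difference concrete, here is a minimal sketch of the wire-traffic math. The VM names and sizes are made up for illustration, and this is not SimpliVity's actual implementation; it only contrasts replicating an entire LUN-backed datastore against replicating a single protected VM.

```python
GB = 1024 ** 3

# Hypothetical datastore (backed by one LUN) holding several VMs and their sizes.
datastore = {
    "web-01": 40 * GB,
    "db-01": 250 * GB,
    "test-99": 120 * GB,
}

def lun_level_transfer(datastore):
    """LUN-centric replication: the whole volume moves, every VM included."""
    return sum(datastore.values())

def vm_level_transfer(datastore, vm_name):
    """VM-centric replication: only the protected VM's data moves."""
    return datastore[vm_name]

if __name__ == "__main__":
    # Protecting only db-01: the LUN-level approach still ships all three VMs.
    print(f"LUN-level: {lun_level_transfer(datastore) / GB:.0f} GB on the wire")
    print(f"VM-level:  {vm_level_transfer(datastore, 'db-01') / GB:.0f} GB on the wire")
```

In this toy example, protecting one 250 GB VM at LUN granularity drags 410 GB across the wire; at VM granularity, only the 250 GB you asked for moves, before any deduplication or compression is even applied.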
Now we can move on to a much more complicated task: eliminating redundant data transfer.