By building deduplication and backup into the core of the SimpliVity platform, we gain many advantages: saving disk capacity, eliminating unnecessary disk IO, and improving data movement across WAN connections. Many people incorrectly assume that our backup capabilities resemble the backup-like functionality offered in traditional storage arrays. In fact, our method provides better data integrity, faster backup and restore, and better utilization of bandwidth across WAN links.
Oftentimes, a storage array provides backup-like functionality by taking a LUN- or share-level snapshot and replicating that snapshot offsite. SimpliVity backups do not use snapshot technology. Instead, our backup capability is rooted in our underlying Data Virtualization Platform, which deduplicates, compresses, and optimizes all data once and forever across all phases of the data lifecycle, including all local and remote backup data.
When all data is deduplicated and VMDKs are represented by metadata, creating a point-in-time backup is a simple matter of creating a copy of the metadata. This creates a more efficient storage layer since no read or write IOPS need to occur, and allows for the creation of a full and independent backup in seconds.
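To make this concrete, here is a minimal sketch of the idea, not SimpliVity's actual implementation: a toy content-addressed store where blocks are keyed by hash and a VM is represented purely by metadata (an ordered list of block hashes). All class and method names are hypothetical. A backup is just a copy of that hash list, so no block reads or writes occur.

```python
import hashlib

BLOCK = 4096  # illustrative block size

class DedupStore:
    """Toy content-addressed block store with per-VM metadata."""

    def __init__(self):
        self.blocks = {}        # block hash -> block data (stored once)
        self.vm_metadata = {}   # VM name -> ordered list of block hashes

    def write_vm(self, vm, data):
        """Chunk the VM image; store each unique block only once."""
        hashes = []
        for i in range(0, len(data), BLOCK):
            chunk = data[i:i + BLOCK]
            h = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(h, chunk)  # dedup: skip known blocks
            hashes.append(h)
        self.vm_metadata[vm] = hashes

    def backup(self, vm):
        """A point-in-time backup is just a copy of the metadata --
        no block IO happens here."""
        return list(self.vm_metadata[vm])

    def restore(self, backup_hashes):
        """Rehydrate a full image from a backup's hash list."""
        return b"".join(self.blocks[h] for h in backup_hashes)
```

Because the backup is an independent copy of the metadata, later writes to the VM cannot alter what the backup references.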
To migrate backups to a remote data center or the AWS cloud, we don’t replicate data like a traditional array would, where every changed block is moved across the replication link. The data that a SimpliVity Federation moves is based on comparing the metadata of the sending node to the metadata of the receiving node. Only blocks that the receiving node doesn’t already have stored are sent.
Let’s use the analogy of a deck of cards (thanks to SimpliVity Solution Architect Damien Bowersock for this analogy). I have a deck of 52 cards that represents a VM. I send you the manifest of cards, and you reply that you have all but the 9 of Hearts and the Ace of Spades. I now only have to send you two cards, and you can build a map that includes the 50 cards you already have plus the two I sent you.
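The card exchange above is essentially a set difference over block hashes. A toy sketch of that exchange follows; the function name and data shapes are hypothetical, not the Federation's actual wire protocol.

```python
def replicate(backup_hashes, source_blocks, dest_blocks):
    """Ship only the blocks the destination doesn't already hold.

    backup_hashes: ordered manifest of block hashes (the 'deck of cards')
    source_blocks: hash -> block data at the sending site
    dest_blocks:   hash -> block data at the receiving site
    """
    # The destination replies with the hashes it is missing (the two cards).
    missing = [h for h in set(backup_hashes) if h not in dest_blocks]
    # The source sends only those blocks across the WAN.
    for h in missing:
        dest_blocks[h] = source_blocks[h]
    return missing  # what actually crossed the wire
```

If the destination already holds 50 of the 52 blocks, only two cross the link; the manifest itself lets the receiver assemble the full backup locally.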
If one of the 52 cards were silently corrupted on disk between two backups, unbeknownst to the metadata layer, the corrupted block would never be replicated to the secondary site, because replication is driven by metadata rather than by re-reading blocks. Of course, if the guest OS itself wrote corrupted data, the primary site would detect it as a new block that couldn't be deduplicated; it would compress, optimize, and write that block to the HDDs and update the metadata. That block would then become one of the cards that needs to be moved to the remote data center. Even so, the corrupted data is a new block and doesn't affect the good block that the previous backup(s) still reference.
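This point can be illustrated with plain hashing, independent of any vendor implementation: when the guest writes new data, only the changed block gets a new hash, and earlier backup manifests still point at the original block.

```python
import hashlib

# Two versions of a two-block "VM": the second block changes between backups,
# for example because the guest OS wrote new (possibly bad) data.
v1 = [b"A" * 4096, b"B" * 4096]
v2 = [b"A" * 4096, b"C" * 4096]

def h(block):
    """Content hash used as the block's identity."""
    return hashlib.sha256(block).hexdigest()

backup1 = [h(b) for b in v1]  # manifest taken before the change
backup2 = [h(b) for b in v2]  # manifest taken after the change

# Only the changed block is new; the earlier manifest still references
# the original, untouched block.
new_hashes = [x for x in backup2 if x not in backup1]
```

Restoring from `backup1` would still yield the original blocks, since its manifest never pointed at the newly written data.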
Array-based “backups” built on snapshots are oftentimes dependent on previous snapshots. That chain of dependencies must be tracked and managed by the system to ensure backup data can be restored properly. A SimpliVity backup is a fully independent point-in-time backup with no dependencies other than the metadata and the actual blocks on disk, both of which are protected across nodes and locally against multiple disk failures.
This is how we differentiate from SAN-based “snap and replicate backups.” All of this results in more efficient movement of backups, more reliable restores, and less impact on the production environment.