Most large enterprise environments employ extremely expensive storage arrays or even full data centers in order to back up their files. But for smaller or even midsize colleges and universities, going that route proves both prohibitively expensive and inefficient. The data sets that higher education typically collects are extremely important, yet often smaller than what a typical Fortune 500 company might amass. The EMC Data Domain DD2200 system was designed with such midsize needs in mind.
Sized to Fit
The DD2200 offers many of the same backup features found in the largest data centers, but shrinks everything else into an easy-to-install, rack-mountable, network-attached storage (NAS) device.
The DD2200 we tested consisted of a drive array with 12 bays. Seven of those bays were filled with 2-terabyte hard drives, giving a total capacity of 14TB overall and plenty of room to expand later.
NAS devices, once constrained to the personal storage realm, have expanded and evolved through the years, and the DD2200 follows that trend. Setup is as simple as attaching a Gigabit Ethernet cable and selecting desired backup options. Our team configured the RAID level we wanted and assigned backup tasks from a hosted test network in less than an hour. Setting up the DD2200 as an offsite backup at a remote location — another great use case — does not require any more setup time.
Big Data in a Small Space
In addition to having a whopping 14 terabytes of data storage capacity out of the box, the EMC Data Domain DD2200 system makes use of EMC’s deduplication technology to shrink data down into smaller chunks. That has the potential to expand available space even further. But not all deduplication is equal, so we ran some tests to see just how much the DD2200 could really shrink things.
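To make the idea concrete, here is a minimal sketch of how deduplication can work in principle. It is not EMC's implementation: the fixed 4KB chunk size, the SHA-256 fingerprints and the function names `dedup_store` and `restore` are all assumptions chosen for illustration. Each chunk is fingerprinted, and a chunk that has been seen before is stored only once.

```python
import hashlib

def dedup_store(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and keep one copy of each unique chunk.

    Illustrative only: the chunk size and hash choice are assumptions,
    not details of any real product.
    """
    store = {}    # fingerprint -> chunk bytes (each unique chunk stored once)
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[d] for d in recipe)

# Toy data: the same "header" block repeats between two different bodies,
# loosely mimicking redundant email headers.
header = b"H" * 4096
data = header + b"A" * 4096 + header + b"B" * 4096 + header

store, recipe = dedup_store(data)
stored_bytes = sum(len(c) for c in store.values())
print(f"original: {len(data)} bytes, stored: {stored_bytes} bytes")
```

In this toy case, five chunks go in but only three unique chunks are kept, which is why data with lots of repeated content tends to shrink the most.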
Deduplication is important because most users don’t run their network-attached storage in a RAID 0 configuration, which spreads data out evenly across the entire array of disks. While that improves performance, the failure of one disk kills all the data. Most users will set backup to RAID 5 so that one disk can be lost with no impact to the stored files, though some may go for RAID 1 and true mirroring. (That’s probably overkill on a unit like the DD2200 with seven disks.) Regardless, setting a RAID level other than zero will result in a storage capacity loss right from the start. Deduplication can alleviate that somewhat.
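The capacity trade-off can be sketched with some quick arithmetic. The helper below, `usable_capacity_tb`, is a hypothetical name, and its formulas are the usual textbook approximations: they ignore file system overhead and hot spares, and they treat RAID 1 simply as storing every block twice.

```python
def usable_capacity_tb(disks: int, disk_tb: float, raid_level: int) -> float:
    """Approximate usable capacity for a few common RAID levels.

    Textbook simplifications only; real arrays reserve extra space.
    """
    raw = disks * disk_tb
    if raid_level == 0:
        return raw                      # striping: all space usable, no redundancy
    if raid_level == 1:
        return raw / 2                  # mirroring, simplified as two copies of everything
    if raid_level == 5:
        return (disks - 1) * disk_tb    # one disk's worth of space goes to parity
    raise ValueError(f"RAID level {raid_level} is not modeled in this sketch")

# The tested configuration: seven 2TB drives.
for level in (0, 1, 5):
    print(f"RAID {level}: {usable_capacity_tb(7, 2.0, level)} TB usable")
```

Under these assumptions, seven 2TB drives in RAID 5 give up one drive's worth of space to parity, leaving 12TB of the raw 14TB.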
For our testing, we used a RAID 5 configuration, which resulted in a loss of almost 2TB from the total capacity. We sent a variety of files totaling 1TB over to the DD2200. The files consisted of emails, which generally can be highly compressed with deduplication because of all the redundant header information; files from databases and spreadsheets (about average in compressibility); and even some graphics files that don’t compress much at all. We varied the ratio of those file types to get an average compression rate for a random sampling of data that might be found in any higher education organization.
In the first test, the DD2200 was able to compress 1TB into 752 gigabytes of information, a reduction of almost 25 percent. In the second test, which contained more graphics files, the data still shrank to 908GB, a savings of almost 10 percent. The DD2200 did best when email, documents and spreadsheets made up the bulk of the backup data, getting the resulting size down to 701GB, just shy of an incredible 30 percent reduction.
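The percentages follow directly from the stored sizes. The short sketch below reproduces them, assuming 1TB is counted as 1,000GB; the function name `reduction_percent` and the test labels are ours, for illustration.

```python
def reduction_percent(original_gb: float, stored_gb: float) -> float:
    """Percentage by which the stored size is smaller than the original."""
    return (original_gb - stored_gb) / original_gb * 100

ORIGINAL_GB = 1000  # assumption: 1TB counted as 1,000GB

# Stored sizes reported in the three tests.
results_gb = {
    "test 1": 752,
    "test 2 (more graphics)": 908,
    "test 3 (mostly text)": 701,
}

for name, stored in results_gb.items():
    pct = reduction_percent(ORIGINAL_GB, stored)
    print(f"{name}: {stored}GB stored, {pct:.1f} percent smaller")
```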
Based on these tests, it’s reasonable to expect a reduction in file size of somewhere between about 10 percent and about 30 percent, depending on the mix of files. RAID 5 on seven 2TB drives gives up roughly 14 percent of the raw capacity, so in most cases that reduction is enough to compensate, especially if most of the backup files are text-based in some way. That’s really good news for deduplication technology and can ensure that those 14TB of storage in the DD2200 come pretty close to the mark, even if additional protections like a RAID level with redundancy are added.
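A rough way to check that claim is to ask how much original data would fit on the array at each reduction rate. The sketch below assumes 12TB usable after RAID 5 (seven 2TB drives minus one drive's worth of parity) and assumes the reduction rates measured on our 1TB sample would hold across a full array, which is not guaranteed. The function name `logical_capacity_tb` is hypothetical.

```python
def logical_capacity_tb(usable_tb: float, reduction: float) -> float:
    """Original (pre-reduction) data that fits in usable_tb of physical space.

    reduction is a fraction, e.g. 0.30 for a 30 percent size reduction.
    """
    return usable_tb / (1 - reduction)

RAW_TB = 14.0      # seven 2TB drives
USABLE_TB = 12.0   # assumption: RAID 5 gives up one drive's worth of space

for reduction in (0.10, 0.25, 0.30):
    fits = logical_capacity_tb(USABLE_TB, reduction)
    print(f"{reduction:.0%} reduction: about {fits:.2f}TB of original data fits")

# Reduction needed for 12TB of physical space to hold a full 14TB of data.
break_even = 1 - USABLE_TB / RAW_TB
print(f"break-even reduction: {break_even:.1%}")
```

Under these assumptions, the break-even point sits at a little over 14 percent. A graphics-heavy backup at about 10 percent lands slightly short of the raw 14TB, while text-heavy backups at 25 to 30 percent more than make up for the RAID 5 overhead.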
The DD2200 uses extremely fast EMC deduplication technology. After the initial data dump, which sent every file over the wire to the device, subsequent incremental backups took less time than we’ve seen in other deduplication technology. The DD2200 compresses the backup data so that it requires less space, further stretching out the already impressive 14TB capacity.
Should something go wrong, admins can pull out a disk for replacement without powering down the unit or halting network activity. Depending on the RAID level, that ensures complete redundancy, even in the event of a total drive failure, giving users emergency backup for their emergency backup device. The DD2200 also features in-line write and read verification, continuous fault detection and self-healing, just in case.
An EMC Data Domain DD2200 system can provide higher education institutions with the safe, advanced backup tool they need without the complexity of a 300-level class.