Feb 10 2014

5 Tips for Data Deduplication in VDI

Master storage in a virtualized environment with these data dedupe tips.

Combining VDI and data deduplication to master storage seems like a match made in heaven: Both technologies reduce the amount of storage space required for data, and together, they can reduce system storage requirements by 95 percent or more.

Here are five tips to deploy data deduplication most effectively in a virtualized environment:

1. Manage available physical storage.

Beneath all those layers of virtualization lies real physical storage. Be sure to consider available physical storage before adding virtualized data, and set capacity alarms accordingly. That can be tricky when deduping data.

A 100GB array can easily hold 200GB of data, but only after that data has been deduplicated. The system needs enough free space to hold incoming data while it is being deduplicated, and often the data must be deduplicated in chunks that fit into the available space.
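The chunking idea can be sketched in a few lines of Python. This is only a rough planning model, not a vendor tool: the function name, the uniform deduplication ratio and the 4:1 figure in the example are all assumptions made for illustration. Real ratios vary by workload and should be measured.

```python
def plan_dedupe_batches(capacity_gb, logical_gb, dedupe_ratio,
                        reserve_gb=0.0, min_batch_gb=1.0):
    """Plan raw batch sizes (GB) for ingest-then-deduplicate storage.

    Assumed model (hypothetical): each batch lands on the array in raw
    form, is deduplicated down to batch / dedupe_ratio, and only then
    does the next batch arrive.
    """
    if dedupe_ratio < 1:
        raise ValueError("dedupe_ratio must be at least 1")
    if min_batch_gb <= 0:
        raise ValueError("min_batch_gb must be positive")

    usable_gb = capacity_gb - reserve_gb  # space left after the safety reserve
    stored_gb = 0.0                       # deduplicated data already on the array
    remaining_gb = logical_gb             # raw data still waiting to be ingested
    batches = []

    while remaining_gb > 0:
        free_gb = usable_gb - stored_gb
        batch_gb = min(free_gb, remaining_gb)
        # Stop when the free space is too small to make useful progress.
        if batch_gb < min_batch_gb and batch_gb < remaining_gb:
            raise ValueError("not enough free space to finish deduplication")
        batches.append(batch_gb)
        stored_gb += batch_gb / dedupe_ratio
        remaining_gb -= batch_gb

    return batches


# 200GB of desktop data on a 100GB array, assuming a 4:1 ratio.
print(plan_dedupe_batches(100, 200, 4))  # [100.0, 75.0, 25.0]
```

In this model the data cannot go in all at once: it takes three passes, and each pass is limited by whatever space the previous passes left free. At a ratio of exactly 2:1 the same call raises an error, because the array fills up before the last of the data can be staged.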

2. Set up separate computers to handle storage and compute jobs.

When designing the storage system, remember that deduplication requires a lot of computing horsepower. Trying to do it on a machine that is handling computing responsibilities will result in an unacceptable loss of performance.

Generally, it's not necessary to change infrastructure all that much. An exception: In a Windows environment, all virtual desktop files must be stored on a file server running a compatible OS, such as Windows Server 2012 R2 Preview.

3. Incorporate solid-state disks.

One bonus of deduplicated VDI storage is that it shrinks many files enough to fit on solid-state disks (SSDs), which deliver a considerable increase in performance. SSDs are as much as 10 times faster on reads and about four times faster on writes than hard disks. They are also more compact and consume less power.

4. Shrink volumes to fit.

SSDs do have some disadvantages: They are as much as 10 times more expensive per gigabyte stored, and their speed makes them an uncomfortable match with hard disks unless storage is reconfigured. In VDI deduplication, however, it is possible to shrink volumes to the point where they fit cost-effectively on SSDs. Because the size of the storage is greatly reduced, SSDs' cost becomes almost equal to, and sometimes less than, the cost of hard-disk storage for uncompressed data.
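The cost argument comes down to simple arithmetic, sketched here in Python. The per-gigabyte prices are hypothetical placeholders chosen only to give the 10-to-1 price gap mentioned above, and the function names are invented for this example.

```python
def ratio_from_reduction(reduction_percent):
    """Convert a space reduction (e.g. 95 percent) into a ratio (e.g. 20:1)."""
    return 100.0 / (100.0 - reduction_percent)


def cost_per_logical_gb(price_per_physical_gb, dedupe_ratio):
    """Price of storing 1GB of original data after it has been reduced."""
    return price_per_physical_gb / dedupe_ratio


def break_even_ratio(ssd_price_per_gb, hdd_price_per_gb):
    """Reduction ratio at which deduplicated SSD matches raw hard disk."""
    return ssd_price_per_gb / hdd_price_per_gb


# Hypothetical prices: SSD costs 10 times as much per physical gigabyte.
HDD_PRICE = 0.50
SSD_PRICE = 5.00

ratio = ratio_from_reduction(95)          # a 95 percent reduction is about 20:1
ssd_cost = cost_per_logical_gb(SSD_PRICE, ratio)

print("break-even ratio:", break_even_ratio(SSD_PRICE, HDD_PRICE))
print("SSD cost per logical GB at 95 percent reduction:", round(ssd_cost, 2))
print("cheaper than raw hard disk:", ssd_cost < HDD_PRICE)
```

Under these assumed prices, SSD storage breaks even with hard disks for unreduced data at a 10:1 reduction and comes out cheaper at anything above that.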

5. Allow for performance differences.

While a combined VDI–SSD–data dedupe system will outperform conventional hard-disk storage, it won't run as quickly as the theoretical maximum. That's because as data is read, it must be "rehydrated," so to speak, or reassembled from the deduplicated and virtualized data that is actually in storage.

The process typically exacts a performance penalty of about 10 percent, which usually isn't a problem, but is something that administrators should keep in mind.
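To put that penalty in perspective, here is a rough back-of-the-envelope calculation in Python. The 100MB-per-second hard-disk baseline is a hypothetical figure, and the function name is invented for this example. The 10x read speedup and 10 percent penalty are the upper-bound and typical figures cited above.

```python
def effective_read_rate(hdd_rate, ssd_speedup, penalty_percent):
    """Estimate read throughput of deduplicated SSD storage.

    hdd_rate        -- baseline hard-disk read rate (any unit)
    ssd_speedup     -- how many times faster the SSD reads than the hard disk
    penalty_percent -- share of throughput lost to rehydration
    """
    raw_ssd_rate = hdd_rate * ssd_speedup
    return raw_ssd_rate * (100 - penalty_percent) / 100


HDD_RATE_MB_S = 100  # hypothetical baseline, not a measured value

rate = effective_read_rate(HDD_RATE_MB_S, ssd_speedup=10, penalty_percent=10)
print(rate)                  # 900.0
print(rate / HDD_RATE_MB_S)  # 9.0
```

Even after the rehydration penalty, reads in this best-case scenario are about nine times faster than on the hard-disk baseline.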