May 14 2012

When Deduping Primary Data Makes Sense

The University of North Carolina dedupes primary storage to manage its growing data demands.

IT departments tend to shy away from deduplicating primary storage because of its potential performance impact. They fear the trade-off between the time it takes to identify and dedupe production data and the eventual gains in bandwidth and disk capacity.

However, the University of North Carolina at Chapel Hill (UNC) says today’s relentless storage demands are pushing aside any wariness of the technology and its possible productivity slowdowns.

“We have seen a proliferation of storage utilization by most applications as well as backup, so if we can reduce the number of blocks that have to be stored and backed up, we’ll save money,” says Michael Barker, assistant vice chancellor for infrastructure and operations and chief technology officer at UNC.

Greg Schulz, founder and senior adviser to the Server and StorageIO Group, says data that is not time-sensitive can be a candidate for primary storage deduplication. For instance, virtual desktop infrastructure, registration information and classroom records can take up space and present an opportunity to trade some data reduction time for storage space.

“Instead of waiting till data is sent to backup systems to be reduced, higher education institutions can perform that task on the source side, realizing immediate space-saving benefits,” Schulz says.

UNC uses NetApp’s FAS3270 appliance to eliminate duplicate blocks within primary storage. The savings goal is to reclaim 20 percent of capacity through data deduplication. The university also uses thin provisioning and flexible cloning techniques to drive further efficiencies. As it adds more applications to the deduplication pool, the university anticipates reaching a 50 percent return on usable space. “We expect our 280-terabyte spinning file system to present itself to our 27,000 students, faculty and staff as 400TB of available space,” Barker says.
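The mechanics behind those numbers can be sketched in a few lines: a block-level deduper hashes fixed-size blocks, keeps one copy of each unique block, and stores only references for the repeats. The block size, function names, and sample data below are illustrative, not a description of NetApp's actual implementation; the final comment applies the article's figures (280TB physical presenting as 400TB logical) to show how duplicate-block elimination inflates usable capacity.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size; real systems vary


def dedupe(data: bytes):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns (store, refs): the deduped block store and the ordered list of
    hashes needed to reconstruct the original data.
    """
    store = {}  # hash -> unique block contents
    refs = []   # ordered hashes that rebuild the original stream
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        h = hashlib.sha256(block).hexdigest()
        store.setdefault(h, block)  # store the block only on first sight
        refs.append(h)
    return store, refs


def duplicate_fraction(data: bytes) -> float:
    """Fraction of blocks that were duplicates and need no extra storage."""
    store, refs = dedupe(data)
    return 1 - len(store) / len(refs)


# Example: a stream of 10 blocks with only 2 distinct contents.
data = (b"A" * BLOCK_SIZE) * 7 + (b"B" * BLOCK_SIZE) * 3
print(f"{duplicate_fraction(data):.0%} of blocks were duplicates")  # prints "80% of blocks were duplicates"

# Capacity arithmetic using the article's figures: if roughly 30% of logical
# blocks are duplicates, 280TB of physical disk can present as
# 280 / (1 - 0.30) = 400TB of logical space.
print(f"{280 / (1 - 0.30):.0f}TB")  # prints "400TB"
```

The reconstruction path is the inverse: walking `refs` and concatenating the stored blocks yields the original data, which is why deduplication is lossless.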