Jim Duran, Director of the Vanderbilt Television News Archive, Vanderbilt University

May 21 2024

Universities Harness the Cloud for Digital Archives

Moving archival footage and documents off premises helps institutions protect their investments.

Jim Duran came to his job knowing there would be challenges ahead.

It was February 2018, and he’d just been appointed director of the Vanderbilt Television News Archive, one of the world’s largest collections of news broadcasts from national networks such as ABC, CBS and NBC. The archive had been growing quickly and steadily since the day it was created in 1968.

Until the early 2000s, Vanderbilt University Libraries, which manages the VTNA, had turned to its own shelves to store the physical tapes used to record the broadcasts. That changed when the National Science Foundation awarded the libraries a grant to digitize the collection.

“It was well over 30,000 tapes, and it took about five years to do it,” Duran says. The process entailed using video capture cards to convert analog signals from old playback machines into digital signals that a computer could record to disk.

When the project was finished, the digitized video archive amounted to about 150 terabytes of content. From there, the collection kept expanding as new digital video went straight to storage. By the time Duran joined the VTNA, it was somewhere around 170TB and counting.

“Our biggest issue was, we were running out of space,” he says. “We’d always relied on on-premises hardware and software, and it was obvious that was no longer sustainable.”

The archive arrived at a solution after consulting Vanderbilt’s central IT department. The tech team had been using a number of cloud services to manage important workloads for university researchers and others, and suggested moving the archive offsite and into the Amazon Web Services cloud.

“It would make management of the collection a lot easier because everything would be available at the command line,” Duran explains. “And adding storage wouldn’t be a problem because, being in the cloud, we just pay for what we use.”

Within a few weeks, the archive had transferred the entire collection. Today, Duran says, new VTNA files are automatically processed in the cloud, and his team is exploring a suite of artificial intelligence–enabled services to provide enhanced access to the archive.

Two cloud tools, for example, automatically create written abstracts for all of the collection’s broadcasts. Another service deploys machine learning to identify and capture unprogrammed breaking news, while a tool called Amazon Transcribe accurately converts recorded speech to text.

“That has really paid off for us,” Duran says, explaining how the transcription functionality has allowed the archive to add closed captioning to video. “And the best part is, with all of these things, we don’t have to purchase additional hardware. We budget for the space we need, but the infrastructure is taken care of.”
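
For readers curious what a captioning request of this kind might look like, the following is a minimal sketch, not the VTNA's actual workflow. It builds the parameters for Amazon Transcribe's StartTranscriptionJob operation and asks for WebVTT subtitle output. The helper name, the bucket names and the file path are hypothetical, and it assumes the video already sits in Amazon S3 storage.

```python
import re


def build_caption_job_request(media_uri, output_bucket, language_code="en-US"):
    """Build keyword arguments for Amazon Transcribe's StartTranscriptionJob.

    Hypothetical helper for illustration: media_uri is assumed to be an
    S3 URI whose file name ends in a media extension such as .mp4.
    """
    file_name = media_uri.rsplit("/", 1)[-1]
    if "." not in file_name:
        raise ValueError("media_uri must end in a file name with an extension")
    stem, _, extension = file_name.rpartition(".")

    # Transcribe job names may contain only letters, digits, '.', '_' and '-',
    # so every other character in the file name becomes a hyphen.
    job_name = re.sub(r"[^0-9A-Za-z._-]", "-", stem)

    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": language_code,
        "MediaFormat": extension.lower(),
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,
        # Ask the service to produce a WebVTT caption file with the transcript.
        "Subtitles": {"Formats": ["vtt"]},
    }


# Hypothetical bucket names and file path, for illustration only.
request = build_caption_job_request(
    "s3://example-archive/broadcasts/1968-08-05 evening news.mp4",
    "example-captions",
)
print(request["TranscriptionJobName"])

# Submitting the job would need the boto3 package and AWS credentials:
#   import boto3
#   boto3.client("transcribe").start_transcription_job(**request)
```

Building the request separately from the network call keeps the sketch runnable without an AWS account. An actual pipeline would also need to poll for job completion and attach the resulting caption file to the video.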

Cloud Storage Makes Digital Archives More Accessible

The VTNA has plenty of company when it comes to universities that are harnessing the cloud for their digital archives.

“It used to be that everything was on-prem just because that’s how it was done,” says Wayne Graham, CIO at the Council on Library and Information Resources. Universities have traditionally looked to their IT departments and asked them to create the programs and build the infrastructure required for their digital archives, he explains. “But then, what happens when your dedicated developers move on to other institutions? You lose knowledge of your code base, technologies change, and now you have this bucket of stuff that you may not have the resources to maintain.”

The cloud can be a solution for archive directors who, like Duran, may have concerns about sustainability. “It’s one of those things you have to weigh,” Graham says. “Maybe it’s more cost effective, or maybe it’s going to be easier to set up. Are there opportunities that the cloud can provide that you don’t have on-prem?”

It was the promise of the cloud that led Rice University’s Center for Research Computing (CRC) to rethink its hosting strategy for the institution’s SlaveVoyages archive. The university has turned to Oracle Cloud to host the world’s largest slave trade database.

The collaborative digital project compiles and makes publicly accessible records of the transatlantic slave trade between the 16th and 19th centuries.

Since the initiative was launched in the 1960s, researchers from Rice and several other institutions have cataloged and consolidated these records into a single repository. At first, they kept their information as handwritten notes. Later, they moved to desktop computers, CD-ROMs and, finally, local servers. Before they moved to the cloud a few years ago, the collection was hosted using infrastructure at Emory University.

The driving factors for the change, according to John Mulligan, the CRC’s humanities computing researcher and facilitator at the time, “were almost as much institutional as they were technological.”

Digital archives with a web presence often struggle to maintain their functionality over time, Mulligan explains. With multiple universities and other organizations sharing responsibilities for the SlaveVoyages site, pivoting to the cloud resolved the pressing problem of archive portability.

“The cloud became really appealing because if you wanted to move it from one host institution to another, you couldn’t guarantee that everyone would have what they needed for servers,” Mulligan says. On the other hand, he notes, “everyone has the ability to pay a cloud vendor to host the database for us.”

Digitizing Archival Materials Helps Preserve and Protect Them

Cost played a part in the decision to leverage the cloud for some of the Digital Collections at the University of Delaware. Managed by UD Library, Museums and Press, the collections are divided between two platforms: an institutional repository called UDSpace that contains text-based documents, and a public archive called Artstor that contains visual materials such as photographs and maps.

At less than a terabyte of data, UDSpace is relatively small and is hosted on a server in the library’s basement that is maintained by the library’s IT team. The Artstor collection lives entirely in the cloud and is managed through a subscription to a service from a nonprofit organization of the same name.

“We don’t necessarily have the staff or budget to handle hosting everything ourselves,” says UD’s Annie Johnson, associate university librarian for publishing, preservation, research and digital access.

UDSpace was established years ago, before cloud services became widely available, she explains. Because the library building has significant issues that stem from deferred maintenance, it’s become increasingly susceptible to leaks and occasional power failures. Keeping the bulk of its digital collections in the cloud means library IT staff have less to worry about should anything happen to the on-premises infrastructure.

The university’s Artstor archive, Johnson estimates, is up to 4TB and growing. Among the content in the collection: an assortment of 23 color lithographic prints of Civil War encampments and more than 2,000 postcards depicting local sites and attractions.

Johnson notes that her team “is always digitizing,” taking special collections materials from the library and either scanning or photographing individual items before adding metadata to the images.

“Our main goal is to make our special collections more accessible for researchers and the public,” she says. “That, and ensuring it’s in a format that hopefully can be preserved for a very long time.”

Photography by William DeShazer