Jul 15 2025
Security

What To Know About Dark Data in Higher Education

How can higher learning institutions identify “dark data” and determine whether it is useful?

According to Gartner, dark data is “information assets organizations collect, process and store during regular business activities but generally fail to use for other purposes.”

A 2025 Splunk report, which surveyed multiple industries including the education sector, found that 55% of organizations said they have dark data. In some cases, this data may be useful. In others, it creates “noise” or turns out to be irrelevant or obsolete.

The challenge for IT departments in higher education is determining what part of this unexploited dark data is useful and what should be discarded.

The Risks of Staying in the Dark About Valuable Data

Higher learning institutions continue to face unique strategic challenges, but it can be difficult for organizations to make decisions when they lack access to the data they need.

For example, enrollments are falling. Certain contingents of students (notably males) are opting to pursue less expensive “bootcamp” and certification programs that spare them the cost of skyrocketing college tuition, which rose 169% between 1980 and 2020.

Performing analytics on past student demographics and enrollment information that can only be found in dark data could assist administrators in making decisions, such as:

  • When does it make sense to offer financial aid to offset tuition costs?
  • Who (based on past student performance records) is most likely to succeed as a student?
  • How relevant is the university’s curriculum? Rising and falling enrollments by course and even by instructor, along with the percentage of graduating students placed in jobs, can help answer this.

Equally important is a university’s donor network. Who is donating, and who was donating in the past? Are there certain characteristics that your donors share, such as being an alum or being in a certain profession or geographical area?

Contemporary records can answer some of these questions but not all. This is where dark data can be plumbed and mined for relevant information.

The Contrarian Challenge of Too Much Data

IT departments must deliver mission-critical dark data to those who need it — but what about the stored dark data that might not be usable?

The problem begins with the many data silos that develop when different academic departments store their own data. An already difficult situation can get more complicated when departments sign up for cloud services that central IT might not even know about.

There is also legal e-discovery, governance and compliance data that is accumulated and saved, often past the time boundaries required for this data to be maintained.

And then, there are the paper-based records that haven’t been digitized — and perhaps never will be.

As academic institutions continue their digital transformation journeys, automated business processes, computer network logs and other outputs generate a flurry of new data that is often useless and unnecessary but nonetheless gets stored because everyone is afraid to delete it.

“Data is like garbage,” goes a saying often attributed to Mark Twain. “You’d better know what you are going to do with it before you collect it.”

How To Act On Dark Data and Take Out the Trash

Dark data is a universal problem, so every organization must devise ways to work with it. Here are seven best practices:

  1. Perform a data audit. Data that is incoming and stored in IT should be evaluated for usefulness, as should data in every department’s data silo and cloud storage repositories. Unless an organization knows how much data it technically has under management (even if that data is not actively being managed), it cannot tell how much of that data it is actually using.
  2. Evaluate data for usefulness. Once all known data and data sources are identified, evaluate them. Is the data being used, or is it stagnant data that is being maintained but is never used? Is the data of high quality: Has it been vetted, properly secured and stored, and normalized so it can work with other data?
  3. Categorize the data. Is the data value-added, obsolete or useless? Which data should be retained, and which should be jettisoned?
  4. Review all findings with management and department heads. Once data has been evaluated for usefulness, hold meetings with upper and middle managers to secure their approvals for deleting useless data.
  5. Develop an automated process for deleting useless data. Guidelines can be set for purging data that becomes obsolete. If the legal requirement for storing financial data is seven years, an automated process can delete any financial data older than seven years. The same goes for compliance and legal discovery data: Once data exceeds its required retention period, it should be deleted.
  6. Don’t forget nondigital data. Many higher ed organizations still have paper-based files in physical storage that haven’t been digitized. What data needs to be retained, and what data doesn’t?
  7. Define and revisit data retention policies annually. Data safekeeping requirements change. At a minimum, organizations should review data retention policies annually and make any needed updates.
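The categorizing and automated-purge practices above can be illustrated with a short Python sketch. It is only a rough illustration, not a production tool: the `Record` fields, the category labels, the three-year e-discovery period and the two-year "stagnant" threshold are all assumptions made for the example. Only the seven-year figure for financial data comes from the article, and real retention periods should come from legal counsel and the institution's own policy.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention periods in years. The seven-year financial figure
# mirrors the article's example; the e-discovery value is an assumption.
RETENTION_YEARS = {"financial": 7, "ediscovery": 3}

# Assumed threshold: data untouched for this long is treated as stagnant.
STALE_AFTER = timedelta(days=2 * 365)


@dataclass
class Record:
    name: str
    category: str         # hypothetical label such as "financial"
    created: date
    last_accessed: date


def years_ago(today: date, years: int) -> date:
    """Return the date `years` before `today`, mapping Feb 29 to Feb 28."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        return today.replace(year=today.year - years, day=28)


def classify(record: Record, today: date) -> str:
    """Label one record as 'purge', 'review' or 'keep'."""
    limit = RETENTION_YEARS.get(record.category)
    if limit is not None and record.created < years_ago(today, limit):
        return "purge"     # older than its retention period
    if today - record.last_accessed > STALE_AFTER:
        return "review"    # stagnant: needs a human decision
    return "keep"
```

The sketch only labels records and never deletes anything. That is deliberate: it leaves room for the management review described in step 4 before any purge actually runs.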