Mar 06 2026
Data Analytics

Shadow Data in Higher Education: Governing Unsanctioned Data Before It Becomes a FERPA Problem

As data moves beyond institutional systems, higher education faces a growing challenge with shadow data. Here’s how IT leaders can identify, manage and govern it before it creates compliance risks.

In higher education, data is no longer confined to institutional systems. It moves across departments, devices and cloud platforms, often without visibility from IT. The result is a growing challenge: shadow data.

Unlike shadow IT, which involves unsanctioned applications, shadow data refers to institutional data that is captured, stored or shared outside approved systems. As colleges and universities expand their use of analytics, cloud services and AI, this hidden data layer introduces new risks around student privacy and compliance.

According to Kathe Pelletier, senior director of community programs at EDUCAUSE, the stakes are rising. “Effective data governance in higher education is becoming increasingly critical as rapid technological change, evolving institutional roles and expanding risk exposures reshape the data landscape,” she says.

What Is Shadow Data?

Shadow data emerges when faculty, staff or students create their own ways of storing and analyzing information, from downloading student records into spreadsheets to storing research data on personal drives or in unsanctioned cloud tools.

These work-arounds are usually driven by practical needs, such as speed and flexibility. But they also point to a disconnect between how institutional systems are designed and how people actually use data. When governance frameworks are unclear, inconsistent or difficult to follow, users are more likely to find their own solutions.

That disconnect is a growing concern. The “2025 EDUCAUSE Horizon Report: Data and Analytics Edition” describes a shift toward unified data models and integrated data ecosystems, yet notes that many institutions still lack the governance structures and data literacy needed to support that vision. This creates an environment in which shadow data can proliferate.

As data moves outside governed systems, institutions lose visibility and control, along with the ability to answer critical questions: Where is sensitive data stored? Who has access? How is it protected? Without clear answers, institutions increase their exposure to security incidents and compliance risks.

Where Shadow Data Lives

In higher education environments, shadow data often accumulates in familiar places:

  • Personal laptops and external drives
  • Departmental shared folders
  • Unsanctioned cloud storage platforms
  • Research data sets stored outside institutional repositories
  • Student data exports used for analysis or reporting

Faculty may download class rosters to manage grades offline. Researchers may store data locally for faster processing. Administrative staff may export student information to spreadsheets for streamlined reporting. While these shortcuts can improve productivity in the moment, they fragment the institution’s data ecosystem.
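One way to picture how an exported spreadsheet might be recognized as student data is to check its column headers against a list of sensitive field names. The Python sketch below is illustrative only: the `STUDENT_FIELD_HINTS` list and the function names are hypothetical, and a real institution would draw its field list from its own data dictionary.

```python
import csv
import io

# Hypothetical column names that suggest student education records.
# A real institution would derive this list from its own data dictionary.
STUDENT_FIELD_HINTS = {"student_id", "ssn", "gpa", "grade", "date_of_birth"}


def normalize(header: str) -> str:
    """Lowercase a header and replace spaces and hyphens with underscores."""
    return header.strip().lower().replace(" ", "_").replace("-", "_")


def sensitive_columns(csv_text: str) -> list[str]:
    """Return the header names in a CSV export that match a known hint."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, [])  # an empty file yields no headers
    return [h for h in headers if normalize(h) in STUDENT_FIELD_HINTS]


sample = "Student ID,Name,GPA,Advisor\n1001,Jane Doe,3.7,Dr. Lee\n"
print(sensitive_columns(sample))  # ['Student ID', 'GPA']
```

A header check like this is cheap but shallow. It misses files with renamed or missing headers, which is why it would typically be paired with content-level scanning.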

That behavior has broader consequences. “Data is most valuable when access is coordinated, shared and supported by unified systems rather than fragmented across siloed units,” Pelletier says.

But shadow data does the opposite. It creates silos that weaken data quality, limit collaboration and increase risk.

From Shadow IT to Shadow Data: Why the Problem Has Evolved

These examples point to a broader shift. Higher education institutions have long dealt with shadow IT: unsanctioned apps and systems used by faculty and staff. But the problem has evolved.

Today, even within approved tools, users can extract, copy and share data in ways that bypass governance. A data set may start in a secure system but quickly move into spreadsheets, personal devices or external platforms, creating new layers of risk.

The rise of cloud services, application programming interfaces and AI tools has accelerated this dynamic. Data is easier than ever to move and duplicate. While that flexibility supports innovation, it also increases the likelihood of data sprawl.

Pelletier says growing demand for access is a key driver. “With the accelerating pace of AI innovations and growing data needs — such as staff and student access to APIs for workflow automation — institutions must establish clear and consistent approaches to organizing and accessing their data,” she says.

In other words, the challenge is no longer just controlling systems. It is managing how data flows across them.

The FERPA Problem: How Unsanctioned Data Creates Compliance Exposure

The Family Educational Rights and Privacy Act governs how institutions handle student education records, requiring that personally identifiable information be protected and shared only under specific conditions.

FERPA compliance becomes harder to meet when data moves beyond approved systems. When faculty or staff download student information into spreadsheets or store it on personal or unsanctioned platforms, it may no longer be protected by institutional safeguards, such as encryption.

This is where shadow data becomes a compliance risk. A spreadsheet containing student records stored on an unsecured device or shared improperly can expose sensitive information and potentially violate FERPA.

Just as important, responsibility does not shift when data leaves institutional systems. Under FERPA, institutions remain accountable for safeguarding education records and controlling access to student data, even when it is stored or processed outside core environments.

That is why governance is critical. “Mature data governance and management practices are essential,” Pelletier says. “They safeguard data quality, security and compliance across the institution.” Without those practices, institutions face “greater security and privacy risks, inconsistent data definitions and misguided planning.”

Tools and Techniques for Finding Shadow Data

The first challenge with shadow data is visibility. You cannot govern what you cannot see. So, IT leaders are increasingly turning to tools that can help identify where sensitive data resides across the environment, including:

  • Data discovery and classification tools
  • Data loss prevention solutions
  • Cloud access security brokers
  • Endpoint detection and response platforms

Security providers such as Palo Alto Networks and Cisco offer tools designed to help institutions find and track sensitive data.

These tools can scan files for sensitive information, such as student records; show who is accessing that data; and flag unusual activity, such as large downloads or unauthorized sharing. For example, Palo Alto Networks explains that data loss prevention tools can identify where regulated data is stored and how it is being used, helping organizations reduce risk.
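To make the scanning and flagging ideas concrete, here is a minimal Python sketch of both. It is not how any vendor's product works. The student ID format (the letter S followed by eight digits), the event tuples and the byte threshold are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Pattern for U.S. Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Hypothetical student ID format: the letter S followed by 8 digits.
STUDENT_ID_PATTERN = re.compile(r"\bS\d{8}\b")


def classify_text(text: str) -> dict[str, int]:
    """Count matches of each sensitive pattern in a block of text."""
    return {
        "ssn": len(SSN_PATTERN.findall(text)),
        "student_id": len(STUDENT_ID_PATTERN.findall(text)),
    }


def flag_large_downloads(events, threshold_bytes):
    """Return users whose total downloaded bytes exceed the threshold.

    `events` is an iterable of (user, bytes_downloaded) pairs, a
    simplified stand-in for real access logs.
    """
    totals = defaultdict(int)
    for user, size in events:
        totals[user] += size
    return sorted(u for u, total in totals.items() if total > threshold_bytes)


text = "Roster: S12345678, SSN 123-45-6789; S87654321"
print(classify_text(text))  # {'ssn': 1, 'student_id': 2}

events = [("alice", 400), ("bob", 50), ("alice", 700)]
print(flag_large_downloads(events, threshold_bytes=1000))  # ['alice']
```

Pattern matching of this kind produces false positives and misses unformatted identifiers, so commercial tools generally layer on validation, context and classification rules that this sketch omits.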

But while technology can unearth shadow data, that’s just part of the solution. Governance determines how it should be managed.

Building Data Governance Policies That Actually Get Followed

Many institutions have data governance policies, but not all of them are effective. For governance to work, it must align with how people actually use data.

Pelletier points to several priorities for building sustainable governance frameworks:

  • Establishing clear policies for AI and data use
  • Defining data ownership and stewardship
  • Addressing ethical and privacy considerations
  • Managing both structured and unstructured data
  • Investing in centralized data leadership and cross-functional teams

These efforts help create a governance framework that can scale with institutional needs.

“Strategic investments in cloud infrastructure, data integration and cross-functional analytics teams” are also essential for building scalable data ecosystems, she says.

Usability is equally important. If governance policies are too restrictive, users will find work-arounds, which only create more shadow data. Effective governance balances security with accessibility, enabling users to do their work while maintaining oversight.

User Education: Changing Behavior Without Blocking Legitimate Workflows

Ultimately, shadow data is as much a human issue as a technical one.

Faculty, researchers and staff are not intentionally creating risk. They are trying to solve problems quickly and efficiently. Governance strategies must recognize this reality.

Education plays a central role. Institutions need to help users understand what constitutes sensitive data, why certain storage and sharing practices are risky, how to use approved tools effectively and where to go for support.

At the same time, institutions must provide tools that meet users’ needs. If approved systems are difficult to use or lack necessary functionality, shadow data will continue to grow.

When done well, Pelletier says, governance “enhances institutional trust, strengthens accountability and enables progress on priorities, such as student success and long-term competitiveness.”

Bringing Shadow Data Into the Light

Shadow data is not a new problem, but it is becoming more urgent as higher education institutions expand their digital ecosystems.

Without governance, data becomes fragmented, insecure and difficult to use. With it, institutions can unlock the full value of their data while protecting privacy and maintaining compliance.

The goal is not to eliminate flexibility. It’s to bring visibility and structure to how data is used — because in an era defined by data, what institutions don’t see can hurt them.
