Aug 15 2022

Data Warehouses vs. Data Lakes: How Can K–12 Schools Store Data?

Choosing the right data management solution comes with a long list of considerations for district IT teams, including where to start.

Every school district has data storage needs, and student records are just the beginning. IT systems demand storage as well: Internet logs, security events, building systems, security cameras and more all require storage.

Whatever option a district chooses, storing the data is the easy part of the data management problem. The challenge is making use of the data once you have it. Data warehouses and data lakes do more than store data for school districts; they help make that data useful.

There are a few key differences between data warehouses and data lakes. Here’s what K–12 IT leaders should consider when looking at how they might address their schools’ data storage needs.

What Are Data Warehouses, and How Can Districts Use Them?

A data warehouse may sound like another name for a database. In a certain sense, that’s true. Both databases and data warehouses store data and return it in response to queries. However, databases are unitaskers, each typically serving a single application, while data warehouses can answer complex queries using data from many sources.

A data warehouse combines data from several data sets, focusing on the specific information a school is interested in. Queries are written to extract the data from the warehouse’s customized data set. School staff can run those queries against the data warehouse to generate analytics reports.

For example, if all you want to know is Sue Smith’s grade on her earth science report last quarter, a database will suffice. If you want to graph the trends of all earth science students’ course grades over the past decade, correlated with curriculum changes, teachers and resident neighborhood, a data warehouse would be a more suitable solution.
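To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns and sample rows are invented for illustration and do not reflect any real student information system. The first query is the simple lookup a database handles well; the second is the kind of cross-source trend question a data warehouse is designed to answer.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grades (student TEXT, course TEXT, school_year INTEGER, grade REAL);
    CREATE TABLE curriculum (course TEXT, school_year INTEGER, version TEXT);
    INSERT INTO grades VALUES
        ('Sue Smith', 'Earth Science', 2021, 91.0),
        ('Al Jones',  'Earth Science', 2021, 83.0),
        ('Sue Smith', 'Earth Science', 2022, 88.0),
        ('Al Jones',  'Earth Science', 2022, 90.0);
    INSERT INTO curriculum VALUES
        ('Earth Science', 2021, 'v1'),
        ('Earth Science', 2022, 'v2');
""")

# Database-style lookup: one student, one record.
lookup = conn.execute(
    "SELECT grade FROM grades WHERE student = ? AND course = ? AND school_year = ?",
    ("Sue Smith", "Earth Science", 2022),
).fetchone()

# Warehouse-style question: average grade per year, joined to the
# curriculum version that was in use that year.
trend = conn.execute("""
    SELECT g.school_year, c.version, AVG(g.grade)
    FROM grades AS g
    JOIN curriculum AS c
      ON c.course = g.course AND c.school_year = g.school_year
    WHERE g.course = 'Earth Science'
    GROUP BY g.school_year, c.version
    ORDER BY g.school_year
""").fetchall()

print(lookup)
print(trend)
```

In a real district, the grades and curriculum data would likely come from separate systems, and combining them into one queryable data set is the work a warehouse does.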

What Are Data Lakes, and How Can Districts Use Them?

Data warehouses map data into a predefined structure before it’s queryable, but data lakes are more flexible. A data lake collects all types of data without imposing a structure until query time.
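The idea of applying structure only when a question is asked can be sketched in a few lines of Python. The records and field names below are hypothetical, and the list stands in for files sitting on inexpensive storage. Nothing forces the records to share a layout; the query function decides which fields matter when it runs.

```python
import json

# Hypothetical raw records, stored as-is in the "lake" with no shared schema.
raw_lake = [
    '{"source": "firewall", "event": "blocked", "building": "North High"}',
    '{"source": "camera", "clip_seconds": 42, "building": "North High"}',
    '{"source": "firewall", "event": "allowed", "building": "East Middle"}',
]

def blocked_events_by_building(lines):
    """Apply structure at query time: parse each record, keep only what fits."""
    counts = {}
    for line in lines:
        record = json.loads(line)
        # Records that lack the fields this query needs are simply skipped.
        if record.get("source") == "firewall" and record.get("event") == "blocked":
            building = record.get("building", "unknown")
            counts[building] = counts.get(building, 0) + 1
    return counts

result = blocked_events_by_building(raw_lake)
print(result)
```

The camera record is stored alongside the firewall records even though it has different fields, which is the flexibility a data lake offers. The cost is that whoever writes the query has to know what each record type looks like.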

This approach to data structure makes a data lake ideal for housing vast amounts of data on cheap storage. The tradeoff is in accessing the data. Anyone querying the lake needs to know how to locate and interpret the specific information they are seeking. This can put data lakes in the realm of data scientists: specialists who study data, look for trends or train machine learning models.

If your district doesn’t have a data scientist on staff, don’t fear. Software can ease the complexity of querying multiple, disparate data sets. If you need to analyze massive amounts of data from many sources, a data lake might be appropriate for your school.

Data Warehouses and Data Lakes to Consider

There are many products in the data warehouse and data lake space. Here are a couple to consider as you begin evaluating the broader market.

Public cloud provider Microsoft Azure offers several services that combine to form a data warehouse with rich analytics. Azure Synapse Analytics is the key offering here from Microsoft, connecting to several data sources, normalizing the data and then running queries. Effort is required to create a functioning solution, though. Microsoft offers the platform, but you’ll need to work with IT experts to build on it.

IT security provider Palo Alto Networks offers the Cortex Data Lake. Cortex is focused on IT security data. A school district would find Cortex useful for aggregating security events across the district into one data lake. Cortex uses artificial intelligence to analyze the data and uncover important security trends.

Each of these options is a platform upon which you can build your own solution, custom-tailored for your school. These platforms will require district investment before they can provide the desired insights.

Other data warehouses and data lakes are built for a specific use case. Because these solutions address a particular data challenge, they can provide value more quickly.

Should Your School District Use a Data Warehouse or Data Lake?

Maybe you’re not sure whether you should be shopping for a data warehouse or a data lake. Consider that data warehouses and data lakes are not mutually exclusive. Data warehouses can use data lakes as sources, working together to help you mine gems in your data you might not have known existed.
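As a rough sketch of how the two can work together, the Python example below loads raw records from a stand-in data lake into a structured warehouse table and then queries it. The attendance records, field names and table layout are assumptions made up for this illustration; a production pipeline would use a dedicated platform rather than an in-memory SQLite database.

```python
import json
import sqlite3

# Hypothetical raw events, as they might sit in a data lake.
lake_records = [
    '{"student_id": 1, "date": "2022-08-15", "status": "present"}',
    '{"student_id": 2, "date": "2022-08-15", "status": "absent"}',
    '{"student_id": 1, "date": "2022-08-16", "status": "absent", "note": "flu"}',
    '{"sensor": "hvac-3", "temp_f": 71}',
]

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE attendance (student_id INTEGER, date TEXT, status TEXT)"
)

# Load step: keep only records that fit the warehouse's predefined structure.
required = {"student_id", "date", "status"}
for line in lake_records:
    record = json.loads(line)
    if required <= record.keys():
        warehouse.execute(
            "INSERT INTO attendance VALUES (?, ?, ?)",
            (record["student_id"], record["date"], record["status"]),
        )

# Once loaded, staff can run ordinary structured queries.
absences = warehouse.execute(
    "SELECT student_id, COUNT(*) FROM attendance "
    "WHERE status = 'absent' GROUP BY student_id ORDER BY student_id"
).fetchall()
print(absences)
```

The building sensor reading stays in the lake untouched, available for a different analysis later, while the attendance records gain the structure that makes routine reporting straightforward.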

That’s the key takeaway: You’re building a data management solution, not merely selecting a data storage option. The solution will be built from all the IT components that allow you to uncover school insights.

