What Are Data Warehouses, and How Are They Used?
A data warehouse might sound like another name for a database, and in a certain sense, it is. Both data warehouses and databases store data and return it in response to queries. However, databases are unitaskers, each typically built to serve a single application, while data warehouses can answer complex queries using data from many sources.
A data warehouse combines data from several data sets, focusing on the specific information a college or university is interested in. Queries are written to extract the data from the warehouse’s customized data set. University IT staff can run those queries against the data warehouse to generate analytics reports.
For example, if you only want to know Sue Smith’s grade on her biology final last semester, a database will suffice. However, if you want to graph trends in all biology students’ course grades over the past decade, correlated with curriculum changes and student majors, a data warehouse might be a more suitable solution.
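The difference between those two questions can be sketched in a few lines of Python. This is only an illustration: it uses an in-memory SQLite database as a stand-in for both systems, and every table name, column name and row in it is invented for the example. The first function is the single-record lookup a database handles well; the second combines grades, majors and curriculum versions into one trend report, the kind of query a warehouse is built for.

```python
import sqlite3

# Illustrative stand-in only: all tables, columns, and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE grades (student TEXT, course TEXT, year INTEGER, grade REAL);
    CREATE TABLE majors (student TEXT, major TEXT);
    CREATE TABLE curriculum (year INTEGER, version TEXT);
    INSERT INTO grades VALUES
        ('Sue Smith', 'Biology', 2023, 91.0),
        ('Al Jones',  'Biology', 2023, 79.0),
        ('Sue Smith', 'Biology', 2022, 85.0),
        ('Al Jones',  'Biology', 2022, 75.0);
    INSERT INTO majors VALUES ('Sue Smith', 'Biology'), ('Al Jones', 'Chemistry');
    INSERT INTO curriculum VALUES (2022, 'old'), (2023, 'revised');
""")


def sue_grade(conn):
    """Database-style question: one student's grade in one course and year."""
    row = conn.execute(
        "SELECT grade FROM grades WHERE student = ? AND course = ? AND year = ?",
        ("Sue Smith", "Biology", 2023),
    ).fetchone()
    return row[0]


def grade_trends(conn):
    """Warehouse-style question: average grade per year, curriculum, and major."""
    return conn.execute("""
        SELECT g.year, c.version, m.major, AVG(g.grade)
        FROM grades g
        JOIN majors m ON m.student = g.student
        JOIN curriculum c ON c.year = g.year
        WHERE g.course = 'Biology'
        GROUP BY g.year, c.version, m.major
        ORDER BY g.year, m.major
    """).fetchall()


if __name__ == "__main__":
    print(sue_grade(conn))
    for row in grade_trends(conn):
        print(row)
```

In a real deployment the three tables would likely come from separate campus systems and be loaded into the warehouse ahead of time; here they share one database purely to keep the sketch short.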
What Are Data Lakes, and How Are They Used?
Data warehouses map data into a predefined structure before it can be queried, but data lakes are more flexible. A data lake collects all types of data as-is and imposes no structure until a query runs.
This approach to data structure makes a data lake ideal for housing vast amounts of data on cheap storage. The trade-off is in data access: Queriers need to know how to access the specific bits of information they are seeking. This puts data lakes in the realm of data scientists, specialists who study data, look for trends or train machine learning models.
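The contrast between structuring data before storage and structuring it at query time can be illustrated with a small Python sketch. Everything here is hypothetical: the raw records, the field names and the three-column "warehouse" layout are invented for the example, and real warehouse and data lake products work at a very different scale. The point is only that the warehouse loader rejects anything that does not fit its columns, while the lake keeps every record and leaves the interpretation to whoever writes the query.

```python
import json

# Hypothetical raw records, kept exactly as they arrived (data lake style).
RAW_LAKE = [
    '{"student": "Sue Smith", "course": "Biology", "grade": 91}',
    '{"student": "Al Jones", "course": "Biology", "score": "79"}',
    '{"event": "login", "user": "ssmith"}',
]

# Assumed predefined structure for the warehouse table.
WAREHOUSE_COLUMNS = ("student", "course", "grade")


def load_into_warehouse(raw_records):
    """Structure first: keep only records that fit the predefined columns."""
    rows = []
    for line in raw_records:
        record = json.loads(line)
        if all(col in record for col in WAREHOUSE_COLUMNS):
            rows.append(tuple(record[col] for col in WAREHOUSE_COLUMNS))
    return rows


def query_lake_for_grades(raw_records):
    """Structure at query time: the query decides how to read each record."""
    grades = []
    for line in raw_records:
        record = json.loads(line)
        # The querier has to know that some sources call the field "score".
        value = record.get("grade", record.get("score"))
        if value is not None:
            grades.append(float(value))
    return grades


if __name__ == "__main__":
    print(load_into_warehouse(RAW_LAKE))
    print(query_lake_for_grades(RAW_LAKE))
```

Note that the lake query finds a grade the warehouse loader dropped, but only because its author knew about the alternate field name. That kind of knowledge is the access cost described above.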
Software can also ease the complexity of querying multiple, disparate data sets. If you need to analyze massive amounts of data from many sources, a data lake might be appropriate for your university.