Data-collecting technologies now reach nearly every aspect of life, from smart thermometers in homes to wireless sensor networks.
Every industry is seeking ways to collect data in order to discover actionable insights, and demand is growing for workers who know how to handle that data. But how do higher education institutions prepare the next generation of workers to take on that challenge?
The National Academies of Sciences, Engineering and Medicine sought to answer that question in its new report, “Training Students to Extract Value from Big Data: Summary of a Workshop.”
“Advances in technology have made it easier to assemble and access large amounts of data,” reads the report. “Now, a key challenge is to develop the experts needed to draw reliable inferences from all that information.”
The report summarizes a NASEM workshop where several thought leaders shared their ideas on training tomorrow’s data scientists. We’ve extracted four key takeaways:
1. Data Science Requires a Blend of Technical Skills
Guy Lebanon, director of AI and machine learning at Amazon, said students need to have skills in software engineering, machine learning and product sense to effectively analyze data, according to the report.
Lebanon noted that data scientists can use software engineering to build out tools and tests, and then use machine learning for optimization. With product sense, they can then establish a data-driven evaluation process to see whether they are meeting business or organizational goals.
At Southern Connecticut State University, students got hands-on experience using machine learning and product sense through an internship program. Students in the program used IBM Watson Analytics, a data analysis tool that uses elements of machine learning, to help a local business make better decisions.
2. Knowledge of the Data Exploration Process Boosts Critical Thinking
While students who want to work in data science need to know the technology, they will also need to harness critical thinking during the data analysis and exploration process.
Duncan Temple Lang, director of the Data Science Initiative at the University of California, Davis, noted in the workshop that prospective data scientists need “knowledge of randomness and uncertainty, statistical methods, programming and technology.”
Temple Lang also outlined 10 basic tenets of the data analysis and exploration process in the report, including:
Ask a question
Refine the question by identifying data
Transform data structures for analysis
Begin analysis and determine if results will scale
Reduce the dimensions of the data collected
Model and estimate data
Diagnose how well the model fits the data
Quantify uncertainty in results
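To make the steps above concrete, here is a minimal Python sketch that walks through them on a toy problem. It is an illustration only, not something taken from the report: the thermostat readings are synthetic, and the names `fit_line`, `r_squared` and `bootstrap_slope_interval` are hypothetical helpers written for this example. A simple straight-line fit and a bootstrap resample stand in for the modeling and uncertainty steps.

```python
import random
import statistics

# Ask a question: does the indoor temperature rise over the morning?
# Refine it by identifying data: synthetic, made-up thermostat readings.
rng = random.Random(42)
readings = [
    {"hour": h, "temp_c": 18 + 0.5 * h + rng.gauss(0, 0.3), "device": "demo"}
    for h in range(12)
]

# Transform the records into columns, and reduce dimensions by keeping
# only the two fields the question needs.
hours = [r["hour"] for r in readings]
temps = [r["temp_c"] for r in readings]


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = statistics.fmean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def bootstrap_slope_interval(xs, ys, rounds=500, seed=0):
    """Rough 95% percentile interval for the slope, by resampling pairs."""
    boot_rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(rounds):
        idx = [boot_rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # a line cannot be fitted to a single x value
        slopes.append(fit_line(bx, by)[0])
    slopes.sort()
    low = slopes[int(0.025 * len(slopes))]
    high = slopes[min(len(slopes) - 1, int(0.975 * len(slopes)))]
    return low, high


# Begin the analysis, then model and estimate. With 12 rows the
# "will it scale" check is trivial; on real data it would not be.
slope, intercept = fit_line(hours, temps)

# Diagnose how well the model fits the data.
fit_quality = r_squared(hours, temps, slope, intercept)

# Quantify uncertainty in the result.
interval = bootstrap_slope_interval(hours, temps)

print(f"slope: {slope:.2f} C per hour, R^2: {fit_quality:.2f}")
print(f"95% bootstrap interval for slope: {interval[0]:.2f} to {interval[1]:.2f}")
```

The point of the sketch is the order of operations rather than the specific methods: the question comes first, the data is trimmed to fit it, and the estimate is reported together with a check on fit and a range of uncertainty.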
3. College Data Science Curriculum Should Be Interdisciplinary
Students in many different disciplines could end up working in data science, but Peter Fox, an earth and environmental science professor at Rensselaer Polytechnic Institute, noted in the workshop that these students aren’t always taught about the technology of data collection.
Fox argued that data science programs must be interdisciplinary from the start, teaching technical skills such as programming alongside critical thinking skills. For scientists, Fox noted that data science should be “a skill in the same vein as laboratory skills,” according to the report.
4. Team Projects Foster Creativity
The report noted that, in addition to interdisciplinary work, it is important for students to work with peers from a variety of backgrounds.
Team projects that bring students from different programs together on a data problem allow for more creative thinking and more room for innovation. The report also noted that students who work with real-world data will gain valuable skills for the workplace.