How Will Higher Ed Meet the Demand for Data Scientists?
Data, more than ever before, is the lifeblood of every organization. From media companies to retail stores, data allows organizations to differentiate themselves from the competition. Whether used in market research or in cost reduction efforts, organizations must leverage data to be competitive, and for the most part, they now enjoy access to all the data anyone could ever need.
What most organizations today don't have ready access to are data scientists trained to turn all of that data into actionable insights. Gartner predicts that by 2015, about 4.4 million IT jobs supporting Big Data will be available worldwide, and only a third of them will be filled. Is there a shortage of people with the requisite combination of programming and data management skills and statistical and mathematical ability? Yes. Can we in academia develop more of those people? Absolutely. But we are going to need the help of industry to accomplish that.
As universities nationwide define and create new undergraduate and graduate data science programs, industry can help make those programs more interesting and relevant for students. Academia will also benefit from greater access to industry data sets and the latest Big Data technologies.
Data science, by its nature, is abstract, making it difficult to attract initial student interest. But in the real world, data science is applied in many concrete, exciting ways that we can bring into our classrooms.
The perceived gap between academia and industry is why I took a sabbatical this past year to work with Avalon Consulting LLC, a Big Data consultancy headquartered near Dallas. The experience proved invaluable, and most, if not all, of what I learned came through hands-on experience in developing solutions. To make data science concepts more interesting and tangible for students, we should provide them with similar hands-on opportunities. To that end, companies should organize contests, similar to the Netflix Prize, that let students work on actual problems with real industry data sets. My university offers students a capstone project that challenges them to solve real-world problems provided through organizations such as Avalon and the U.S. Armed Forces. Such partnerships are a good start, but to educate data scientists, we need to do more.
Work Remains
Academia needs to stay up to date with the latest Big Data tool chains. These ecosystems are evolving rapidly, but most of the development happens on the industry side, so much so that academia often finds itself left behind. If we want our students to be aware of the latest changes when they graduate, we must foster a better exchange of technological knowledge between academia and industry.
Industry typically uses summer internships to expose students to the latest technologies. Such experiences could be extended to faculty as well. My colleagues at Rose-Hulman have used industry experiences in the past to stay current on technologies such as Google web frameworks, Android development and various programming languages.
Longer-term experiences, such as my training sabbatical, should also be considered. My experience led directly to the development of new courses in Hadoop and modern database paradigms. The time I spent working in the ever-changing world of Hadoop and NoSQL databases at Avalon was critical to developing those courses. Given the rapid pace at which Big Data technologies are now advancing, it is imperative that academia and industry find additional ways to collaborate to better prepare our students for the workplace.
Once our students enjoy greater access to industry data sets and the latest Big Data technologies, we can bring more real-world problems, solutions and stories into the classroom and generate greater interest in data science. The future for these students is bright, but meeting the demand for data scientists will require collaboration with industry.