F# Promises a Big Return on Big Data
If the future of everything lies in big data, the future of widespread, practical deployment of big data most likely lies in the F# programming language.
Evelyne Viegas, director of semantic computing at Microsoft Research, says the strongly typed, functional language — pronounced “F sharp” — enables rapid prototyping and the simultaneous exploration of web-based and local data in one environment. Last week, Microsoft Research unveiled a new web-based platform, Try F#, to further the language’s use and open development.
“We are bathing in big data,” Viegas says. “People are talking about the tsunami of the data, the world of information out there. What F# 3.0 is really about is bringing the schema, the semantics of the data, to the fingertips of the programmers and the researchers.”
Developed in 2005 by Don Syme, principal researcher at Microsoft Research, Cambridge, F# is used for a wide range of software applications in quantitative finance, biotechnology, insurance, fraud detection, power control, data analytics, forensic software, online advertising, trading systems and quantum computing. It is most commonly used for analytical software components embedded in larger systems with user interfaces in C# or JavaScript.
Viegas says the latest Try F# platform will encourage innovation in the big data realm by bridging the gap between the developer community and “the people who are really, really knowledgeable about the data. It’s really about democratizing F# and bringing it to a broader reach of people, not just the developers, but people who are technology-savvy, who have some background in programming, who overall are really trying to solve real-world problems.”
Research incorporating the new Try F# platform is already under way at Rensselaer Polytechnic Institute in Troy, N.Y., where Professor James Hendler, an open-data proponent and an originator of the Semantic Web, is guiding undergraduate students through projects that apply F# to data integration across international data sets, drawing on giant sets of open government data.
The project has several goals, Hendler says, not the least of which is encouraging cooperation among young developers. The new Try F# platform allows users to post and share code, which Hendler says is “a very important part of teaching students how to work in teams and how to develop software together.”
“I want the students to learn that in modern programming, particularly the web environment, this kind of sharing and cooperation is very powerful and useful. So, being able to use libraries, to create libraries, is something I’m very big on. Supporting the community aspect is important.”
Students also need to learn how to interact with data, which is everywhere. And because F# is specifically designed to allow users to work more efficiently with data on a local computer as well as open, web-based data, Hendler says he is excited about its implications.
“The libraries that F# comes with give you some access to things that you’d have to write yourself in other environments,” he says. “It’s very useful when you want to bring data from different places together, because instead of trying to create a complex database back end, where you import lots of different things, you can actually reach out to data on the web and pull it together dynamically, build demos and things through the many visualizers and other tools provided through the system.”
Try F# also means that students — who like to work in different ways — aren’t required to bring whole computer environments to class.
“This allows them to develop the code how they want, and then share it through just a browser-based platform,” Hendler says. “It works through a number of different browsers and a number of different operating systems, so it’s much more open than many people expect.”
“With Try F#, what we really focused on is making it easy to learn,” Viegas says.
Microsoft Research worked with a community of researchers and developers to ensure the site included appropriate tutorials and features based on their feedback.
“We’ve tried to make it easy for people who just want to explore F#, just to learn the basic concepts of F#, up to more experienced developers,” Viegas says. “What is really neat about the language is that, with just a few lines of code, you can do a lot of rapid prototyping, rapid development, deployment — and now we’re bringing that to the world of big data.
“When you have data which is open out there, you’re going to have very smart people who are going to do all kinds of things with it,” Viegas says. “In my mind, it’s more like meeting the real world.”