Data science has transformed fields from biology to astronomy and from social networks to politics, influencing most aspects of modern life.
To transform data science, Stephen Wright, a professor of computer sciences at the University of Wisconsin–Madison, wants to return to the fundamentals.
Wright is leading a group of 14 researchers from the departments of mathematics, statistics and computer sciences in creating the interdisciplinary Institute for Foundations of Data Science. The new institute, housed at UW–Madison’s Wisconsin Institute for Discovery (WID), will play a key role in the future of data science, developing fundamental techniques for handling increasingly massive data sets in shorter times.
The IFDS is supported by a $1.5 million grant from the National Science Foundation’s Transdisciplinary Research in Principles of Data Science (TRIPODS) initiative.
The widespread applications of data science depend on fundamental mathematical and statistical tools for acquiring, handling and analyzing mountains of data. Only with these theoretical tools in place can meaning be extracted from such large data sets.
“Underlying all of what goes on in data science are these algorithms and formulations and models,” says Wright. “These theoretical foundations are what makes the data revolution possible.”
Recognizing the need to develop the foundations of data science, NSF is providing $17.7 million for 12 TRIPODS projects at 14 institutions, including UW–Madison. The awards are part of Harnessing the Data Revolution, the first of 10 “big ideas” the agency deems essential for future investment.
“Data is accelerating the pace of scientific discovery and innovation,” says Jim Kurose, NSF assistant director for Computer and Information Science and Engineering. “These new TRIPODS projects will help build the theoretical foundations of data science that will enable continued data-driven discovery and breakthroughs across all fields of science and engineering.”
The UW–Madison proposal is also one of three selected for co-funding from NSF’s new interdisciplinary Convergence program.
“Convergence is a deeper, more intentional approach to the integration of knowledge, techniques and expertise from multiple disciplines in order to address the most compelling scientific and societal challenges,” says France Córdova, director of NSF.
“WID is the ideal home for this project because it is an interdisciplinary proposal that brings together people in many different departments, but of course faculty and students in the departments will all be involved,” says Wright.
The IFDS will be housed at WID as part of its new Data Science Hub, where it will contribute to the hub’s applied and systems-oriented components.
WID’s home in the Discovery Building provides facilities for the workshops, distinguished lecture series, seminars and collaborative research that make up the IFDS project.
The research themes of the new institute will be algebra and optimization in data science (led by electrical and computer engineering Professor Robert Nowak), graphs and networks in data science (led by statistics Professor Michael Newton and mathematics Professor Sébastien Roch), and data acquisition theory and methods (led by electrical and computer engineering Professor Rebecca Willett).
The institute will not be restricted to the 14 UW–Madison faculty involved in the proposal. “They will be the nucleus, but we will be collaborating with others around the UW campus and in other institutions as well,” says Wright. Included in the institute’s charge is a plan to enrich graduate programs in data science and foster outreach to industrial partners with interests in fundamental data science research.
The IFDS has the opportunity to evolve into a larger, Phase II institute supported by NSF, a goal foremost in Wright’s mind as he and his colleagues rally around research activity that they see as critical to the future of data science.
“You can’t have developments in data science without having continuing attention to the fundamentals,” says Wright. “There is a lot of excitement about the algorithmic space, the theoretical space and the mathematical underpinnings, and that’s going to drive a lot of the future of data science.”