The Data Science Hub will use the following strategies to advance fundamental and applied research, services, and outreach on the University of Wisconsin-Madison campus:

  • Build on existing successes from WID themes (especially the Optimization theme), the SILO (Systems, Information, Learning, and Optimization) seminar series and workshops, Core Computational Infrastructure, and successful collaborations in the Biometry Consulting Facility and Biostatistics & Medical Informatics.
  • Organize data science activities around three complementary areas:
    › Mathematical foundations of data science: modeling, algorithms, optimization, machine learning, computational statistics.
    › Systems aspects of data science: database systems, data cleaning, data management, data integration, data visualization, computational technology.
    › Collaborations with domain researchers across campus, including health sciences, energy, agriculture and environmental sciences, education and social sciences.
  • Collaborate with others at UW to develop and provide the underlying data science infrastructure and tools that scientists at an R1 research institution need. The DSHub will foster development of stable software systems that make state-of-the-art data science tools and methods easy for practitioners to use. Such tools are critical for moving our research into practice. The aim is to obtain external funding for one major center in at least one key aspect of data sciences. For example, a team of 14 has recently submitted a proposal to NSF’s new TRIPODS program for an “Institute for the Foundations of Data Science.” We will pursue other similar opportunities as appropriate.
  • Provide a forum to advertise the broad educational activities in data sciences across campus. Collaborate with other UW faculty and UW Departments to develop Data Science education resources for the campus community. Extend some data science courses from different departments, using the DSHub in the design and development of these courses. Engage in smaller group-defined teaching activities, such as the NSF NRT LUCID, to provide additional mechanisms for education.
  • Train individuals who work with big data, which is crucial for future success. The big data landscape is changing rapidly, requiring individuals to develop many competencies in tool use and in communicating ideas and results. People need training in how to work effectively in teams, using reproducible research principles to share emerging approaches. Project leaders need to learn how to build and evolve teams that adapt to changing needs. Teams that work with big data often need to learn how to maintain data confidentiality. Such training can be leveraged across research, teaching and outreach.
  • Develop campus-level consulting access to foster cross-disciplinary collaboration in research and teaching. This effort will build on the successful models of the Biometry Consulting Facility and BMI-related facilities, including the Cancer ISR, CPCP and the Bioinformatics RC, extending them into a more general facility serving all of campus.
  • Expand current industrial partnerships, such as the Optimization Research Consortium and the SILO Seminar sponsorship, to include a broader range of Data Science partnerships, and hold an annual Data Science Research Consortium Day at the WID.
  • Organize WID Public Lectures in Data Sciences, inviting high-profile external speakers including renowned researchers and senior figures in the major data companies.
  • Establish visitor programs in WID, including a visiting professorship in data science (usually to be held by a distinguished colleague on sabbatical) and one-year PhD student exchange programs with targeted institutions. These programs will promote new interactions and expertise beyond our group.
  • Facilitate joint graduate student recruiting in data science. Interested students enter through CS, ECE, Statistics, Mathematics, Information and other programs. We could arrange for all such students to visit on common dates, probably overlapping with the CS visit weekend, for discussions in WID with faculty and students in data sciences.