A bi-monthly newsletter with updates on data and computing news and events for UW-Madison researchers.

In the Jan. 23, 2019 update:

Are you…

  • unsure which campus data and computing resources you need for your research?
  • interested in making connections and starting new collaborations with data scientists and other researchers on campus?
  • looking for training in data and computing skills?

The Data Science Hub can help! Send an email to the Data Science facilitator (facilitator@datascience.wisc.edu) or come by Hub Central in the Discovery Building during office hours (W 9:30-11:30am, Th 3:00-5:00pm). Tomorrow’s (Jan 24th) office hours will feature representatives from CHTC, RDS, SSCC and more! Stop by to ask the experts your research computing and data science questions. Check the calendar for the latest details and updates.

Enabling Public-Private Collaboration with Semi-Synthetic Datasets by Bill Howe

On Thursday, February 7, Dr. Bill Howe will give the Weston Roundtable Lecture at 4:15 pm in 1163 Mechanical Engineering. His talk is titled Beyond Open vs. Closed: Enabling Public-Private Collaboration with Semi-Synthetic Datasets. His team is developing an integrated legal-technical data collaborative designed to balance competing public and proprietary interests. These datasets are intended to be shared with academic and private collaborators to experiment with advanced analytics without incurring significant legal risk, and to focus attention on pressing problems in housing, education, and mobility.

There is a special opportunity for graduate students to meet with Dr. Howe at 9:30 am on the same day in the Orchard View Room in the Discovery Building. Interested students should contact Whitney Sweeney (wasweeney@wisc.edu) to let her know that they would like to attend. A light breakfast will be provided.

Upcoming Campus Events (Calendar View)

Center for Demography of Health and Aging (CDHA) Training Seminar: 2:00pm-3:15pm, 8417 Sewell Social Science Building
Jan 23, Introducing the UW Survey Center, Nora Cate Schaeffer
Jan 30, Media Relations and Communications Strategies, Veronica Rueckert and Eric Hamilton
Feb 6, The Academic Job Market: What Do Hiring Committees Look For? Jason Fletcher and Christine Schwartz

Computation and Informatics in Biology and Medicine (CIBM) Seminar: 4:00pm, 1360 Biotechnology Center
Jan 22, Choosing Parameters for Useful Pathways in Biological Pathway Prediction, Chris Magnano and Human-Computer Interaction Challenges in Machine Learning Clinical Decision Support, Kendall Park
Jan 29, Sparse Recovery Techniques In Metagenomics, Simon Foucart, Texas A&M University (Auditorium)

Systems, Information, Learning, and Optimization (SILO) Seminar: 12:30pm, Orchard View Room, Discovery Building
Jan 23, Optimal Recovery under Approximability Models, with Applications, Simon Foucart, Texas A&M University

Computer Science Seminar: 4:00pm-5:00pm, 1240 Computer Sciences
Jan 24, What’s So Hard About Natural Language Understanding? Alan Ritter, Assistant Professor, Department of Computer Science, Ohio State University

Applied and Computational Mathematics Seminar (ACMS): 2:25pm, 901 Van Vleck Hall
Jan 25, Machine Teaching: Optimal Control of Machine Learning, Jerry Zhu
Feb 1, TBA, Chung-Nan Tzou

Biostatistics & Medical Informatics (BMI) Seminar: 12:00pm, Biotechnology Center Auditorium
Jan 25, Rapid Acceleration of the Permutation Test via Slow Random Walks in the Permutation Group, Moo Chung

Statistics Seminar: 4:00pm, 133 Service Memorial Institute (SMI)
Jan 25, Identifiability of Nonparametric Mixture Models, Clustering, and Semi-supervised Learning, Nikhyl Bryon Aragam, Carnegie Mellon University
Jan 28, Joint Analysis of H&E Stained Images and Genetic Covariates Using Convolutional Neural Networks and AJIVE, Ian Carmichael, University of North Carolina, Chapel Hill

The Wisconsin Association for Computing Machinery – Women in Computing, Feb 5, 12:15pm-1:15pm, Computer Sciences 2310

Upcoming Trainings and Workshops

Molecular Modeling and Drug Docking

This is a hands-on, 8-week course on molecular modeling with an emphasis on drug design. It starts on February 4 and is held in the CALS computer lab in the Animal Sciences Building. It is open to any grad student or staff member of the UW. The course will cover the basics of protein and small molecule modeling using the commercial software Sybyl from Tripos. Then several docking programs, such as SurFlex, DOCK, and Autodock4, will be examined with real examples from the literature. Each student will run Sybyl and Autodock on an iMac in the CALS computer lab (calslab.cals.wisc.edu/). The cost of the course is $300, usually paid by the professor. There is NO UW credit for this course. The course has been taught since 2003 by Dr. Ken Satyshur of the SMSF, who has more than 30 years of experience with molecular modeling. To sign up, contact Ken Satyshur (satyshur@wisc.edu), http://hts.wisc.edu.

Stata for Researchers

Stata is the most popular statistical software at the SSCC, as it is both very powerful and relatively easy to learn. This class will teach you the fundamentals of Stata and give you a strong foundation you can build on to become an expert Stata user. You do not need any experience with Stata to benefit from this workshop, but people who learned how to run a few Stata commands for a class or who figured out some things on their own will benefit from its broader and more rigorous approach. The material covered is also available in the SSCC Knowledge Base under Stata for Researchers. The class dates are 1/30, 2/6, and 2/13 (9:00 – 11:30 am). Note that this class is a series and you should plan on attending all of the sessions. If you are interested, you can register here for the first session and here for the second session.

Data Wrangling Sessions in R

This course teaches data preparation skills using the data wrangling tools of the tidyverse. The tidyverse is a collection of R packages that are designed to make it easier to work with data. The ggplot2 package is just one example of the highly regarded tools in the tidyverse. This course will cover importing data, cleaning data, creating and transforming variables, merging data, and plotting. It is a hands-on class with time devoted to practicing using these tools to ready data for analysis. The course dates are 1/29, 1/31, 2/5, 2/7, 2/12, and 2/14 (9:00 – 10:50). Note that this class is a series and you should plan on attending all of the sessions. If you are interested, you can register here.

R workshops for Researchers

UW-Madison libraries are offering workshops on R programming for researchers. The intended audience is anyone at the university who is working with tabular research data (including graduate students, faculty, research staff, and undergraduate researchers) and would like to learn how to automate data processing using the R programming language. The content is based heavily on the R Ecology Data Carpentry content, but will cover useful skills for anyone working with tabular data. Later sessions go beyond the Data Carpentry lessons to cover how to use git version control within RStudio and how to write reproducible reports using R Markdown. There are no prerequisites for the R basics session, though those with some experience working with tabular data will get the most out of the session. All other sessions require the R basics session or some experience using R (i.e., how to use the assignment operator and functions). Please email tobin.magle@wisc.edu if you have questions. Find out more and register for sessions here.

Campus Opportunities and Groups

Digital Scholarship & Publishing Office Hours

Do you have a publication or copyright question? Do you have questions about a digital humanities tool you’ve seen, or about a new project that you want to start with good data management? If so, consider dropping by the weekly Digital Scholarship & Publishing Office Hours, Thursdays, 11:30 am – 1:30 pm! Experts can provide assistance with (but not limited to!) the following:

  • Publishing methods, platforms
  • Copyright, author’s rights, fair use
  • Digital humanities projects
  • Selecting or applying tools, platforms
  • Developing and planning digital projects
  • Data management and sharing

The Kickoff will be January 31 in Memorial Library Reference (2nd floor). After that, office hours run weekly from January 31st until May 2nd from 11:30AM – 1:30PM and rotate between Memorial and Steenbock libraries. Office hours are open to anyone in the UW-Madison campus community including all students, staff, and faculty.

Computational Biology, Ecology, and Evolution (ComBEE)
ComBEE is a group of researchers at UW-Madison interested in computational biology in ecology and evolution. ComBEE offers R and Python study groups on alternating Thursdays throughout the semester.
ComBEE Python Study Group – Meets every other Thursday at 2pm in Microbial Sciences 5503
ComBEE R Study Group – Meets every other Thursday at 2pm in Microbial Sciences 5503

The ComBEE semester kick-off social is Friday, Jan 25 at 4pm in Microbial Sciences Building Room 6201

Molecular Dynamics Group
Molecular dynamics (MD) simulations can provide a computational microscope for looking at molecular events. However, the art of setting up, running, and interpreting a simulation is challenging. To help, campus MD users and potential users are getting together to share experiences, tools, and codes. Importantly, the group will also discuss best practices, appropriate/inappropriate uses, and how best to use local computer resources. Contact Spencer Ericksen (ssericksen@wisc.edu) if you are interested in joining the group.

Research Systems Administrators Group (RSAG)
This ACI-sponsored group meets on the third Wednesday of every month and allows systems administrators of research systems to share expertise. To receive updates and notices of future meetings, join the email list by sending an email message to join-rsag@lists.wisc.edu.

External Opportunities

Summer Internships at RStudio

The goal of this program is to enable RStudio employees to collaborate with students to do work that will help both RStudio users and the broader R community, and help ensure that the community of R developers is as diverse as its community of users. Over the course of the internship, you will work with experienced data scientists, software developers, and educators to create and share new tools and ideas. The internship pays approximately $12,000 USD (paid hourly), lasts up to 10-12 weeks, and will start around June 1 (depending on your availability). Applications are open now and close at the end of February. To qualify, you must currently be a student and have some experience writing code in R and using Git and GitHub.

Summer Institute in Computational Social Science

The 2019 Summer Institute in Computational Social Science will be June 16 – 29 at Princeton University. The purpose of the Summer Institute is to bring together graduate students, postdoctoral researchers, and beginning faculty interested in computational social science. The Summer Institute is for both social scientists (broadly conceived) and data scientists (broadly conceived). The co-organizers and principal faculty of the Summer Institute are Christopher Bail and Matthew Salganik. Dr. Salganik is the author of Bit by Bit: Social Research in the Digital Age. You can learn more on their website or their Twitter feed. Applications are due February 20, 2019.

Analytics Intern

The City of Madison Finance Department is seeking qualified applicants for a part-time analytics internship to begin next semester. The position is available for up to 20 hours per week and pays $18.65/hour. There is no specific end date to this internship. Applications are due by 1/31/2019.

Check the calendar for the latest details and updates for all listed events. If you have a relevant event or group you’d like to see included in next month’s newsletter, please send us an email at facilitator@datascience.wisc.edu.

If you were forwarded this email and would like to receive these emails regularly, please sign up at this link.