TRIPODS Data Science Boot Camp

June 8-12, 11:00am-12:30pm EDT

The DATA-INSPIRE TRIPODS Institute, in collaboration with the DIMACS REU program, is organizing a data science boot camp to train students in data science concepts. The instructor will be Prof. Matthew Stone.

In this series of interactive workshops, participants will experiment with Python notebooks for data analysis. The focus is on using notebooks to visualize, analyze, and understand data sets for research. We will study diverse examples of visualization techniques and introduce the computational, mathematical, and statistical principles that underpin effective data visualizations. The work will stay within the general domain of text data but will cover a number of different genres, including tweets, reviews, news, and crowd-sourced language.

Agenda

Monday Jun 8.
– Designing visualizations to explore data sets.
– Mapping geolocated Twitter data.
– Recording: [Watch] [Download] (Password: 2TrvKvHy)

Tuesday Jun 9.
– Visualizing statistical evidence in data sets.
– Linguistic differences among communities and authors.
– Recording: [Watch] [Download] (Password: Pn2uJhAU)

Wednesday Jun 10.
– Visualizing and understanding data distributions.
– Topic analysis of news articles.
– Recording: [Watch] [Download] (Password: vQr6sVnm)

Thursday Jun 11.
– Visualizing and understanding model fits.
– Modeling the meanings of English color descriptions.
– Recording: [Watch] [Download] (Password: Yx2nWCf5)

Friday Jun 12.
– Making and understanding predictions.
– Linear text classifiers for review sentiment analysis.
– Recording: [Watch] [Download] (Password: nSHvMtY2)

The goal of the boot camp is for these examples to provide ideas, programming models, and design patterns that support the students’ own data-driven research.

Registration

Tutoring will take place online via Webex meetings, and we will use Google Colab to explore data interactively.

To register for the Webex meetings, go to: https://rutgers.webex.com/rutgers/k2/j.php?MTID=t0f79cb4cb58376c9913ee6d695dab5e1
Approval is automatic, and you can use any email address you like. Registration is necessary so that Webex can show your name to the group and assign you to breakout sessions. Once you register, you will get an email with specific instructions on how to join the meeting. We recommend closing any applications and tabs you don’t need so that you don’t accidentally share personal information when you lead group work in a breakout session.

Google Colab integrates with Google Drive, which offers an easy way to share notebooks, data sets, and other resources for the class.
Here is the link that you can use to obtain the materials: https://go.rutgers.edu/duxwz6f3
You’ll want to copy this directory into your own Google Drive; you’ll then be able to access all the files from within Colab.
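
Once the directory is copied, one way to reach it from a Colab notebook is to mount your Drive and list what is there. The following is a minimal sketch, not the boot camp's official setup: the folder name "bootcamp" and the helper names mount_drive and list_notebooks are assumptions made for illustration, so substitute whatever name your copy of the directory has. It relies on the google.colab package, which is only available inside Colab, and returns None when run anywhere else.

```python
from pathlib import Path


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive in Colab and return the path to 'MyDrive'.

    Returns None when not running inside Colab (google.colab is unavailable).
    """
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return None
    drive.mount(mount_point)  # prompts for authorization in the notebook
    return Path(mount_point) / "MyDrive"


def list_notebooks(folder):
    """Return the sorted relative paths of all .ipynb files under folder."""
    root = Path(folder)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.ipynb"))


if __name__ == "__main__":
    my_drive = mount_drive()
    if my_drive is None:
        print("Not running in Colab; nothing to mount.")
    else:
        # "bootcamp" is a hypothetical folder name; use the name of your copy.
        for name in list_notebooks(my_drive / "bootcamp"):
            print(name)
```

If the listing comes back empty, check that the directory was copied into your own Drive rather than only opened from the shared link.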