Data Science Summer Hangouts Series 2023

RDDSX space outside the Collaborative Classroom
Van Pelt-Dietrich Library

DDDI's 2023 Summer Hangouts program offers students the opportunity to participate in informal, hands-on tutorials led by our team of DDDI postdoctoral research fellows. These tutorials are open to students from all backgrounds and skill levels and cover various data science methods and topics. Hangouts will be held once a week on Thursdays, from July 20th through August 17th. We'll also provide a pizza lunch between tutorial sessions. All tutorials will take place in the newly renovated RDDSX space, conveniently located near the Collaborative Classroom in Van Pelt-Dietrich Library.

Our Hangouts series this year will be a "choose your own adventure" through data science. Students are welcome to attend whichever tutorials best fit their time and interests; no prior registration is required.


Live Stream

Recordings

Watch recordings of the sessions


Schedule

Hangouts will run once a week on Thursdays from July 20th through August 17th. A pizza lunch will be served between tutorial sessions.

 

July 20

Time: 11:00 - 12:00
Instructor: Colin Twomey
Title: Coding for Data Science (slides and notebook)
Description: What's a programming language, and why use one for data science? What's a coding assistant, and what can (and can't) it do for you? In this session, we'll give a big picture introduction to programming for data science that should be accessible to everyone. No prior coding experience required.

Time: 12:00 - 12:30
Lunch

Time: 12:30 - 1:30
Instructor: Sarah Lee
Title: First Steps with R (notebook)
Description: In this tutorial, we'll walk through some practical steps for working with R for the first time: covering data organization in spreadsheets, importing data into R, inspecting data, data wrangling, and very basic visualization.

 

July 27

Time: 11:00 - 12:00
Instructor: Vlad Ayzenberg
Title: First Steps with Python (notebook)
Description: Similar to the "First Steps with R" tutorial last week, we'll take an introductory walk through Python for data science.

Time: 12:00 - 12:30
Lunch

Time: 12:30 - 1:30
Instructors: Sam Dillavou and Kieran Murphy
Title: Introduction to Machine Learning (notebook)
Description: Peeling away the recent buzz around AI, how do you make a machine learn? How is it different from how humans learn? We'll go over the basics with some simple machine learning models. Python will be used, but it is not necessary for understanding the core concepts.

 

August 3

Time: 11:00 - 12:00
Instructor: Jingye Yang
Title: Introduction to LLMs
Description: Ever wonder how ChatGPT seems to 'understand' what you're saying? This friendly talk will dive into the world of large language models. We'll break down how these AI brains learn to chat just like us, their cool uses (think helping with work or writing a novel), and the challenges we face with them, like making sure they're used responsibly.

Time: 12:00 - 12:30
Lunch

Time: 12:30 - 1:30
Instructor: Carlos Schmidt-Padilla
Title: Introduction to Causal Inference (in person only)
Description: In this tutorial, we will provide a comprehensive explanation of causal inference, which involves understanding and determining the cause-and-effect relationships between variables or events. We will explore how this concept applies to various fields, including technology (such as A/B testing), health sciences, and the social sciences. By delving into these applications, we will gain insight into how causal inference supports informed decisions in diverse domains.

 

August 10

Time: 11:00 - 12:00
Instructor: Chang-Yu Chang
Title: The Art of Naming Your Pets and R Objects (slides and notebook)
Description: Have you ever found yourself running out of English letters to name your R objects? Fear not! We'll dive into the world of naming conventions for R objects and functions. We'll also introduce you to a basic structure for managing data science projects, ensuring a smooth and organized workflow.

Time: 12:00 - 12:30
Lunch

Time: 12:30 - 1:30
Instructor: Brynn Sherman
Title: Data Visualization with R and ggplot
Description: In this tutorial, we'll discuss tips and tricks for data visualization in R. We'll start with a brief refresher of R and the basics of ggplot. We'll then go more in depth about how to customize various aspects of graphs, decide on color schemes, etc., all with the goal of making your graphs as clear and visually appealing as possible.

 

August 17

Time: 11:00 - 12:00
Instructor: Sergey Molodtsov
Title: Python data analysis and visualization (notebook)
Description: This tutorial will introduce basic concepts of data processing and visualization. We will use multidimensional climate and geoscience datasets to demonstrate simple statistical analysis and the plotting capabilities of Python. We will also cover the different libraries available in Python to work with large multidimensional arrays.

Time: 12:00 - 12:30
Lunch

Time: 12:30 - 1:30
Instructor: Sunghye Cho
Title: Introduction to Automatic Speech Recognition (slides and notebook)
Description: Have you ever wondered how Siri or Alexa understands what you say? In this tutorial, we will walk through the very basics of Automatic Speech Recognition (ASR). At the end of the tutorial, we will learn how to run one of the state-of-the-art models, Whisper, on audio files and how to evaluate the model's performance, using Python.

Time: 1:30 - 2:30
Instructor: Erçağ Pinçe
Title: Image processing and Computer Vision with Deep Learning Tools (slides and notebook)
Description: In this tutorial, we will delve into how neural networks and machine learning frameworks can be harnessed to process and track images of microscopic objects. Our journey will commence with an exploration of DeepTrack 2.0, an extensive open-source deep-learning framework tailored for digital microscopy within the Python environment. The tutorial will then branch into the realm of 3D tracking, where we'll explore the complexities of monitoring moving microscopic elements in a three-dimensional space and several caveats that come with this. Then, we'll venture into label-free tracking, a particularly intricate tracking task. Finally, we'll examine how this novel microscopy approach can unlock new insights into bacterial microecology, motility research, and the domain of active matter more generally.