CS&SS 321 - Data Science and Statistics for Social Sciences I
Undergraduate course, University of Washington, 2024
This post comprises my teaching assistant materials for the quiz sections of an undergraduate introductory course in data science with R/RStudio, taught by Professor Caitlin Ainsley. See my teaching assistant syllabus for the section contents and policies.
Overview
The section contents are divided into modules dedicated to “best practices” in R programming, theory review, data wrangling, visualization, and statistical analyses in R, consolidating techniques learned in lectures and QSS tutorials while introducing new skills relevant to the course contents. All lab materials can be found on this page, including Zoom recordings of each lab section.
I will offer office hours by appointment on Zoom. Please note that it may take me up to 24 hours to respond to a student’s email, so it is a good idea to plan ahead and email me in advance. When you email me, please include (1) the topic you would like to discuss and (2) your availability for scheduling a meeting.
Slack channel
Students can join the course Slack workspace through the following link.
A portion of the student’s final grade depends on section participation, which I will monitor through Slack. Slack is designed for team communication, collaboration, and project management; it organizes communication into channels and supports real-time messaging. It is the preferred communication channel for this course because it allows users to insert code blocks in their messages. It also has the added benefit of facilitating knowledge spillover through peer discussion and mutual assistance.
In some quiz sections, we will use Slack for sharing R code answers and for collaborative problem-solving during in-section data analysis exercises. Addressing questions or bugs shared in Slack, whether by you or your peers, also counts toward participation (see more in the syllabus).
Module contents
Find below the materials necessary for each module’s quiz section.
Module 1 - Getting started with R/RStudio.
This module offers an overview of basic R functions and introduces the RStudio interface. It covers installing R/RStudio and relevant packages, creating R projects, managing working directories, introducing R Markdown, and creating and sharing minimal, reproducible examples for programming assistance via Slack.
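As a rough illustration of what a minimal, reproducible example for Slack might look like, the sketch below uses only base R. The `scores` data frame and its values are hypothetical and invented for this example; they do not come from the course materials.

```r
# Check which working directory R is currently using
getwd()

# Small, self-contained data that anyone can paste and run (hypothetical values)
scores <- data.frame(id = 1:3, score = c(90, 75, NA))

# dput() prints R code that recreates the object, handy for sharing data in a message
dput(scores)

# The behavior being asked about: the mean comes back as NA
result <- mean(scores$score)
print(result)

# Session details help others reproduce the problem on their machine
sessionInfo()
```

Pasting the code together with the `dput()` output and the session details generally gives peers enough to rerun the problem without access to your files.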
To run the QSS tutorials, install all the necessary packages and dependencies listed in the installing_packages.R script file. You can access a short recording on how to run QSS tutorials in RStudio here.
- R scripts/RMarkdown files
- Datasets
Module 2 - Data management and exploratory visual analysis.
This module equips students with the essential computing skills for data science that they need to complete the course assignments and their final projects. Topics include creating and manipulating data frames, logical tests, subsetting, missing (NA) values, pivoting and merging datasets, and an introduction to visualization with ggplot2.
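A few of these operations can be sketched in base R as follows. The `students` and `majors` data frames are hypothetical toy data, and the lab materials may rely on other functions or packages for the same tasks; pivoting and ggplot2 are left out to keep the sketch dependency-free.

```r
# Hypothetical toy data, invented for illustration only
students <- data.frame(
  id    = 1:4,
  score = c(90, NA, 75, 60)
)
majors <- data.frame(
  id    = c(1, 2, 3, 5),
  major = c("POLS", "SOC", "STAT", "ECON")
)

# Logical test: which scores are missing?
is_missing <- is.na(students$score)

# Subsetting: keep rows with an observed score of at least 70
high <- students[!is_missing & students$score >= 70, ]

# NA handling: mean() returns NA unless missing values are dropped
mean_score <- mean(students$score, na.rm = TRUE)

# Merging: merge() keeps only ids present in both data frames by default
merged <- merge(students, majors, by = "id")
print(merged)
```

Because `merge()` performs an inner join by default, ids 4 and 5 drop out of `merged`; passing `all = TRUE` would keep them and fill the gaps with NA.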
I made an extra lab recording that reviews some of the topics from the previous module and introduces new functions from this one.
- R scripts/RMarkdown files
- Datasets
Module 3 - Introduction to causal inference and linear models.
This module introduces students to causal inference and the linear model. Topics include the distinctions between experimental and observational designs, prediction, the method of least squares, standard errors, and the interpretation of confidence intervals. While theory will be reviewed, the approach will be predominantly computational and visual.
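To make the computational approach concrete, here is a minimal sketch that fits a bivariate linear model on simulated data and compares `lm()` with the least-squares formulas computed by hand. The data-generating process, the seed, and all variable names are assumptions chosen for illustration, not course data.

```r
set.seed(321)  # arbitrary seed so the simulation is reproducible
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)  # hypothetical data-generating process

# Fit the linear model by least squares
fit <- lm(y ~ x)

# The same slope and intercept computed by hand
slope_hand <- cov(x, y) / var(x)
intercept_hand <- mean(y) - slope_hand * mean(x)

# Standard errors and 95% confidence intervals for the coefficients
se <- summary(fit)$coefficients[, "Std. Error"]
ci <- confint(fit, level = 0.95)

print(coef(fit))
print(c(intercept_hand, slope_hand))
print(ci)
```

The hand-computed values should match `coef(fit)` up to floating-point error, which is a quick way to check that the formula and the function describe the same estimator.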
- R scripts/RMarkdown files
- Datasets
Module 4 - Hypothesis tests and multivariate regression analysis.
This module begins by explaining statistical inference and the importance of expressing uncertainty in our inferences. We will cover the construction and interpretation of hypothesis tests relevant to research questions, along with topics in multivariate regression analysis, including transformations, nonlinear relationships, interaction effects, and the interpretation of categorical predictors.
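As one possible illustration, the sketch below fits a regression with a categorical predictor and an interaction term on simulated data, then reads off the t-tests that `summary()` reports for each coefficient. The variables, effect sizes, and seed are hypothetical and chosen only for this example.

```r
set.seed(321)  # arbitrary seed so the simulation is reproducible
n <- 200
x <- rnorm(n)
group <- factor(sample(c("control", "treated"), n, replace = TRUE))
treated <- as.numeric(group == "treated")

# Hypothetical data-generating process with an interaction effect
y <- 1 + 0.5 * x + 1.5 * treated + 1 * x * treated + rnorm(n)

# x * group expands to x + group + x:group; "control" is the reference level
fit <- lm(y ~ x * group)
coefs <- summary(fit)$coefficients

# Each row tests the null hypothesis that the coefficient equals zero
print(coefs)

# The reported t value is the estimate divided by its standard error
t_by_hand <- coefs[, "Estimate"] / coefs[, "Std. Error"]
```

In this setup the coefficient on `x` is the slope for the control group, and the `x:grouptreated` row estimates how much that slope differs for the treated group.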
- R scripts/RMarkdown files
- Datasets
Assignments
Find below the code solutions for the midterm and the four problem sets.
- Review of problem set 1.
- Code solutions for problem set 1.
- Code solutions for problem set 2.
- Code solutions for problem set 3.
- Code solutions for problem set 4.
- Code solutions for midterm, part I.
Quiz section recordings
Note: The recorded sessions have poor audio quality and are not intended to replace your in-person attendance at quiz sections.
Week | Module | Monday | Wednesday |
---|---|---|---|
2 | 1 | 01/08 | 01/10 |
3 | 1/2 | Extra lab | 01/17 |
4 | 2 | 01/22 | 01/24 |
5 | 2/3 | 01/29 | 01/31 |
6 | 3 | 02/05 | 02/07 |
7 | 3 | 02/12 | 02/14 |
8 | 3 | No sections | 02/21 |
9 | 4 | 02/26 | 02/28 |
10 | 4 | 03/04 | 03/06 |