Soc504: Advanced Social Statistics

Read the Syllabus

Sociology 504 is the second class in a two-semester statistics sequence for graduate students in Sociology.  We also welcome undergraduates and graduate students from other departments.  The course assumes material covered in Soc500 and the Princeton Sociology Summer Methods Camp.  Soc504 covers maximum likelihood estimation, generalized linear models and assorted topics.

Two Important Notes:
1) Credit
My personal philosophy on teaching preparation is that it is best to stand on the shoulders of giants; that is, I would rather spend several hours improving, tweaking and remixing a set of already strong slides than recreate them from scratch just so they are completely unique. Thankfully, I have access to a network of generous scholars who have been willing to share their materials. Many of the slides linked below are either taken directly from others or adapted from their original design. I, of course, take responsibility for any errors that remain.

The Soc504 course design is heavily influenced by Gary King's course, Gov2001. The first half tracks tightly with Gary's course and the second half covers additional topics. I have also drawn material from Matt Blackwell, Justin Grimmer, Erin Hartman, Teppei Yamamoto and others. All of these scholars have kindly allowed me to post their material here. Whenever material is drawn from someone, they are credited at the bottom of the title slide or as a one-off on the individual slide where their material is used (slides from precepts sometimes draw from the corresponding lecture slides without attribution). If you believe your material was used here without attribution, please reach out to me so I can correct it.

This class is not sustainable without great teaching assistants.  I was lucky this year to have two incredible teaching assistants: Rebecca Johnson and Ian Lundberg.

2) Style and Form
The course is split into two halves. The first half (before spring break) covers core material in GLMs and MLE. The second half covers three modules that will hopefully be useful to people moving forward: latent variables, EM and missing data; more causal inference; and regularization.

The course met twice a week for an hour and a half, with one two-hour "lab" each week taught by the teaching assistants. In Soc500, each lecture covers a week of material and is split discretely into individual classes. In Soc504, I cover material more fluidly in the first half of the course, stopping wherever we are at the end of class (below I've noted the days on which we actually covered each set of material, for others who might want to judge pacing). In the second half of the semester, I cover material in two-week chunks, with each class more sharply defined inside the lecture slides.

As with the previous class, I talk very quickly, which is why we cover so much ground. Stylistically, I see class as an opportunity to expose people to new ideas; it is through the weekly problem sets and precepts that the material is really solidified. So if the pace seems almost inconceivably fast, that's why.

Handouts (2017):
The class is geared towards the replication and extension project, in which pairs of students choose a paper of interest to them from the literature, then replicate and extend it to create something new. We have some materials designed to help them do that.

Lectures (2017): 
Unlike Soc500, I don't currently have a handout version of these slides; it proved too difficult to get the animations to look right. If you see a typo, please email me!

Lecture 1: Introduction - February 6
slides

Lecture 2: Basics - February 8
slides

Precept 1: Review of Probability, Simulations and Data Manipulation - February 9
(Rebecca Johnson)
slides 

Lecture 3: Maximum Likelihood Estimation - February 13-20
slides

Precept 2: Likelihood Inference - February 16
(Ian Lundberg)
slides

Precept 3: Numerical Optimization and Simulation - February 23
(Rebecca Johnson)
slides

Lecture 4: Generalized Linear Models - February 22-March 15
slides

Precept 4: Binary and Lognormal GLMs - March 2
(Ian Lundberg)
slides

Precept 5: Binary and Ordinal Outcomes - March 8
(Rebecca Johnson)
slides

Precept 6: Duration and Count Data - March 16
(Ian Lundberg)
slides

Spring Break

Lecture 5: Latent Variables, EM and Missing Data - March 27-April 5
topics: mixture models, expectation maximization, missing data, multiple imputation
slides

Precept 7: Mixture Models and EM - March 28
(Rebecca Johnson)
slides

Precept 8: Missing Data - April 6
(Ian Lundberg)
slides

Lecture 6: More Causal Inference - April 10-19
topics: model dependence, matching, propensity scores, mediation
slides

Precept 9: Model Dependence and Matching - April 12
(Rebecca Johnson)
slides

Precept 10: Matching, Mediation and Dynamic Treatments - April 20
(Ian Lundberg)
slides

Lecture 7: Regularization - April 24-26
topics: regularization, eight schools example, hierarchical models
slides

Lecture 8: Wrap-up - May 1
slides