Schedule

The following schedule is a tentative guideline and will evolve to meet course needs up to and during the quarter. In particular, the pace of the course will be adjusted so that we move as quickly as possible, conditional on everyone “getting” the material.

Students should do the readings before class. Some class time will be spent on lecture, but the majority of class time will consist of discussion, questions, and problem solving. Lab sections will cover how to implement the analysis in R, conceptual review and continued elaboration of the week’s material, and time for students to discuss and get feedback on their projects.

  • SMISS = Statistical Modeling and Inference for Social Science
  • OpenIntro = Open Intro Statistics

Key: Readings; Data Camp courses; Actions to take; R lessons; Assignments.

A list of changes is available here.

Week 1

Introduction

Pre-class

  • Complete the class survey and statistical knowledge pre-test
  • Download R and RStudio. Instructions are here.
  • Complete Data Camp Introduction to R Chapters 1 (Intro to Basics), 2 (Vectors), 4 (Factors), and 5 (Data Frames).
  • (Optional) Complete TryR if you want more practice with R.
  • SMISS, Ch 1
  • Peng, Roger D., and Elizabeth Matsui. 2015. The Art of Data Science. Chapters 1–2. [URL] (the minimum price is free)
  • Hadley Wickham interview [URL].
  • R is for Everyone, "Preface"

Class

Lab

Week 2

Observable Data, Descriptive Statistics and Exploratory Data Analysis

Class

Lab

Useful references for data visualization

  • Gelman, Andrew. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician. [URL]
  • Kastellec, Jonathan P., and Eduardo L. Leoni. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics. [DOI] (Also see associated website tables2graphs.com.)
  • Stephen Few’s site perceptualedge.com, his books Show Me the Numbers: Designing Tables and Graphs to Enlighten and Now You See It: Simple Visualization Techniques for Quantitative Analysis, and many of his articles.
  • William S. Cleveland’s books are classics: The Elements of Graphing Data and Visualizing Data
  • Edward Tufte’s books are also classics: The Visual Display of Quantitative Information, Visual Explanations, Envisioning Information, and Beautiful Evidence.
  • Christopher Adolph’s course CSSS 569: Visualizing Data.
  • Heer, Jeffrey, and Michael Bostock. 2010. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design.” This is a gateway to experiments that assess the effectiveness of different visual encodings of data. [URL]
  • Schwabish, Jonathan A. 2014. “An Economist’s Guide to Visualizing Data.” Journal of Economic Perspectives [DOI]

Week 3

Probability; Data wrangling and tidy data.

  • Research project assignment 2 due
  • Readings assignment 2 due
  • OpenIntro, Ch 2.
  • SMISS Ch. 3, "Observable Data and Data-Generating Processes". This reading provides the context for the use of probability within political science research.
  • SMISS Ch. 4.1.3, "Ontological Interpretations of Probability".
  • The Monty Hall Problem. Setosa.io. An interactive visualization.
  • Conditional Probability. Setosa.io. An interactive visualization.
  • Wickham, Hadley. 2014. "Tidy Data." Journal of Statistical Software. [URL].
  • "Introduction to dplyr." [https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html]
  • SMISS Ch. 4, "Probability Theory: Basic Properties of Data-Generating Processes". Optional. This goes beyond the coverage of probability in OpenIntro.
  • Hadley Wickham dplyr tutorial at useR! 2014: Part 1, Part 2.
  • Data Camp Data Manipulation in R with dplyr. Optional. This can clarify the information in the vignette.

Class

Lab

Optional readings

Week 4

Distributions; Data Generating Process

  • Reading assignment 3 due
  • OpenIntro, Ch 3
  • SMISS. "Chapter 5: Expectation and Moments: Summaries of Data Generating Processes" Sections 5.1, 5.2.1.
  • SMISS. "Chapter 6: Linking Positive Theories and Data-Generating Processes." Sections 6.1–6.3, 6.5–6.6.

Class

Lab

  • Gentzkow, Matthew, and Jesse M. Shapiro. 2014. “Code and Data for the Social Sciences: A Practitioner’s Guide.” [link]
  • Jenny Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, Tracy Teal, and Greg Wilson. 2015. Good Enough Practices in Scientific Computing

Week 5

Foundations of Inference: Sampling distributions, confidence intervals, hypothesis testing

Class and Lab

Lab

Week 6

Inference for Numerical Data (t-tests, ANOVA); Bootstrap.

  • Numerical Inference in R (t-tests and ANOVA)
  • Reading assignment 5 due
  • OpenIntro, Ch 5
  • Cowles, Michael, and Caroline David. 1982. "On the Origins of the .05 Level of Statistical Significance." American Psychologist. DOI
  • Moore, McCabe, and Craig. Introduction to the Practice of Statistics. 7th ed. Chapter 16. "Bootstrap Methods and Permutation Tests." URL
  • Gelman, Andrew, and Hal Stern. 2006. "The Difference Between 'Significant' and 'Not Significant' is not Itself Statistically Significant." The American Statistician. DOI
  • Reinhart, Alex. 2012. Statistics Done Wrong: The Woefully Complete Guide. Chapters 1, 2, 4, and 5.

Week 7

Inference for Categorical data (proportion tests, Chi-squared tests)

Class

Week 8

Introduction to Linear Regression

Lab

Week 9

Multiple regression

  • Reading assignment 8 due
  • OpenIntro. Ch 8: “Multiple and Logistic Regression”
  • OpenIntro Supplement: “Interaction Terms” URL
  • OpenIntro Supplement: “Regression for nonlinear relationships” URL
  • Brambor, Thomas, William Roberts Clark, and Matt Golder. 2005. "Understanding Interaction Models: Improving Empirical Analysis." Political Analysis. DOI
  • Matt Golder. "Interactions". URL

Lab

Week 10

Finals Period

The final paper is due on March 16 at 17:00 PDT.