The following schedule is a tentative guideline and will evolve to meet course needs up to and during the quarter. In particular, the pace of the course will be adjusted so that we move as quickly as possible, conditional on everyone “getting” the material.
Students should do the readings before class. Some class time will be spent on lecture, but the majority of class time will consist of discussion, questions, and problem solving. Lab sections will cover how to implement the analysis in R, conceptual review and continued elaboration of the week’s material, and time for students to discuss and get feedback on their projects.
- SMISS = Statistical Modeling and Inference for Social Science
- OpenIntro = Open Intro Statistics
Key: Readings; Data Camp courses; Actions to take; R lessons; Assignments.
A list of changes is available here.
Week 1
Introduction
Pre-class
- Complete the class survey and statistical knowledge pre-test
- Download R and RStudio. Instructions are here.
- Complete Data Camp Introduction to R Chapters 1 (Intro to Basics), 2 (Vectors), 4 (Factors), and 5 (Data Frames).
- (Optional) Complete TryR if you want more practice with R.
- SMISS, Ch 1
- Peng, Roger D., and Elizabeth Matsui. 2015. The Art of Data Science. Chapters 1–2. [URL] (the minimum price is free)
- Hadley Wickham interview [URL].
- R is for Everyone, "Preface"
Class
Lab
Week 2
Observable Data, Descriptive Statistics and Exploratory Data Analysis
- Research project assignment 1 due
- Reading assignment 1 due
- Data Camp Data Visualization with ggplot2 (1)
- OpenIntro, Ch. 1, "Introduction to data."
- SMISS. Ch. 2 "Descriptive Statistics: Data and Information," Sections 2.1, 2.2, 2.3.1–2.3.3.
- Wickham, Hadley. 2010. "A Layered Grammar of Graphics." Journal of Computational and Graphical Statistics. [doi]
- Tukey, John W. 1980. "We Need Both Exploratory and Confirmatory." The American Statistician. [doi]
Class
Lab
Useful references for data visualization
- Gelman, Andrew. 2002. “Let’s Practice What We Preach: Turning Tables into Graphs.” The American Statistician. [URL]
- Kastellec, Jonathan P., and Eduardo L. Leoni. “Using Graphs Instead of Tables in Political Science.” Perspectives on Politics. [DOI] (Also see associated website tables2graphs.com.)
- Stephen Few’s website perceptualedge.com, his books Show Me the Numbers: Designing Tables and Graphs to Enlighten and Now You See It: Simple Visualization Techniques for Quantitative Analysis, and many of his articles.
- William S. Cleveland’s books are classics: The Elements of Graphing Data and Visualizing Data
- Edward Tufte’s books are also classics: The Visual Display of Quantitative Information, Visual Explanations, Envisioning Information, and Beautiful Evidence.
- Christopher Adolph’s course CSSS 569: Visualizing Data.
- Heer, Jeffrey, and Michael Bostock. 2010. “Crowdsourcing Graphical Perception: Using Mechanical Turk to Assess Visualization Design.” This is a gateway to experiments that assess the effectiveness of different visual encodings of data. [URL]
- Schwabish, Jonathan A. 2014. “An Economist’s Guide to Visualizing Data.” Journal of Economic Perspectives [DOI]
Week 3
Probability; Data wrangling and tidy data.
- Research project assignment 2 due
- Reading assignment 2 due
- OpenIntro, Ch 2.
- SMISS Ch. 3, "Observable Data and Data-Generating Processes". This reading provides the context for the use of probability within political science research.
- SMISS Ch. 4.1.3, "Ontological Interpretations of Probability".
- The Monty Hall Problem. Setosa.io. An interactive visualization.
- Conditional Probability. Setosa.io. An interactive visualization.
- Wickham, Hadley. 2014. "Tidy Data." Journal of Statistical Software. [URL].
- "Introduction to dplyr" [[URL](https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html)]
- SMISS Ch. 4, "Probability Theory: Basic Properties of Data-Generating Processes". Optional. This goes beyond the coverage of probability in OpenIntro.
- Hadley Wickham dplyr tutorial at useR! 2014: Part 1, Part 2.
- Data Camp Data Manipulation in R with dplyr. Optional. This can clarify the information in the vignette.
Class
Lab
Optional readings
- Garrett Grolemund and Hadley Wickham. R for Data Science. Chapters "Data transformation", "Tidy Data", and "Relational Data".
Week 4
Distributions; Data Generating Process
- Reading assignment 3 due
- OpenIntro, Ch 3
- SMISS. "Chapter 5: Expectation and Moments: Summaries of Data Generating Processes" Sections 5.1, 5.2.1.
- SMISS. "Chapter 6: Linking Positive Theories and Data-Generating Processes." Sections 6.1–6.3, 6.5–6.6.
Class
Lab
- Gentzkow, Matthew, and Jesse M. Shapiro. 2014. “Code and Data for the Social Sciences: A Practitioner’s Guide.” [link]
- Jenny Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, Tracy Teal, and Greg Wilson. 2015. Good Enough Practices in Scientific Computing
Week 5
Foundations of Inference: Sampling distributions, confidence intervals, hypothesis testing
- Research project assignment 3 due
- Reading assignment 4 due
- OpenIntro, Ch 4
- SMISS. Sections 7.1–7.4 (skim)
- "Central Limit Theorem Visualized in D3". Setosa.io. An interactive visualization.
- OpenIntro Shiny apps for Central Limit Theorem for Means, Central Limit Theorem for Proportions.
- Hadley Wickham. Advanced R Style Guide.
Class and Lab
- Loops in R
- Sampling Distributions through Simulation in R
- Confidence Intervals of the Mean through Simulation in R
- Hypothesis Tests of the Mean through Simulation in R
Lab
Week 6
Inference for Numerical Data (t-tests, ANOVA); Bootstrap
- Numerical Inference in R (t-tests and ANOVA)
- Reading assignment 5 due
- OpenIntro, Ch 5
- Cowles, Michael, and Caroline Davis. 1982. "On the Origins of the .05 Level of Statistical Significance." American Psychologist. DOI
- Moore, McCabe, and Craig. Introduction to the Practice of Statistics. 7th ed. Chapter 16. "Bootstrap Methods and Permutation Tests." URL.
- Gelman, Andrew, and Hal Stern. 2006. "The Difference Between 'Significant' and 'Not Significant' is not Itself Statistically Significant." The American Statistician. DOI
- Reinhart, Alex. 2012. Statistics Done Wrong: The Woefully Complete Guide. Chapters 1, 2, 4, and 5.
Week 7
Inference for Categorical data (proportion tests, Chi-squared tests)
- Categorical Inference in R (proportion and Chi-squared tests)
- Reading assignment 6 due
- OpenIntro. Ch 6: "Inference for Categorical Data"
- Data Camp Reporting with R Markdown
Class
Week 8
Introduction to Linear Regression
- Research project assignment 4 due
- Reading assignment 7 due
- OpenIntro. Ch 7: "Introduction to Linear Regression"
- SMISS. Sections 2.3.4, 2.3.5, Ch 2 Appendix, 6.6.3, 6.7, 7.5
Lab
Week 9
Multiple regression
- Reading assignment 8 due
- OpenIntro. Ch 8: “Multiple and Logistic Regression”
- OpenIntro Supplement: “Interaction Terms” URL
- OpenIntro Supplement: “Regression for nonlinear relationships” URL
- Brambor, Thomas, William Roberts Clark, and Matt Golder. 2005. "Understanding Interaction Models: Improving Empirical Analysis." Political Analysis. DOI
- Matt Golder. "Interactions". URL
Lab
Week 10
Finals Period
The final paper is due on March 16 at 17:00 PDT.