Projects

Instructions

Overview

The best way to learn data analysis is to do it. As such, you will apply the skills and concepts learned in this course to answer a question of interest to you. The objective is to get you working with data on a research project as quickly as possible, even if it is an imperfect project (hint: all research is imperfect). Due to the limited time in this course, the project does not need to address an important research problem or make a novel contribution to the literature. Neither will be a criterion for evaluation, but you are encouraged to pursue such ideas if you have them, since they are what lead to publications. Accordingly, the project will be evaluated on the appropriateness of the statistical methods applied to the data and question, not on the novelty or contribution of the question itself.

The research project can be an original question or a replication. If you developed a research design for POLS 500, you may be able to use it in 501. However, you need to confirm that you will be able to assemble a dataset to test a specific research hypothesis within the time constraints of this course, because you will be using it throughout the course. If that seems unlikely, you will need to choose a different project. The key constraint for the project is the immediate availability of data. Some messiness and merging of datasets will likely be fine, since data cleaning and merging are among the subjects covered in this course. However, if the project requires collecting new data, you should not pursue it for this course.

The final output of this project will use methods introduced in this course to answer a well-defined research question (descriptive, predictive, or causal). The final product will be a reproducible R Markdown document combining code and description.

The purpose of requiring an R Markdown document rather than a full research paper is two-fold. First, it focuses on the value added by this course: your other courses will ask you to write papers, but will not focus on statistical and reproducible research methods. Second, it will be shorter and thus fit within the ten weeks of the quarter.

You may work on this project alone or with one co-author. The topic or question may be one that you are also using in another course.

Students will work on the project throughout the quarter, with a series of assignments:

Git Submission Process

Schedule

Additionally there will be check-ins in weeks

Assignment                                      Due date
Proposal                                        January 16
Proposal II                                     January 23
Data Analysis I (Data Wrangling and EDA)        February 6
Data Analysis II                                February 20
Draft                                           March 6
Project                                         March 16