Patrick Shea, an assistant professor in the Department of Political Science, recently received his Ph.D. from Rutgers University. His primary field is international relations. His research focuses on the political economy of war and crisis decision making, and he has forthcoming articles in the Journal of Conflict Resolution and International Studies Quarterly.

Course Content

This course extends your previous methods courses by focusing on nonlinear model forms for the outcome variable. These are typically called “generalized linear models,” although for historical reasons people in political science call them “maximum likelihood models.” The principle we care about is how to adapt the standard linear model that you know so that a broader class of outcome variables can be accommodated. These include counts, dichotomous outcomes, bounded variables, and more. There is a strong theoretical basis for the models that we will use. The bulk of the learning in the course will take place outside of the classroom: reading, practicing with statistical software, replicating the work of others, and doing problem sets. Keep in mind that the skills attained in this course are those that the discipline of political science expects of any self-declared data-oriented researcher.

The second aspect of the course is the statistical package R, which is free to download for Mac, Unix, Linux, and that other platform from CRAN, the Comprehensive R Archive Network. R is an implementation of the S language, which is the default computational tool for research statisticians. Quite simply, R is the most powerful, extensively featured, and capable statistical computing tool that has ever existed on this planet. And as mentioned, it’s free. We will not use Stata; don’t ask.
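As a rough illustration of what adapting the standard linear model looks like in R, the sketch below fits the same dichotomous outcome with lm() and with glm(). The data are simulated and the coefficient values (-0.5 and 1.2) are arbitrary assumptions chosen for the example; nothing here comes from course materials.

```r
set.seed(42)  # arbitrary seed so the simulated data are reproducible

# Simulated data (hypothetical): a 0/1 outcome whose probability
# depends on x through a logit link.
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))

# The standard linear model treats the 0/1 outcome as if it were continuous.
fit_lm <- lm(y ~ x)

# The generalized linear model keeps the linear predictor but adds a
# binomial family and a logit link suited to a dichotomous outcome.
fit_glm <- glm(y ~ x, family = binomial(link = "logit"))

# Fitted values from lm() are not constrained to [0, 1];
# fitted probabilities from the logit model are.
print(range(fitted(fit_lm)))
print(range(fitted(fit_glm)))
```

The only change between the two calls is the family argument, which is the sense in which the linear model is adapted rather than replaced.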

Course Objectives

Participants will be able to run modern nonlinear regression models.

Course Prerequisites

The only official prerequisite for this course is a linear regression class or equivalent knowledge of linear regression. However, each student should be familiar with basic probability theory, statistical inference, hypothesis testing, and least squares estimation. The course will also assume a working knowledge of calculus and linear algebra at the level of Essential Mathematics for Political and Social Research (Jeff Gill, 2006, Cambridge University Press). Knowledge of R is assumed.

Representative Background Reading

– Leamer, Let’s Take the Con Out of Econometrics
– How Not to Lie With Statistics, by Gary King and Ellie Powell
– Gill, The Insignificance of Null Hypothesis Significance Testing

1. Background on Models.
Reading: Leamer, Let’s Take the Con Out of Econometrics
How Not to Lie With Statistics, by Gary King and Ellie Powell
TPM (The Political Methodologist) Volume 11, No. 2, articles: (1) Jackman, (2) Anderson et al., (3) Gill (pages 20-26). Available from the Society for Political Methodology.

2. Uncertainty, Inference, and Hypothesis Testing.
Reading: Faraway, Chapter 1.
R Tutorial.
Gill, The Insignificance of Null Hypothesis Significance Testing
McCloskey, The Loss Function Has Been Mislaid: The Rhetoric of Significance Tests
Tressoldi et al., High Impact = High Statistical Standards? Not Necessarily So
Wetzels et al., Statistical Evidence in Experimental Psychology: An Empirical Comparison Using 855 t-Tests

3. The Likelihood Model of Inference.
Reading: Faraway, Appendix A.
Binomial PMF likelihood grid search.
Model syntax summary.
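A minimal sketch of a binomial likelihood grid search in R, assuming a single made-up observation of 7 successes in 10 trials. The data, the grid spacing, and the variable names are illustrative assumptions, not the course handout.

```r
# Hypothetical data: 7 successes in 10 Bernoulli trials.
n <- 10
y <- 7

# Grid of candidate values for the success probability p.
p_grid <- seq(0.001, 0.999, by = 0.001)

# Binomial log-likelihood evaluated at every grid point.
loglik <- dbinom(y, size = n, prob = p_grid, log = TRUE)

# The grid point with the highest log-likelihood approximates the MLE.
p_hat <- p_grid[which.max(loglik)]

# The analytical MLE for the binomial is y / n, so the two values
# should agree up to the grid spacing.
print(c(grid = p_hat, analytic = y / n))
```

A finer grid gives a closer approximation at the cost of more evaluations, which is why numerical optimizers replace grid search once a model has more than one or two parameters.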

4. Models for Dichotomous Outcomes.
Reading: Faraway, Chapter 2.
Altman, The cost of dichotomising continuous variables.

5. Models for Count Outcomes.
Reading: Faraway, Chapter 3.
Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model, by Gary King.

6. Models for Contingency Tables.
Reading: Faraway, Chapter 4.

7. Models For Ordered and Unordered Categorical Data.
Reading: Faraway, Chapter 5.
Multinomial Probit and Logit: A Comparison of Choice Models for Voting Research, by Jay K. Dow and James W. Endersby.

8. How to Handle Missing Data in Models. The EM Algorithm and Multiple Imputation.
Reading: mice: Multivariate Imputation by Chained Equations, by Stef van Buuren and Karin Groothuis-Oudshoorn.

9. The GLM Theory and the Exponential Family Form.
Reading: Faraway, Chapter 6.
The Epic Story of Maximum Likelihood, by Stephen M. Stigler.

10. Wrap up and Case Study.