Patrick Shea is an assistant professor in the Department of Political Science at the University of Houston. His research interests are international relations, the political economy of conflict, and statistical inference. His work has appeared in the Journal of Conflict Resolution, International Studies Quarterly, Economics & Politics, and Statistics, Politics and Policy, among other journals. He received his PhD from Rutgers University.
This course introduces a class of statistical models called generalized linear models (GLMs). GLMs form a flexible family of models that accommodates a broader class of outcome variables than OLS can. We will consider how to perform regression analysis with the following types of outcome variables: continuous, count, dichotomous, categorical, duration, and more. In addition, we will analyze mixture models, which combine two or more modeling processes, such as zero-inflated count models, tobit models, and selection models. GLMs are widely used across the social sciences to gain empirical traction on all sorts of questions. We will focus on how to apply these models to a range of data, assess the models, and interpret and present the results. The biggest payoff from this course will likely come from the substantive work you can do by applying generalized linear models to social science questions, work that you cannot properly do with a simple linear model.
The main goal of this course is to help you make progress towards becoming a well-informed user and consumer of generalized linear models. We will focus on (1) translating research questions into appropriate statistical models for non-linear problems, (2) estimating GLM parameters using the maximum likelihood principle, and (3) interpreting results and identifying the limitations of non-linear regression models. The first aspect of this class focuses on understanding the unified theoretical basis for using GLMs. Emphasis will be placed on building from standard linear models, extending the linear model to GLMs, and going beyond GLMs. The second aspect of the course focuses on using the statistical package R to estimate GLMs. R is a powerful and capable statistical computing tool. And it’s free! The skills attained in this course form a foundation for any social science data science toolkit.
The background required for the course is a good introduction to probability and statistical inference, some experience with linear regression, and some knowledge of a statistical software program like R or Stata.
Representative Background Reading
Gailmard, Sean. 2014. Statistical Modeling and Inference for Social Science. Cambridge University Press.
Faraway, Julian J. 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects, and Nonparametric Regression Models. 2nd Edition. Chapman & Hall/CRC.
Background knowledge required
OLS = moderate
Maximum Likelihood = elementary
Stata = elementary
R = elementary