Patrick Shea is an assistant professor in the Department of Political Science at the University of Houston. His research interests are international relations, the political economy of conflict, and statistical inference. His research has appeared in the Journal of Conflict Resolution, International Studies Quarterly, and Economics & Politics, among other journals. He received his PhD from Rutgers University.

Course Content: This course is an introduction to generalized linear models (GLMs). GLMs are a flexible family of models that can be estimated via maximum likelihood and that adapt the standard linear model to a broader class of outcome variables. We will consider how to perform regression analysis with the following types of outcome variables: continuous, count, dichotomous, ordered categorical, unordered categorical, duration, and more. These models are widely used across the social sciences to gain empirical traction on a wide range of questions. The biggest payoff from this course will likely come from the substantive work you can do by applying generalized linear models to social science questions, work that you cannot properly do with a simple linear model.

Course Objectives: The main goal of this course is to help you become a responsible and well-informed user and consumer of GLMs, a required skill for virtually any empirically minded social scientist. The first part of the class focuses on understanding the unified theoretical basis for using GLMs. Emphasis will be placed on building from standard linear models, extending the linear model to GLMs, and going beyond GLMs. The second part of the course focuses on using the statistical package R to estimate GLMs. R is a powerful and capable statistical computing tool. And it’s free. The skills attained in this course are those that social science disciplines expect of any self-declared data-oriented researcher.

Course Prerequisites: The background required for the course is a good introduction to probability and statistical inference, at least one good linear regression course, and a working knowledge of R or Stata. Ideally, you will be at least minimally comfortable with elementary calculus (at the level of Daniel Kleppner and Norman Ramsey, Quick Calculus: A Self-Teaching Guide, 2nd edition), elementary matrix algebra (at the level of Krishnan Namboodiri, Matrix Algebra: An Introduction), and the basics of probability theory and probability distributions (see Evans, Hastings and Peacock, Statistical Distributions). Jeff Gill’s Essential Mathematics for Political and Social Research (Cambridge University Press) is another excellent resource. If you are not confident about your calculus, your probability, or, to a lesser extent, your matrix algebra, go over these texts. I will assume that your math skills may be rusty, but you will get more out of this class if you sharpen your math and probability skills before the class starts.

Representative Background Reading:
Gailmard, Sean. Statistical modeling and inference for social science. Cambridge University Press, 2014.

Core Reading:
Faraway, Julian J. 2016. Extending the Linear Model with R: Generalized Linear, Mixed Effects, and Nonparametric Regression Models. 2nd edition. Chapman & Hall/CRC.

• Each class begins with a short lecture on the class material (approximately 60-75 minutes).
• After each lecture, I will provide an overview of some GLM/MLE applications in R (approximately 60 minutes).
• The remaining portion of class will be devoted to hands-on learning with real data examples.
• The course schedule below provides more detail about the lecture topic for each class day, along with citations for the required readings (which will be provided).

1. Introduction: Uncertainty and Data-Generating-Processes (DGP)
Reading:
– Moore and Siegel (2013). A Mathematics Course for Political & Social Research. Princeton University Press. Chapters 9-11.
– Gailmard, Sean. Statistical modeling and inference for social science. Cambridge University Press, 2014. Chapter 3.

2. OLS, Inference, and Hypothesis Testing.
Reading:
– Faraway, Chapter 1.
– Gill, The Insignificance of Null Hypothesis Significance Testing
– Gailmard. Statistical modeling. Chapters 7 and 8.

3. Introduction to GLM and MLE
Reading:
– Faraway, Appendix A.
– Aldrich, John. 1997. “R. A. Fisher and the Making of Maximum Likelihood 1912-1922.” Statistical Science 12(3): 162-176.

4. Models for Dichotomous Outcomes.
Reading:
– Faraway, Chapter 2.
– Altman, The cost of dichotomizing continuous variables.

5. Binary Response Models: Advanced Topics
Reading:
– Esarey, Justin and Andrew Pierce. 2012. “Assessing Fit Quality and Testing for Misspecification in Binary-Dependent Variable Models.” Political Analysis 20(4): 480-500.
– Greenhill, Brian, Michael D. Ward, and Audrey Sacks. 2011. “The Separation Plot: A New Visual Method for Evaluating the Fit of Binary Models.” American Journal of Political Science 55(4): 991-1002.

6. Models for Count Outcomes.
Reading:
– Faraway, Chapter 3.
– King, Gary. “Statistical Models for Political Science Event Counts: Bias in Conventional Procedures and Evidence for the Exponential Poisson Regression Model.”

7. Models For Ordered and Unordered Categorical Data.
Reading:
– Faraway, Chapter 5.
– Dow, Jay K. and James W. Endersby. “Multinomial Probit and Logit: A Comparison of Choice Models for Voting Research.”

8. Event duration models
Reading:
– Teachman, Jay D. and Mark D. Hayward. 1993. “Interpreting Hazard Rate Models.” Sociological Methods and Research 21(3): 340-371.

9. Censored models
Reading:
– Sigelman, Lee and Langche Zeng. 1999. “Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models.” Political Analysis 8(2): 167-182.
– Dubin, Jeffrey A. and Douglas Rivers. 1989. “Selection Bias in Linear Regression, Logit and Probit Models.” Sociological Methods and Research 18(2-3): 360-390.

10. Panel Data Analysis in GLM and Wrap Up.
Reading:
– Faraway, Chapter 6.