Tobias

Tobias Böhmelt is a Reader (Associate Professor) in the Department of Government at the University of Essex (UK) and a Research Associate of the International Political Economy Group at the Center for Comparative and International Studies (CIS) as well as the Institute for Environmental Decisions (IED). His main research and teaching interests are the quantitative analysis of conflict and cooperation, environmental politics, international mediation, military effectiveness, and network analysis.

Course Content: This course offers an application-oriented introduction to maximum likelihood (ML) based models for categorical, discrete choice, and count data. We begin with the basics of ML estimation and a discussion of the theoretical foundations of categorical, discrete choice, and count-data models. We then explore logistic and probit regression models and learn how to apply them in the statistical software package Stata. Afterwards, we cover interpretation and hypothesis testing for these kinds of models. Against this background, we will consider more complicated estimation strategies, including ordered logit and probit regression models, multinomial logits, count models, and discrete-duration models. The course concludes with an overview of advanced techniques for modeling time-series cross-section (TSCS) categorical, discrete choice, and count data.

Course Objectives: After this course, participants will understand the models for categorical, discrete choice, and count data most commonly used in the social sciences, and will be able to apply and interpret these models properly in their own work.

Course Prerequisites: Participants are assumed to have some background in statistical inference, and a basic knowledge of multiple linear regression, including hands-on experience estimating regression models with statistical software.

Representative Background Readings
Gujarati, Damodar N., and Dawn C. Porter. 2009. Essentials of Econometrics. Fourth Edition. New York: Irwin/McGraw-Hill.
Kohler, Ulrich, and Frauke Kreuter. 2012. Data Analysis Using Stata. Third Edition. College Station, TX: Stata Press.
Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications.
Long, J. Scott, and Jeremy Freese. 2014. Regression Models for Categorical Dependent Variables Using Stata. Third Edition. College Station, TX: Stata Press.

Required Reading
Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Second Edition. Cambridge: Cambridge University Press.

During the course, we will be using Stata (www.Stata.com/) as our statistical package. You can also find more information about Stata at www.ats.ucla.edu/stat/Stata/. There are various books on Stata that you might find helpful. For instance:

Long, J. Scott, and Jeremy Freese. 2014. Regression Models for Categorical Dependent Variables Using Stata. Third Edition. College Station, TX: Stata Press.

1. Day: Introduction
Content
Preliminaries, introduction to Stata, replication, do-files, log-files, terminology, theories of discrete choice as well as models for categorical and count data, and the linear probability model.

Readings
King, Gary. 1995. Replication, Replication. PS: Political Science and Politics 28: 443-499.

King, Gary, Michael Tomz, and Jason Wittenberg. 2000. Making the Most of Statistical Analyses: Improving Interpretation and Presentation. American Journal of Political Science 44: 347-361.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Chapters 1, 2 (except Section 2.6), and Section 3.1.

Nagler, Jonathan. 1995. Coding Style and Good Computing Practices. The Political Methodologist 6: 2-8.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Chapters 1, 2.

2. Day: Maximum Likelihood Estimation
Content
Maximizing log-likelihood functions, hypothesis testing and goodness of fit, regression via maximum likelihood estimation, manual application of maximum likelihood estimation to regression.
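
As a rough illustration of what "manual application of maximum likelihood estimation to regression" can look like, the sketch below writes out the normal log-likelihood for a bivariate linear model and evaluates it at the closed-form ML estimates. The course itself uses Stata; this Python version is only an assumption-laden sketch, and the data values and function names (`normal_loglik`, `fit_linear_ml`) are invented for the example.

```python
import math

def normal_loglik(alpha, beta, sigma2, x, y):
    """Log-likelihood of y_i ~ Normal(alpha + beta * x_i, sigma2)."""
    n = len(y)
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ssr / (2 * sigma2)

def fit_linear_ml(x, y):
    """Closed-form ML estimates for the bivariate normal linear model."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx              # identical to the OLS slope
    alpha = ybar - beta * xbar    # identical to the OLS intercept
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = ssr / n              # ML divides by n, not by n - 2
    return alpha, beta, sigma2

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.8]

alpha_hat, beta_hat, sigma2_hat = fit_linear_ml(x, y)
ll_hat = normal_loglik(alpha_hat, beta_hat, sigma2_hat, x, y)
print(alpha_hat, beta_hat, sigma2_hat, ll_hat)
```

Moving any parameter away from its estimate lowers the log-likelihood, which is the property a numerical optimizer exploits when no closed form is available.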

Readings
Gould, William, Jeffrey Pitblado, and William Sribney. 2003. Maximum Likelihood Estimation With Stata. Second Edition. College Station, TX: Stata Press. Chapters 2, 3.

Greene, William H. 2008. Econometric Analysis. Sixth Edition. Upper Saddle River, NJ: Prentice Hall. Chapter 16, Appendix E.1-E.4.

King, Gary. 2001. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Ann Arbor, MI: University of Michigan Press. Chapters 1-4.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Sections 2.6, 3.5, 3.6, and Chapter 4.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Chapter 8.

3. and 4. Day: Binary Dependent Variables – Logit, Probit, Scobit, Heteroskedastic Probit, Rare Events Logit
Content
Deriving the logit and probit models; identification assumptions; alternative distributional assumptions such as scobit and heteroskedastic probit; goodness-of-fit measures such as pseudo-R² and percent correctly predicted; interpretation and quantities of interest such as predicted probabilities, first differences, marginal effects, and confidence intervals; hypothesis testing such as the Wald test and the likelihood ratio test; interaction terms; and rare-events data.
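
To make predicted probabilities and first differences concrete, here is a minimal sketch of a one-regressor logit fitted by Newton-Raphson. The course works in Stata; this Python version is purely illustrative, the data are invented, and `fit_logit` is a hypothetical helper rather than any package's API.

```python
import math

def logistic(z):
    """Logistic CDF, mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(x, y, iterations=25):
    """Newton-Raphson for Pr(y = 1 | x) = logistic(a + b * x)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [logistic(a + b * xi) for xi in x]
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), 2 x 2.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information matrix times gradient.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Made-up data with overlapping outcomes (no perfect separation).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 0, 1, 1]

a_hat, b_hat = fit_logit(x, y)
p_hat = [logistic(a_hat + b_hat * xi) for xi in x]

# First difference: change in Pr(y = 1) when x moves from 1 to 6.
first_diff = logistic(a_hat + b_hat * 6) - logistic(a_hat + b_hat * 1)
print(a_hat, b_hat, first_diff)
```

Unlike the coefficient itself, the first difference is on the probability scale, which is why it is usually the quantity reported.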

Readings

Greene, William H. 2008. Econometric Analysis. Sixth Edition. Upper Saddle River, NJ: Prentice Hall. Sections 23.1-23.4.

Herron, Michael. 1999. Post-Estimation Uncertainty in Limited Dependent Variable Models. Political Analysis 8: 83-98.

King, Gary, and Langche Zeng. 2001a. Logistic Regression in Rare Events Data. Political Analysis 9: 137-163.

King, Gary, and Langche Zeng. 2001b. Explaining Rare Events in International Relations. International Organization 55: 693-715.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Sections 3.2, 3.3, 3.4, 3.7, 3.8.

Nagler, Jonathan. 1994. Scobit: An Alternative Estimator to Logit and Probit. American Journal of Political Science 38: 230-255.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Chapters 3, 4, 5 (except Section 5.5).

5. and 6. Day: Multichotomous Dependent Variables
Content
A general framework for models for categorical data and discrete choice, multinomial and conditional logit models, identification assumptions, estimation and interpretation, random taste variation and independence of irrelevant alternatives, nested logit and multinomial probit models, random coefficient models and mixed logit models, simulated maximum likelihood.
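
The independence of irrelevant alternatives (IIA) property can be seen directly from the multinomial logit choice probabilities. The following Python sketch (illustrative only; the course uses Stata, and the alternatives and utility values are invented) shows that the odds between two alternatives do not change when a third is dropped from the choice set.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    m = max(utilities.values())  # subtract the maximum for numerical stability
    expu = {k: math.exp(v - m) for k, v in utilities.items()}
    total = sum(expu.values())
    return {k: e / total for k, e in expu.items()}

# Hypothetical systematic utilities for three alternatives.
v = {"car": 1.0, "bus": 0.5, "train": 0.2}

p_full = mnl_probabilities(v)
p_reduced = mnl_probabilities({k: u for k, u in v.items() if k != "train"})

# Under IIA the car/bus odds depend only on the car and bus utilities.
ratio_full = p_full["car"] / p_full["bus"]
ratio_reduced = p_reduced["car"] / p_reduced["bus"]
print(ratio_full, ratio_reduced)
```

Nested logit, multinomial probit, and mixed logit models relax exactly this restriction.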

Readings
Alvarez, R. Michael, and Jonathan Nagler. 1998. When Politics and Models Collide: Estimating Models of Multi-Party Elections. American Journal of Political Science 42: 55-96.

Glasgow, Garrett. 2001. Mixed Logit Models for Multiparty Elections. Political Analysis 9: 116-136.

Greene, William H. 2008. Econometric Analysis. Sixth Edition. Upper Saddle River, NJ: Prentice Hall. Chapter 17, Section 23.11.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Chapter 6.

Martin, Lanny W., and Randolph T. Stevenson. 2001. Government Formation in Parliamentary Democracies. American Journal of Political Science 45: 33-50.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Chapters 6, 9, 10.

7. Day: Ordered Dependent Variable Models
Content
Ordered logit and probit models; identification; the parallel regression assumption; generalized logit; continuation ratio model; cut-points; interpretation and quantities of interest such as predicted probabilities, first differences, marginal effects, and confidence intervals; and the heteroskedastic ordered probit.
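
As a small sketch of how cut-points translate into predicted probabilities, the Python snippet below computes ordered logit category probabilities from a linear predictor, using the parameterization Pr(y <= j) = logistic(tau_j - xb). The course uses Stata; the cut-point values and the helper name `ordered_logit_probs` are assumptions made for this example.

```python
import math

def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(xb, cutpoints):
    """Category probabilities given linear predictor xb and ordered cut-points."""
    # Cumulative probabilities Pr(y <= j) = logistic(tau_j - xb).
    cumulative = [logistic(tau - xb) for tau in cutpoints]
    bounds = [0.0] + cumulative + [1.0]
    # Each category probability is the difference of adjacent cumulative values.
    return [bounds[j + 1] - bounds[j] for j in range(len(bounds) - 1)]

cuts = [-0.5, 0.8]  # hypothetical cut-points for a three-category outcome
probs_low = ordered_logit_probs(-1.0, cuts)
probs_high = ordered_logit_probs(1.0, cuts)
print(probs_low, probs_high)
```

Raising the linear predictor shifts probability mass from the lowest toward the highest category, while the effect on middle categories can go either way.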

Readings
Gelpi, Christopher. 1997. Crime and Punishment: The Role of Norms in Crisis Bargaining. American Political Science Review 91: 339-360.

Greene, William H. 2008. Econometric Analysis. Sixth Edition. Upper Saddle River, NJ: Prentice Hall. Section 23.10.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Chapter 5.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Chapter 7.

8. Day: Count Models
Content
Poisson model: estimation and interpretation, exposure, underdispersion, overdispersion, and mean-variance equality; negative binomial models, continuous parameter binomial models, and generalized event count models; censored and truncated data; zero-inflated count models.
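
A quick way to see what mean-variance equality and overdispersion mean in practice is to compare the sample mean and variance of a count variable. The Python sketch below is illustrative only (the course uses Stata, and the counts are invented); it also confirms that the intercept-only Poisson log-likelihood peaks at the sample mean.

```python
import math

def poisson_loglik(lam, counts):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(c * math.log(lam) - lam - math.lgamma(c + 1) for c in counts)

# Made-up event counts with many zeros and a few large values.
counts = [0, 1, 0, 3, 7, 0, 2, 12, 1, 0]

n = len(counts)
mean = sum(counts) / n
variance = sum((c - mean) ** 2 for c in counts) / (n - 1)

# Under the Poisson assumption the variance equals the mean, so a ratio
# well above 1 suggests overdispersion.
dispersion = variance / mean

# For an intercept-only Poisson model the ML estimate of lambda is the mean.
ll_at_mean = poisson_loglik(mean, counts)
print(mean, variance, dispersion, ll_at_mean)
```

A ratio this far above 1 is the kind of pattern that motivates negative binomial or zero-inflated specifications.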

Readings
Greene, William H. 2008. Econometric Analysis. Sixth Edition. Upper Saddle River, NJ: Prentice Hall. Sections 25.1-25.5.

King, Gary. 1989a. Variance Specification in Event Count Models: From Restrictive Assumptions to a Generalized Estimator. American Journal of Political Science 33: 762-784.

King, Gary. 1989b. Event Count Models for International Relations: Generalizations and Applications. International Studies Quarterly 33: 123-147.

Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage Publications, Chapter 8.

Zorn, Christopher. 1998. An Analytic and Empirical Examination of Zero-Inflated and Hurdle Poisson Specifications. Sociological Methods and Research 26: 368-400.

9. and 10. Day: Models for Repeated Observations: Panel and Time-Series Cross-Section Categorical Data
Content
Connection to time-series cross-section models with binary dependent variables; time dependence, including robust clustered standard errors, temporal dummies, and cubic splines; ongoing events and second spells; time-varying covariates; and Markov transition models.
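
Modeling time dependence in binary TSCS data usually starts from a counter of periods since the last event, which then enters the model as temporal dummies, splines, or a cubic polynomial. The Python sketch below builds such a counter for a single unit. It is illustrative only: the course uses Stata, the event sequence is invented, and the convention used here (the counter resets in the period after an event) is one assumption among several possible treatments of ongoing events.

```python
def time_since_event(events):
    """Periods elapsed since the last event, for one unit's 0/1 event series."""
    elapsed = 0
    counter = []
    for e in events:
        counter.append(elapsed)
        # Reset after an event; otherwise keep counting.
        elapsed = 0 if e == 1 else elapsed + 1
    return counter

# Hypothetical event history for one unit (1 = event in that period).
events = [0, 0, 1, 0, 0, 0, 1, 1, 0]
t = time_since_event(events)

# Cubic polynomial terms in time since the last event.
t_terms = [(ti, ti ** 2, ti ** 3) for ti in t]
print(t)
print(t_terms)
```

The three polynomial terms would be added as regressors to a logit or probit alongside the substantive covariates.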

Readings
Alt, James E., Gary King, and Curtis S. Signorino. 2000. Aggregation among Binary, Count, and Duration Models: Estimating the Same Quantities from Different Levels of Data. Political Analysis 9: 21-44.

Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. Taking Time Seriously: Time-Series-Cross-Section Analysis with a Binary Dependent Variable. American Journal of Political Science 42: 1260-1288.

Box-Steffensmeier, Janet M., and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. Cambridge: Cambridge University Press. Chapters 5, 7.

Box-Steffensmeier, Janet M., and Christopher Zorn. 2002. Duration Models for Repeated Events. Journal of Politics 64: 1069-1094.

Carter, David B., and Curtis Signorino. 2010. Back to the Future: Modeling Time Dependence in Binary Data. Political Analysis 18: 271-292.

Oneal, John, and Bruce Russett. 1999. Assessing the Liberal Peace with Alternative Specifications: Trade Still Reduces Conflict. Journal of Peace Research 36: 423-442.

Train, Kenneth E. 2009. Discrete Choice Methods with Simulation. Cambridge: Cambridge University Press, Section 5.5.

Wooldridge, Jeffrey M. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press. Sections 15.1-15.7, 15.9, 15.10.

Zorn, Christopher. 2000. Modeling Duration Dependence. Political Analysis 8: 367-380.