Please note: This course will be taught online only. In-person study is not available for this course.

Alejandro Quiroz Flores is Professor of Government and Chief Scientific Adviser at the Institute for Analytics and Data Science (IADS), University of Essex. He obtained his PhD in Politics from New York University. He specializes in Econometrics, Machine Learning, and Political Economy. His work has appeared in Global Environmental Change, British Journal of Political Science, Political Science Research and Methods, and International Studies Quarterly, among others. He is the author of Survival Analysis. A New Guide for Social Scientists, published by Cambridge University Press (2022).

Course Content:

This course covers the statistical concepts and techniques used to model time. These models are also known as survival or event history models, and we use them to analyze the time until some event happens, such as the termination of a civil war, the completion of a medical treatment, or the loss of political office. The course promotes multistate models as a unified framework for survival analysis. It relies on the R Language and Environment for Statistical Computing for estimation and analysis.

The course will be divided into two main sections:

  1. Continuous Time Duration Models: We will examine parametric duration models (exponential, Weibull, log-logistic, generalized gamma, etc.), semi-parametric duration models (the Cox model), and non-parametric estimators such as Kaplan-Meier. In addition to exploring how these models are estimated and interpreted, we will look at typical problems in estimation and at various residual-based diagnostic tests.
  2. Discrete Time Duration Models and Advanced Techniques: We will look at the connection between discrete time duration models and binary time-series-cross-section models. We will examine various ways to deal with time dependence, ongoing events, multiple events, and time-varying covariates. In the section on advanced techniques, we will examine models for competing risks, heterogeneity and frailty, and repeated events.
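
The link between discrete time duration models and binary time-series-cross-section data mentioned in the second section can be illustrated with a small sketch. The course itself uses R; the snippet below is plain Python, for illustration only, and the spell data, field names, and function names are all hypothetical. It expands each spell into one row per period at risk with a binary outcome, which is the data layout that a logit or similar binary model would then be fitted to.

```python
def to_person_period(spells):
    """Expand spells into one row per unit per period at risk.

    Each spell is a dict with 'id', 'duration' (whole periods observed)
    and 'event' (1 if the spell ended in the event, 0 if right-censored).
    """
    rows = []
    for s in spells:
        for t in range(1, s["duration"] + 1):
            # The binary outcome is 1 only in the final period of a spell
            # that ended in the event; censored spells are 0 throughout.
            y = 1 if (t == s["duration"] and s["event"] == 1) else 0
            rows.append({"id": s["id"], "period": t, "y": y})
    return rows


def discrete_hazard(rows):
    """Share of units at risk in each period that experience the event."""
    at_risk, failures = {}, {}
    for r in rows:
        at_risk[r["period"]] = at_risk.get(r["period"], 0) + 1
        failures[r["period"]] = failures.get(r["period"], 0) + r["y"]
    return {t: failures[t] / at_risk[t] for t in sorted(at_risk)}


# Hypothetical toy data: three units observed for 3, 2 and 1 periods.
spells = [
    {"id": "A", "duration": 3, "event": 1},
    {"id": "B", "duration": 2, "event": 0},  # right-censored
    {"id": "C", "duration": 1, "event": 1},
]
rows = to_person_period(spells)
hazard = discrete_hazard(rows)
```

In this layout, duration dependence is handled by adding functions of `period` (dummies, polynomials, or splines) to the binary model.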

 

Course Objectives:

The central objective of this course is to learn how to identify, and correctly apply, the statistical techniques appropriate to answering questions about time and duration. Students will be able to identify and classify data problems in survival analysis; define the appropriate survival function, distribution function, hazard function, relative hazard, and cumulative hazard; and summarize and interpret analyses of survival data using various estimators. By the end of the course, students should be adept at programming in R to estimate and interpret a wide variety of duration models.
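
For orientation, the quantities named above can be written in the standard textbook notation for a continuous duration T with density f(t). The notation is an assumption made here and may differ from the symbols used in the course materials.

```latex
\begin{align*}
F(t) &= \Pr(T \le t) && \text{distribution function} \\
S(t) &= \Pr(T > t) = 1 - F(t) && \text{survival function} \\
h(t) &= \lim_{\Delta t \to 0}
        \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
      = \frac{f(t)}{S(t)} && \text{hazard function} \\
H(t) &= \int_0^t h(u)\,du = -\log S(t) && \text{cumulative hazard} \\
\frac{h(t \mid x)}{h_0(t)} &= \exp(x^\top \beta)
      && \text{relative hazard (proportional hazards form)}
\end{align*}
```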

Course Prerequisites:

Students should already have some experience with the theory behind Maximum Likelihood Estimation. Some knowledge of basic calculus (differentiation and integration), exponents and logarithms, and R code will be helpful.

Required Text (will be provided by ESS):

Quiroz Flores, Alejandro. 2022. Survival Analysis. A New Guide for Social Scientists. Cambridge: Cambridge University Press.

Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press.

Background knowledge required
Statistics
Calculus = moderate
Linear regression = strong
OLS = strong
Maximum Likelihood = moderate

Computer Background
R = moderate

Day 1:

Introduction and preliminaries. Survival models and the analysis of time. Key probability concepts. Continuous time duration models. Parametric models: exponential, Weibull, log-logistic, generalized gamma, etc. Estimation and interpretation.

Exercise:

  • Identifying duration data in R.
  • Correctly setting survival time in R.
  • Estimation and interpretation of parametric duration models.
  • Model selection.
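
As a flavour of what estimating a parametric duration model involves, here is a minimal sketch of the simplest case, the exponential model with right-censoring, where the maximum likelihood estimate of the rate has a closed form (observed events divided by total time at risk). The course exercises use R; this is plain Python, for illustration only, and the toy data and function names are hypothetical.

```python
import math


def exp_loglik(rate, times, events):
    """Log-likelihood of a constant-hazard (exponential) model.

    Observed events contribute log(rate) - rate * t; right-censored
    spells contribute only the survival term -rate * t.
    """
    return sum(events) * math.log(rate) - rate * sum(times)


def exp_mle(times, events):
    """Closed-form MLE of the rate: events divided by total exposure."""
    return sum(events) / sum(times)


# Hypothetical toy data: durations and event indicators (0 = censored).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]

rate_hat = exp_mle(times, events)      # 3 events over 21 units of exposure
mean_duration = 1.0 / rate_hat         # implied expected duration
ll_hat = exp_loglik(rate_hat, times, events)
```

Richer models such as the Weibull or log-logistic have no closed form and are fitted numerically, and nested models can be compared by their maximized log-likelihoods.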

Reading:

  • Quiroz Flores, Alejandro. 2022. Survival Analysis. A New Guide for Social Scientists. Cambridge: Cambridge University Press. Ch. 1.
  • Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press. Ch. 2-3.

 

Day 2:

Continuous time duration models. Non-parametric estimators. The Cox semi-parametric model. Dealing with tied data. Obtaining baseline hazard and survivor functions.

Exercise:

  • The Kaplan-Meier, Nelson-Aalen, and Aalen-Johansen estimators.
  • Estimation and interpretation of the Cox model.
  • Quantities of interest and marginal effects.
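
The Kaplan-Meier estimator listed in this exercise can be sketched in a few lines. The course works in R; the snippet below is plain Python, for illustration only, with hypothetical toy data and function names. At each distinct event time, the running survival estimate is multiplied by one minus the share of units at risk that fail at that time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- observed durations
    events -- 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)   # every unit leaves the risk set at its own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: survive time t with probability 1 - d / n.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        # Events and censored units at t both drop out afterwards.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: durations and event indicators (0 = censored).
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
km = kaplan_meier(times, events)
```

Censored observations never trigger a step in the curve, but they still count in the risk set up to the time they are censored.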

Reading:

  • Quiroz Flores, Alejandro. 2022. Survival Analysis. A New Guide for Social Scientists. Cambridge: Cambridge University Press. Section 5.3.
  • Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press. Ch. 4.

 

Day 3:

Continuous time duration models. Typical problems in estimation. Proportionality. Diagnostic tests and solutions.

Exercise:

  • Diagnostic tests and solutions.

Reading:

  • Quiroz Flores, Alejandro. 2022. Survival Analysis. A New Guide for Social Scientists. Cambridge: Cambridge University Press. Ch. 2.
  • Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press. Ch. 8.
  • Gandrud, Christopher. 2015. “simPH: An R Package for Illustrating Estimates from Cox Proportional Hazard Models Including for Interactive and Nonlinear Effects.” Journal of Statistical Software 65 (3): 1-20.
  • Licht, Amanda. 2011. “Change Comes with Time: Substantive Interpretation of Nonproportional Hazards in Event History Analysis.” Political Analysis 19 (2): 227-243.
  • Park, Sunhee, and David J. Hendry. 2015. “Reassessing Schoenfeld Residual Tests of Proportional Hazards in Political Science Event History Analyses.” American Journal of Political Science 59 (4): 1072-1087.

 

Day 4:

Discrete time duration models. The structure of survival data. The modeling of duration dependence in discrete models. Time-varying covariates. Complex spells of time.

Exercise:

  • Working with discrete survival data.
  • Estimation and interpretation.
  • Types of spells of time.

Reading:

  • Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press. Ch. 5, 7.
  • Beck, Nathaniel, Jonathan N. Katz, and Richard Tucker. 1998. “Taking Time Seriously: Time-Series Cross-Section with a Binary Dependent Variable.” American Journal of Political Science 42 (4): 1260-1288.
  • Przeworski, Adam, Michael E. Alvarez, José Antonio Cheibub and Fernando Limongi. 2000. Democracy and Development: Political Institutions and Well-Being in the World, 1950-1990. New York: Cambridge University Press. Ch. 2.

 

Day 5:

Competing risks models. Unobserved heterogeneity and frailty models. Multiple failures and repeated events models.

Exercise:

  • Estimation and interpretation.

Reading:

  • Quiroz Flores, Alejandro. 2022. Survival Analysis. A New Guide for Social Scientists. Cambridge: Cambridge University Press. Ch. 3-5.
  • Box-Steffensmeier, Janet M. and Bradford S. Jones. 2004. Event History Modeling: A Guide for Social Scientists. New York: Cambridge University Press. Ch. 10.