Moritz Marbach is a postdoctoral researcher at the Immigration Policy Lab at ETH Zurich, the European branch of the Immigration Policy Lab at Stanford University. His research focuses on political methodology and the role of migration in politics. Marbach received his Ph.D. from the University of Mannheim in 2016. In recent years, he has taught various courses in political methodology and international relations.

Course Content
Building on students’ knowledge of simple linear regression, this course introduces more advanced statistical techniques. We will focus on the following topics: 1) Subclassification, matching, and propensity score weighting as fundamental strategies to reduce confounding; 2) Common interpretations of the OLS estimator; 3) Robust inference with cluster-robust standard errors as well as bootstrap estimators; 4) Classical exposition of various biases and introduction to causal graphs to select control variables; 5) Specification and interpretation of interactions and non-linear effects; 6) Instrumental variable estimation with two-stage least squares and the Heckman estimator for selected samples; 7) Two-way fixed effects and the difference-in-differences estimator with panel data; 8) Generalized linear models with examples for limited dependent variables and a discussion of structural parameters vs. causal estimands. The course combines a theoretical introduction to each topic with lab sessions in which students replicate published work in Economics and Political Science using Stata. Students are encouraged to bring their own data sets and present their research projects and empirical analyses throughout the course.
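To give a flavour of the first topic, the following is a minimal sketch of subclassification, written in Python rather than the Stata used in the labs. The toy data, the stratum labels, and the function names are hypothetical and chosen only for illustration. The sketch assumes that every stratum contains both treated and control units. It compares the naive difference in means with an estimate that first takes the treated-control difference within each stratum and then averages these differences using the stratum shares as weights.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (stratum x, treatment d, outcome y).
# Treated units are concentrated in the "high" stratum, where outcomes
# are larger regardless of treatment, so x confounds d and y.
data = [
    ("low", 0, 1.0), ("low", 0, 1.0), ("low", 0, 1.0), ("low", 1, 2.0),
    ("high", 0, 5.0), ("high", 1, 6.0), ("high", 1, 6.0), ("high", 1, 6.0),
]

def naive_difference(rows):
    """Difference in mean outcomes between treated and control units."""
    treated = [y for _, d, y in rows if d == 1]
    control = [y for _, d, y in rows if d == 0]
    return mean(treated) - mean(control)

def subclassification_estimate(rows):
    """Stratum-size-weighted average of within-stratum differences in means.

    Assumes each stratum has at least one treated and one control unit.
    """
    strata = defaultdict(list)
    for x, d, y in rows:
        strata[x].append((d, y))
    n = len(rows)
    estimate = 0.0
    for units in strata.values():
        treated = [y for d, y in units if d == 1]
        control = [y for d, y in units if d == 0]
        estimate += (len(units) / n) * (mean(treated) - mean(control))
    return estimate

print(naive_difference(data))            # confounded comparison
print(subclassification_estimate(data))  # comparison within strata
```

In this constructed example the within-stratum difference is 1 in both strata, while the naive comparison is inflated because treated units are over-represented in the high-outcome stratum.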

Course Objectives
Students will develop a deeper understanding of the statistical problems that arise in applied research. The course will give participants the skills to i) make sensible choices about which data to collect; ii) select an appropriate estimator for their estimands and the data at hand; iii) precisely state the assumptions on which estimates are based; and iv) conduct sensible, robust inferences and interpret their estimates accurately. More generally, students will be able to decide which variables they have to control for (and which they do not) and develop an understanding of what it means when regression results are distorted by selection bias, measurement error, or omitted variable bias.
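As a small illustration of omitted variable bias, the sketch below (in Python, with hypothetical toy data and names) constructs an outcome y that depends on a regressor x and a second variable z, and then regresses y on x alone. It assumes a noise-free linear model so that the arithmetic can be checked by hand. The slope of the short regression equals the true coefficient on x plus the coefficient on z times the slope from a regression of z on x.

```python
from statistics import mean

def slope(x, y):
    """OLS slope from a bivariate regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical toy data: z is correlated with x and also affects y.
x = [0.0, 1.0, 2.0, 3.0]
z = [0.0, 0.0, 1.0, 1.0]
beta, gamma = 1.0, 2.0  # assumed true coefficients on x and z
y = [beta * a + gamma * c for a, c in zip(x, z)]

short = slope(x, y)  # regression of y on x that omits z
delta = slope(x, z)  # regression of the omitted variable z on x

print(short)                 # slope when z is omitted
print(beta + gamma * delta)  # true coefficient plus the bias term
```

The two printed numbers agree: the short regression does not recover beta because the bias term gamma * delta is non-zero whenever the omitted variable is related to both x and y.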

Course Prerequisites
This course is targeted at social and political scientists with a strong interest in applied empirical research and data analysis. Participants must be familiar with Stata and its command structure and be able to write their own do-files. R users are also welcome. The course is designed for students with basic training in statistics, including linear regression and hypothesis testing. To follow the lectures, students also need basic knowledge of matrix algebra, calculus, and probability theory.

Core Reading
– Angrist, J. D. and J.-S. Pischke (2009). Mostly Harmless Econometrics. An Empiricist’s Companion. Princeton: Princeton University Press. This book will be provided on arrival to the Summer School as part of the course material for this course.

Representative Background Reading
– Agresti, A. and B. Finlay (2009). Statistical Methods for the Social Sciences. 4th ed. Upper Saddle River: Pearson. Chapter 1-9.

Background knowledge required
OLS = e

Computer Background
Stata = m

e = elementary, m = moderate, s = strong

Course Structure:

Day 1: Cause and Effect
Day 2: Subclassification, Weighting and Matching
Day 3: Perspectives on OLS
Day 4: Robust Inference with OLS
Day 5: Causal Graphs
Day 6: Functional Form Specification
Day 7: Instrumental Variables
Day 8: DID and Panel Data
Day 9: Generalized Linear Models
Day 10: Review and Student Presentations

Essential Course Readings:

Day 1
– Angrist and Pischke (2009), Chapter 1-2.
– Holland (1986).
– Agresti and Finlay (2009), Chapter 5-6.

Day 2
– Cochran (1968).
– Rosenbaum (2009), Chapter 7, 8.1-8.3, 9.

Day 3
– Angrist and Pischke (2009), Chapter 3.
– Wooldridge (2010), Chapter 2.

Day 4
– Wooldridge (2009), Chapter 5, 8.
– Angrist and Pischke (2009), Chapter 8.

Day 5
– Morgan and Winship (2007), Chapter 3.
– Elwert and Winship (2014).

Day 6
– Keele (2008), Chapter 2.1-2.2, 3.1-3.3.
– Hastie et al. (2009), Chapter 3, 5.

Day 7
– Angrist and Pischke (2009), Chapter 4.
– Wooldridge (2010), Chapter 5.

Day 8
– Angrist and Pischke (2009), Chapter 5.
– Wooldridge (2010), Chapter 10.

Day 9
– King (1998), Chapter 4, 5.1-5.3.
– Wooldridge (2010), Chapter 13.1-13.6, 15.

Day 10
– Freedman (1991).


Agresti, Alan, and Barbara Finlay. 2009. Statistical Methods for the Social Sciences. 4th ed. Upper Saddle River: Pearson.

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics. An Empiricist’s Companion. Princeton: Princeton University Press.

Cochran, W. G. 1968. “The Effectiveness of Adjustment by Subclassification in Removing Bias in Observational Studies.” Biometrics 24 (2): 295–313.

Elwert, Felix, and Christopher Winship. 2014. “Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable.” Annual Review of Sociology 40: 31–53.

Freedman, David A. 1991. “Statistical Models and Shoe Leather.” Sociological Methodology 21: 291–313.

Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. 2009. The Elements of Statistical Learning. Data Mining, Inference, and Prediction. 2nd ed. Heidelberg: Springer.

Holland, Paul W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81 (396): 945–60.

Keele, Luke. 2008. Semiparametric Regression for the Social Sciences. Chichester: Wiley.

King, Gary. 1998. Unifying Political Methodology. The Likelihood Theory of Statistical Inference. Ann Arbor: The University of Michigan Press.

Morgan, Stephen L., and Christopher Winship. 2007. Counterfactuals and Causal Inference. Methods and Principles for Social Research. Cambridge: Cambridge University Press.

Rosenbaum, Paul R. 2009. Design of Observational Studies. 2nd ed. New York: Springer.

Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. 2nd ed. Cambridge: MIT Press.