Please note: This course will be taught in hybrid mode. Hybrid delivery of courses will include synchronous live sessions during which on-campus and online students will be taught simultaneously.

Annalivia Polselli is a British Academy Postdoctoral Fellow at the Institute for Social and Economic Research (ISER), University of Essex. From 2020 to 2024, she was a Postdoctoral Fellow at the Institute for Analytics and Data Science (IADS) at the University of Essex. She received her Ph.D. in Economics from the University of Essex in 2022. Her research interests include econometric methods for panel data models, causal machine learning, and applied economics. Her current work focuses on advancing double machine learning techniques for panel data models.

Damian is a Postdoctoral Scholar at the Causality in Healthcare AI (CHAI) Hub and a Research Associate in Causal AI at the University of Edinburgh, where his work centres on improving healthcare AI through causal machine learning. He obtained his PhD in Computer Science at the University of Essex and is a former Software Developer. His main research interests lie at the interface of causality and machine learning, with a particular focus on methods for treatment effect estimation and causal graph learning from observational data, as well as robustness to data shifts, hyperparameters, and performance evaluation.


Course Description

This course offers a comprehensive discussion of established and more recent machine learning techniques for prediction and for causal effect estimation (ATE, ATT, CATE, LATE, HTE) with observational data. The course covers the use of well-known base learners (e.g., lasso, decision and boosted trees, random and causal forests, neural networks) for effective causal estimation through meta-learners and double/debiased estimation procedures. Note that the course emphasises the use of ML base learners in practice rather than explaining their internal design in detail. Best practices in the field are followed throughout, so the content also covers how to evaluate fitted models and select among different modelling options.

The course combines theory from lectures with hands-on practical sessions in the statistical software R. Practical sessions use well-established data sets or ad hoc simulated data to apply the methods presented in the lectures.
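To give a flavour of the practical sessions, here is a minimal sketch of a Day 2-style exercise: fitting a cross-validated lasso on simulated data and evaluating it out of sample. The data and code are illustrative only (not actual course material) and assume the glmnet package is installed.

```r
# Illustrative sketch only: simulated data, glmnet assumed installed.
library(glmnet)

set.seed(1)
n <- 500; p <- 10
X <- matrix(rnorm(n * p), n, p)
y <- X[, 1] - 2 * X[, 2] + rnorm(n)   # only two informative predictors

# Split into training and test sets -- a good practice covered on Day 2
train <- sample(n, 0.8 * n)

# Cross-validated lasso on the training data
fit <- cv.glmnet(X[train, ], y[train], alpha = 1)

# Out-of-sample evaluation with mean squared error
pred <- predict(fit, X[-train, ], s = "lambda.min")
mse  <- mean((y[-train] - pred)^2)
```

Exercises of this kind introduce the data-splitting, cross-validation, and evaluation workflow that the causal estimators later in the course rely on.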

The main goal of the course is to equip participants with the latest machine learning techniques so that they can conduct data visualisation and causal analysis independently. By the end of the course, participants will know the challenges that come with observational data and how to address them through good practice to obtain robust causal estimates.

Prior knowledge of causal estimation is not necessary. Some knowledge of machine learning is recommended but not essential, as the topic is revised at the beginning of the course. However, a thorough understanding of statistical modelling is imperative to fully appreciate the course content.

Course Prerequisites

  • Working knowledge of R (e.g., data management and visualisation)
  • Basics of statistical modelling (OLS, lasso)
  • Basics of probability and calculus

Course Objectives

By the end of the course the students will:

  1. Know the basic principles of causal inference and machine learning.
  2. Be aware of the advantages, as well as the challenges, of working with observational data.
  3. Understand the role of modelling in causal inference.
  4. Be comfortable with using various machine learning techniques to estimate causal effects.
  5. Know how to interpret obtained estimates through visualisation and evaluation metrics.
  6. Be familiar with the most powerful machine learning methods, including neural networks and generative models, and their use in effect estimation.
  7. Have in-depth knowledge of state-of-the-art causal estimators, such as double machine learning.
  8. Be confident in applying new skills in practical settings.

Core Reading

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning with Applications in R (Second edition). New York: Springer. ISBN: 978-1-0716-1417-4
  • Pearl, J., Glymour, M., & Jewell, N. P. (2016). Causal inference in statistics: A primer. John Wiley & Sons. ISBN: 978-1-119-18684-7

Background knowledge required

Maths:

  • Calculus – elementary
  • Linear regression – elementary

Statistics:

  • OLS – moderate

Computer background:

  • R – moderate


Course Outline

Day 1 (AP, DM)

    • Welcome (course structure, meet your instructors, assessment instructions) 
    • Correlation vs. causation 
    • Causal inference from observational data 
    • Potential outcomes 
      • Assumptions 
      • Target causal (treatment) parameters (ATE, ATT, LATE, CATE) 
    • Practical session 
      • Google Colab, Kaggle Notebooks, GitHub 
      • R for reading and visualising data 
      • Description of the data sets used in the course 

Day 2 (DM, AP)

      • Introduction to the main principles of machine learning (ML) 
      • Supervised ML (regression and classification) 
      • Good practices: data splitting, cross-validation, model evaluation 
      • Practical in R (learning and prediction, evaluation and metrics, CV) 

Day 3 (DM, AP) 

      • Traditional causal estimators (ATE) 
      • Propensity scores (inverse probability weighting and doubly robust methods) 
      • Using lasso and simple trees 
      • Metrics (ATE bias) 
      • Practical in R (simple implementations of IPW/DR, learning with lasso/trees, performance evaluation) 
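As an illustration of the Day 3 material, the inverse probability weighting (IPW) estimator can be hand-rolled in a few lines of base R. The simulated data and variable names below are hypothetical, not taken from the course materials.

```r
# Illustrative toy: IPW estimate of the ATE with a single confounder x.
set.seed(42)
n <- 2000
x <- rnorm(n)
# Treatment assignment depends on the confounder x
d <- rbinom(n, 1, plogis(0.5 * x))
# Outcome: true ATE is 1, but x confounds the naive comparison
y <- 1 * d + x + rnorm(n)

# Step 1: estimate propensity scores with a logistic regression
ps <- predict(glm(d ~ x, family = binomial), type = "response")

# Step 2: reweight treated and control outcomes by inverse propensities
ate_ipw <- mean(d * y / ps) - mean((1 - d) * y / (1 - ps))

# Naive difference in means, biased upwards here by confounding
ate_naive <- mean(y[d == 1]) - mean(y[d == 0])
```

Comparing `ate_ipw` with `ate_naive` shows how reweighting removes the confounding bias, which is the core idea behind the IPW and doubly robust practicals.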

Day 4 (DM, AP) 

      • ML for individualised estimates (CATE) 
      • ATE vs. CATE vs. ITE 
      • Meta-learners used with more complex base learners (e.g. random forest, boosted trees) 
      • Inspecting predicted CATEs: visualisation and confidence intervals 
      • Metrics (PEHE) 
      • Practical in R (simple implementations and external packages, learning and prediction, performance evaluation) 

Day 5 (DM, AP) 

      • Hyperparameter optimisation (importance and pitfalls) 
      • Goodness-of-fit vs. CATE accuracy 
      • More advanced metrics (R-loss, plugins) 
      • Hyperparameter tuning vs. model selection vs. ensembles 
      • Practical in R (previous exercises revisited but now with tuning, using advanced metrics) 

Day 6 (DM, AP) 

      • Neural networks 
        • Simple architectures as base learners (MLP) 
        • Advanced standalone estimators 
      • Generative modelling 
        • Auto-encoding and adversarial networks 
        • Tree-based approaches 
        • Data imputation and augmentation 
      • Practical in R (using implementations in TensorFlow/Keras, running models on GPUs) 

Day 7 (AP, DM) 

      • Beyond OLS: introduction to causal machine learning
      • Double selection procedures (with and without endogeneity)
      • Practical in R with examples

Day 8 (AP, DM) 

      • Basics in Double Machine Learning (DML):
        • Neyman-orthogonality 
        • Sample-splitting and cross-fitting 
        • Score functions 
      • Overview of structural causal models (PLR, PLIV, IRM, IIVM) 
      • Practical in R with examples 
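To sketch the sample-splitting and cross-fitting ideas from Day 8, the partially linear model (PLR) can be hand-rolled with lasso (glmnet) nuisance fits. This is an illustrative toy under simulated data, not the course's own implementation; packages such as DoubleML provide production-grade versions.

```r
# Illustrative toy: DML for the partially linear model with 2-fold cross-fitting.
library(glmnet)

set.seed(7)
n <- 1000; p <- 20
X <- matrix(rnorm(n * p), n, p)
d <- X[, 1] + rnorm(n)             # treatment depends on the controls
y <- 0.5 * d + X[, 1] + rnorm(n)   # true effect theta = 0.5

K <- 2
folds <- sample(rep(1:K, length.out = n))
res_y <- res_d <- numeric(n)
for (k in 1:K) {
  tr <- folds != k; te <- folds == k
  # Cross-fitting: learn nuisances on one fold, residualise on the other
  res_y[te] <- y[te] - as.numeric(predict(cv.glmnet(X[tr, ], y[tr]),
                                          X[te, ], s = "lambda.min"))
  res_d[te] <- d[te] - as.numeric(predict(cv.glmnet(X[tr, ], d[tr]),
                                          X[te, ], s = "lambda.min"))
}
# Neyman-orthogonal score: regress outcome residuals on treatment residuals
theta_hat <- sum(res_d * res_y) / sum(res_d^2)
```

Residualising both the outcome and the treatment before the final regression is what makes the estimate insensitive to small errors in the nuisance fits, the Neyman-orthogonality property covered in the lectures.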

Day 9 (AP, DM) 

      • DML for panel data (PLR, DID)
      • Practical in R with examples 

Day 10 (AP, DM) 

      • Generalized Random Forests for heterogeneous treatment effects (HTE)
      • Practical in R with examples