Reto Wüest is a postdoctoral researcher in the Department of Comparative Politics at the University of Bergen. He holds a PhD from the University of Geneva. Reto’s methodological research focuses on measurement and on using machine learning techniques to improve the prediction accuracy of multilevel regression and post-stratification models. His substantive research focuses on political representation and legislative politics in Europe and the US. Reto has taught machine learning courses at the University of Geneva, the Barcelona Summer School in Survey Methodology, and the University of Bergen.

Course content
Machine learning refers to the automated detection of meaningful patterns in data. Machine learning techniques are behind many of the technologies we use in our daily lives, and they have also become an important part of the social scientist’s toolkit, especially as more and more social science data become available in electronic form. A common feature of all applications of machine learning is the use of computer algorithms that can “learn” and adapt. The goal of this course is to introduce participants to a variety of widely used supervised and unsupervised machine learning methods. After discussing the fundamental concepts of machine learning, the course continues with supervised learning methods (e.g., regularization, tree boosting, and support vector machines), then turns to unsupervised learning methods (e.g., principal components analysis and clustering methods), and lastly covers flexible model averaging techniques (e.g., Bayesian model averaging).

Course objectives
Participants will gain a solid understanding of a number of widely used and powerful machine learning methods. By the end of the course, they should be able to apply all the methods covered in class using the R software environment. They should also be able to explain the logic underlying the different methods, to understand the respective strengths and weaknesses of these methods, and to interpret their results.

Course prerequisites
Participants should have a solid understanding of probability theory and regression analysis. They should also be familiar with basic programming in R.

Background knowledge/skills required:

OLS: Moderate

Maximum Likelihood: Elementary

R: Moderate

Course outline

Week 1

Day 1 – Statistical Learning
Goals and challenges of machine learning; supervised and unsupervised learning; regression and classification problems; bias-variance trade-off and the problem of overfitting.

Literature:

James et al. 2013. Chapter 2.

Day 2 – Resampling Methods
Model assessment and selection; cross-validation (k-fold cross-validation, leave-one-out cross-validation); bootstrapping.

Literature:

James et al. 2013. Chapter 5.

Day 3 – Linear Methods for Regression
Linear regression and least squares; subset selection (best subset selection, stepwise selection); shrinkage methods (ridge regression, lasso, elastic net).

Literature:

James et al. 2013. Chapter 3 and Chapter 6 (6.1-6.2, 6.5-6.6).

Day 4 – Linear Methods for Classification
Logistic regression; linear discriminant analysis; k-nearest neighbors.

Literature:

James et al. 2013. Chapter 4.

Day 5 – Moving Beyond Linearity
Polynomial regression; piecewise-constant regression; regression and smoothing splines; local regression; generalized additive models.

Literature:

James et al. 2013. Chapter 7.

Week 2

Day 6 – Tree-Based Methods
Single decision trees; tree pruning; bagging; random forests; tree boosting.

Literature:

James et al. 2013. Chapter 8.

Day 7 – Support Vector Machines
Maximal margin classifier; support vector classifier; support vector machine.

Literature:

James et al. 2013. Chapter 9.

Day 8 – Principal Components Analysis
Principal components analysis; interpretation of principal components; principal components regression.

Literature:

James et al. 2013. Chapter 10 (10.1-10.2, 10.4) and Chapter 6 (6.3, 6.7).

Day 9 – Clustering Methods
K-means clustering; k-medoids clustering; hierarchical clustering.

Literature:

James et al. 2013. Chapter 10 (10.3, 10.5).

Day 10 – Model Averaging
Committees; stacking; Bayesian model averaging.

Literature:

Montgomery, Jacob M., Florian M. Hollenbach and Michael D. Ward. 2012. “Improving Predictions using Ensemble Bayesian Model Averaging.” Political Analysis 20(3): 271-291.