Please note: This course will be taught in hybrid mode. Hybrid delivery of courses will include synchronous live sessions during which on-campus and online students will be taught simultaneously.


Rob Johns is a Professor in Politics in the Department of Government at the University of Essex. Rob’s research is in the fields of public opinion, political psychology and questionnaire design. He has run a number of major survey projects and published numerous books and articles based on analyses of public opinion data. Current research explores the connections between mental health and political attitudes, the drivers of support for Scottish independence, and the value that citizens attach to truth in politics.


Course Content
This course provides an introduction to statistics for social science data analysis. We begin with key concepts (means, deviations, distributions, confidence intervals, and so on) and then move on to the core statistical methods, covering crosstabulation, t-tests, analysis of variance, correlation, and various forms of regression. There is a particular emphasis on analysis of survey data, including survey experiments. Nonetheless, this remains primarily a course in statistical analysis rather than in survey methodology (such as sampling or questionnaire design). The methods covered will be demonstrated using the computer package SPSS and a variety of example datasets, but students are welcome to use other software and may bring their own datasets on which to practise the different methods.

Course Objectives
Participants will become adept in using a wide range of statistical methods for analysing survey data. These methods are used by both academic and professional researchers in many fields: political science, sociology, psychology, health sciences, sports science, marketing, and so on. Participants will also acquire a good working knowledge of SPSS, the most commonly used package among survey researchers, although support will be given for those wishing to use other software. In addition to boosting participants’ current skills, this course serves as a springboard for the study of more advanced statistical methods, many of which are available in later sessions at the Summer School.

Course Prerequisites
This is an INTRODUCTORY course.  Participants are not required or assumed to have anything more than basic mathematics. There will also be a full introduction to SPSS, the computer program that we will use.

Representative Background Reading
Since this is an introductory course, participants are not required to do any prior reading.  However, those a bit nervous about confronting statistics may benefit from a quick look at the gentle introduction provided by:

Salkind, N. J. and Frey, B. B., 2019. Statistics for People Who (Think They) Hate Statistics (7th edn.), Thousand Oaks, CA: Sage.

Day 1: Variables: The basics

Lecture: research questions; hypotheses; independent and dependent variables; correlation and causation; prior and intervening variables; interactions; levels of measurement

Lab: what SPSS looks like; how to navigate around it; simple tables 

Day 2: Descriptive statistics and distributions

Lecture: measures of central tendency; measures of dispersion (standard deviation, variance); frequency distributions; normal, skewed and bimodal distributions; why distributions matter; the sampling distribution

Lab: obtaining measures of central tendency and dispersion; graphing distributions; entering survey data

Day 3: Probability, hypotheses and significance testing

Lecture: probability and the area under the normal curve; standardising scores; comparing normally-distributed variables; the logic of inferential statistics; standard errors; confidence intervals; hypothesis-testing; the concept of statistical significance

Lab: calculating standardised scores, standard errors and confidence intervals; testing distributions for normality
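
For participants who plan to work outside SPSS, the quantities named in this session can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are invented, the variable names are hypothetical, and the interval uses the normal critical value 1.96 rather than a t value.

```python
import math
import statistics

# Hypothetical survey responses on a 0-10 scale (invented for illustration).
scores = [4, 7, 5, 6, 8, 3, 6, 7, 5, 9]

n = len(scores)
mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)

# Standardised (z) scores: each value's distance from the mean in SD units.
z_scores = [(x - mean) / sd for x in scores]

# Standard error of the mean: the estimated SD of the sampling distribution.
se = sd / math.sqrt(n)

# Approximate 95% confidence interval using the normal critical value 1.96.
# With a sample this small, a t critical value would give a wider interval.
ci_lower = mean - 1.96 * se
ci_upper = mean + 1.96 * se

print(f"mean = {mean:.2f}, sd = {sd:.2f}, se = {se:.2f}")
print(f"approximate 95% CI: [{ci_lower:.2f}, {ci_upper:.2f}]")
```

By construction, the standardised scores have a mean of zero and a standard deviation of one, which is what makes differently scaled variables comparable.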

Day 4: Testing differences between means I: t-tests

Lecture: comparing means across two groups; t-tests for independent samples; t-tests for dependent samples; effect sizes

Lab: t-tests in SPSS; recoding variables 

Day 5: Testing differences between means II: ANOVA

Lecture: comparing means across multiple groups; one-way ANOVA; post-hoc tests; effect sizes; multi-way ANOVA; interpreting and illustrating interactions

Lab: ANOVA in SPSS; illustrating interactions

Day 6: Correlation and bivariate regression

Lecture: the idea of correlation; the Pearson correlation; correlation and prediction; scatterplots and lines of best fit; regression equations; residuals; accuracy of prediction and R²; standardising regression coefficients

Lab: correlations and regression in SPSS; scatterplots and lines of best fit; dealing with non-linear relationships
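
As a rough illustration of how the Pearson correlation, the line of best fit, residuals and R² relate to one another, the following Python sketch computes them by hand. The five data points and all variable names are invented for the example; SPSS reports the same quantities directly.

```python
import math

# Hypothetical paired observations (invented for illustration).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

# Pearson correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Least-squares line of best fit: y = intercept + slope * x.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residuals are the gaps between observed and predicted values.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# R-squared: the share of variance in y accounted for by the line.
r_squared = 1 - sum(e ** 2 for e in residuals) / s_yy

print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"R-squared = {r_squared:.3f}")
```

In the bivariate case R² is simply the square of the Pearson correlation, which the sketch makes easy to check.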

Day 7: Multiple regression

Lecture: why multivariate regression; collinearity and multicollinearity; partial coefficients; categorical and dummy variables in regression; multiple regression equations; comparing coefficients; model specification, parsimony and R²; re-specifying models

Lab: multiple regression in SPSS; recoding categorical variables; collinearity diagnostics

Day 8: Crosstabulation and measures of association

Lecture: reading crosstabs; the chi-squared statistic; measures of association; introducing layer variables; reading interactions from crosstabs

Lab: crosstabs in SPSS; cell options; chi-squared tests; obtaining measures of association
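
To show what lies behind the chi-squared statistic and one common measure of association, here is a minimal Python sketch for a small crosstab. The counts are invented, and Cramér's V is used only as an example of a measure of association; it is an assumption of this sketch rather than a statement of which measures the session covers.

```python
import math

# Hypothetical 2x2 crosstab of counts (invented for illustration):
# rows are two groups, columns are "agree" and "disagree".
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-squared sums (observed - expected)^2 / expected over all cells, where
# the expected count assumes no association between rows and columns.
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_squared += (count - expected) ** 2 / expected

# Cramér's V rescales chi-squared to run from 0 (none) to 1 (perfect).
k = min(len(observed), len(observed[0]))
cramers_v = math.sqrt(chi_squared / (n * (k - 1)))

print(f"chi-squared = {chi_squared:.2f}, Cramér's V = {cramers_v:.2f}")
```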

Day 9: Logistic regression

Lecture: why avoid linear regression with dummy dependent variables; from predicted values to predicted probabilities; odds and log odds; interpreting logistic coefficients; odds ratios; model fit, pseudo-R² and improved prediction; extensions to binary logit

Lab: logistic regression in SPSS; interpreting the results; post-estimation options 
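
The links between probabilities, odds, log odds and odds ratios can be sketched in Python as follows. The intercept and coefficient are hypothetical values chosen for illustration, not estimates from any dataset, and the function names are likewise assumptions of this example.

```python
import math

def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)

def log_odds(p):
    """Convert a probability into log odds (the logit)."""
    return math.log(odds(p))

def inv_logit(z):
    """Convert log odds back into a probability."""
    return 1 / (1 + math.exp(-z))

# Hypothetical logistic regression estimates (invented for illustration).
intercept = -1.0
coefficient = 0.5

# Predicted probabilities at a few values of the independent variable.
for x in (0, 1, 2):
    p = inv_logit(intercept + coefficient * x)
    print(f"x = {x}: probability = {p:.3f}, odds = {odds(p):.3f}")

# Exponentiating a logistic coefficient gives the odds ratio: the factor by
# which the odds are multiplied for a one-unit increase in x.
odds_ratio = math.exp(coefficient)
print(f"odds ratio = {odds_ratio:.3f}")
```

The odds change by the same multiplicative factor for every one-unit step in x, whereas the change in predicted probability depends on the starting point.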

Day 10: Data reduction, scaling and reliability

Lecture: why data reduction; exploratory versus confirmatory approaches; principal components analysis and factor analysis; extraction and rotation; interpreting factors; types of scaling; developing and evaluating scales; measures of reliability

Lab: PCA and factor analysis in SPSS; illustrating dimensionality; saving factor scores; obtaining measures of reliability