Jeremy Miles is a Senior Quantitative Analyst at Google. He is author (with Mark Shevlin) of Applying Regression and Correlation (Sage, 2001); author of Research Methods and Statistics (Crucial Press, 2001); author (with Phil Banyard) of Understanding and Using Statistics in Psychology (Sage, 2007); co-author (with Andy Field) of Discovering Statistics Using SAS (Sage, 2010); co-editor (with Paul Gilbert) of A Handbook of Research Methods in Clinical and Health Psychology (Oxford, 2005); and co-editor (with Brian Stucky) of Quantitative Research in Psychology [don’t even think about buying this book]. He has served as chair of the British Psychological Society Mathematics, Statistics and Computing Section, and is associate editor of the British Journal of Mathematical and Statistical Psychology, Frontiers in Measurement and Quantitative Psychology, and PLOS ONE. He also serves as statistical editor of the British Journal of Clinical Psychology and the British Journal of Health Psychology.
The course will cover the theory and practice of regression analysis in its various forms. Regression models (broadly defined) are models which attempt to use predictors to explain a single outcome variable. This outcome variable may be continuous, ordinal, categorical or discrete counts and the predictors may be interval or categorical. The predictors may be linear, non-linear, or interactive.
Although the focus of the course is applying regression, we will start by looking at the meaning of models in statistics. We will consider the mean, correlation and regression as models, and regression to the mean. We look at describing models, and at statistical significance and confidence intervals (although we expect you to have prior knowledge of these areas, we will refresh them). In the third part we move on to develop more complex models (e.g. hierarchical regression, categorical independent variables), and consider the implications of the assumptions made in regression analysis (including the effect of their violation). We then look at extending regression in different ways: logistic regression, path analysis, interactions and Poisson regression. Throughout the module we will cover examples in Stata, and occasionally use other programs, e.g. G*Power for power analysis.
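To give a flavour of what "the mean as a model" can look like, here is a minimal sketch in Python. It is an illustration only: the course examples themselves are in Stata, the data values are invented, and the function names are hypothetical. It compares the squared error left over when every outcome is predicted by the mean with the squared error left over by a one-predictor least-squares regression.

```python
# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # predictor
y = [2.0, 4.0, 5.0, 4.0, 5.0]   # outcome


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sum_squared_error(observed, predicted):
    """Total squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def fit_simple_regression(x, y):
    """Least-squares intercept and slope for one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Model 1: predict every outcome with the mean of y.
y_bar = mean(y)
sse_mean = sum_squared_error(y, [y_bar] * len(y))

# Model 2: predict each outcome from x with a straight line.
intercept, slope = fit_simple_regression(x, y)
sse_regression = sum_squared_error(y, [intercept + slope * xi for xi in x])

# Proportion of the mean model's error removed by adding the predictor.
r_squared = 1 - sse_regression / sse_mean

print(f"mean model:       prediction = {y_bar:.2f}, SSE = {sse_mean:.2f}")
print(f"regression model: y = {intercept:.2f} + {slope:.2f}x, SSE = {sse_regression:.2f}")
print(f"R-squared = {r_squared:.2f}")
```

Seen this way, the mean is a regression model with no predictors, and R-squared describes how much better the regression does than that baseline.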
The course will enable participants to carry out a range of regression analyses. It is appropriate for participants who have covered some statistics, and wish to extend their knowledge to modelling more complex social science phenomena. The module provides appropriate background for people who want to go on to modules such as multilevel modelling, probit and logit analysis, or structural equation modelling.
The course covers the basics, but does this quickly, so consider this a refresher. If you are not familiar with basic descriptive and inferential statistics, expect to work hard during this phase. Similarly, while we begin with simple correlation and regression, we will be thinking about these in some (possibly) new ways. We will use Stata – if you’re not familiar with Stata this is not a problem; it’s very straightforward and students pick it up within a few minutes. We shall use Excel a little at the start as well. (If students prefer, they can use R instead.)
We will expect that you have some knowledge of descriptive statistics, statistical significance, correlation, sampling and estimation, and will only cover these things briefly. Any introductory statistics book from your field will cover these issues. One example would be:
Miles, J and Banyard, P (2007). Understanding and using statistics in psychology. London: Sage.
But there are many others that you may be familiar with, which are just as good, or even better.
If you’re not familiar with Stata, a little practice would not hurt, but is not necessary, and the same goes for Excel. (Please feel free to contact me if you would like guidance on what you need to know – Jeremy.firstname.lastname@example.org.)
Representative Background Reading
Cohen, J., P. Cohen, et al. 2003. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. (3rd ed.). Erlbaum. (Very long, very thorough, best if your background is psychology).
Miles, J.N.V., and Shevlin, M. 2001. Applying Regression and Correlation. Sage. (The course closely follows this book – making it worth buying to cover the course, but making it not worth buying, because the material is similar to the handouts – in addition, a second edition will be appearing soon, so perhaps you should save your money).
Pedhazur, E. J. 1997. Multiple Regression in Behavioral Research. Harcourt Brace. (Not everyone likes the style of this book, so have a look before you buy it).
Studenmund, A. H. 2010. Using Econometrics: A Practical Guide. Addison Wesley. (This book focuses on econometrics, which has a slightly different emphasis from the one we will take; it’s also ridiculously expensive – older editions are fine if you can get a second-hand one).
A couple of more gentle starters are:
Allison, P. 1999. Multiple Regression: A Primer. Pine Forge Press.
Garson, D. 2012. Multiple Regression. Statistical Associates Publishers. (This book is only available on Kindle, but it’s $5).
Different books have different emphases, and we shall be talking about some of these issues in the classes.
Background knowledge required
OLS = e
Stata = e
i = irrelevant, e = elementary, m = moderate, s = strong