Paul Lambert is a Professor of Sociology at the University of Stirling, UK, where he teaches courses on research methods and on social stratification. His research and publications cover methodological topics in social survey data analysis and data management (with a particular interest in handling data on occupations and on ethnicity), and substantive studies into processes of social stratification and social inequality.

Course Content: Social science data often features the ‘clustering’ or ‘hierarchical nesting’ of individual cases within larger units of analysis – for example in household surveys, where several individual responses may be clustered within the same household. Multilevel models are statistical models which provide analytical tools for dealing with data of this nature. They provide a convenient means to undertake regression analysis which takes account of, and can help to summarise, patterns of clustering. As such, multilevel models are an important tool in social statistics, and are of potential relevance to almost any study using complex social data.

This course provides an applied introduction to multilevel modelling for social science datasets. It will introduce the statistical features of multilevel models, deal with approaches to handling data which has clustered or hierarchical elements, and provide training in specifying multilevel models for linear and categorical outcome variables in a variety of survey data scenarios. When social scientists refer to multilevel models, they normally mean a specification that is also known as the ‘random effects’ model. This course concentrates upon random effects models, but also features many points of comparison with other modelling devices that can be used for clustered or nested data. The course also emphasises the practical application of multilevel models, and seeks to convey both the attractions and limitations of a multilevel modelling approach.
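
As a point of reference, the ‘random effects’ specification mentioned above is commonly written, in its simplest two-level random intercepts form, as shown below. The notation (y, x, the beta coefficients, u, e, and the indices i and j) is a conventional choice assumed here for illustration; it is not taken from the course materials.

```latex
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}, \qquad
u_j \sim N(0, \sigma_u^2), \quad e_{ij} \sim N(0, \sigma_e^2)
```

Here i indexes individual cases and j indexes the clusters they belong to (for example, individuals within households). The term u_j is a cluster-level random effect shared by all cases in cluster j, and e_{ij} is the individual-level residual. A conventional single-level regression corresponds to the special case in which the cluster-level variance is zero.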

Course Objectives: The course seeks to provide participants with a solid grounding in the application of multilevel models. This involves combining a strong understanding of how multilevel models are formulated in statistical terms (and their relationship to other types of statistical model), with a fluency in handling data with clustered and hierarchical features and an ability to specify multilevel models in popular statistical analysis packages.

The course will feature lab sessions with command files which illustrate handling data and specifying multilevel models in several software packages. Most often, lab examples use the Stata package, since that software features a wide range of options both for handling complex data, and for specifying multilevel models. Selected examples are also given in other packages, including SPSS, the freeware R, and MLwiN (a specialist software, designed explicitly for estimating multilevel models). Worked examples will be available in these packages using several different, often large scale, social survey datasets: this is an ambitious objective which seeks to provide participants with important operational skills which are not widely taught.

There are a number of benefits to studying the practical application of multilevel models. Firstly, multilevel models are important devices for exploring the character of clustered or hierarchical structures within a dataset (for example, to compare the scale of pupil-level and class-level influences in an educational study which features pupils clustered within classes). Secondly, they are often used simply to control for hierarchical structural features within data (that is, when a pattern of clustering is not substantively important, but does need to be controlled for). Finally, a thorough introduction and review of the practical implementation of multilevel models also serves as an effective means of refreshing understanding of the implementation and interpretation of statistical models in the social sciences more generally.
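
The first benefit above, comparing the scale of pupil-level and class-level influences, can be illustrated with a small simulation. The sketch below is a minimal example under stated assumptions: the data are simulated rather than drawn from any course dataset, the function names and parameter values are hypothetical, and the variance components are estimated with a simple balanced one-way ANOVA estimator rather than the maximum likelihood methods that multilevel software such as Stata or MLwiN would normally use.

```python
import random
from statistics import mean


def simulate_classes(n_classes=200, pupils_per_class=25,
                     sd_class=1.0, sd_pupil=2.0, seed=42):
    """Simulate pupil scores clustered within classes (hypothetical data).

    Each score = overall mean + class effect + pupil residual.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_classes):
        class_effect = rng.gauss(0.0, sd_class)  # shared by all pupils in the class
        scores = [50.0 + class_effect + rng.gauss(0.0, sd_pupil)
                  for _ in range(pupils_per_class)]
        data.append(scores)
    return data


def variance_components(groups):
    """Estimate between-class and within-class variance.

    Uses the balanced one-way ANOVA estimator, so every group must
    have the same size.
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal-sized groups")
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means)
                    for x in g) / (n_groups * (n - 1))
    var_between = max(0.0, (ms_between - ms_within) / n)  # truncate at zero
    return var_between, ms_within


def intraclass_correlation(groups):
    """Share of total variance that lies at the class level."""
    var_between, var_within = variance_components(groups)
    return var_between / (var_between + var_within)


if __name__ == "__main__":
    classes = simulate_classes()
    between, within = variance_components(classes)
    print(f"class-level variance: {between:.2f}")
    print(f"pupil-level variance: {within:.2f}")
    print(f"intraclass correlation: {intraclass_correlation(classes):.2f}")
```

With the assumed standard deviations (1.0 for classes, 2.0 for pupils), the class level accounts for 1 / (1 + 4) = 0.2 of the total variance by construction, so the estimated intraclass correlation should fall close to that value. A multilevel model provides the same kind of variance partition while also allowing for explanatory variables and unbalanced cluster sizes.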

Course Prerequisites: This is an introductory course, but participants will benefit from having moderate levels of previous statistical training and previous experience in using statistical software (see descriptions below). Prior to attending the course, all participants are encouraged to read a research article that uses a multilevel model, and an introductory article or chapter on the methodology (suggestions given below).

The course is suitable for participants who have received statistical training at least to the level of understanding the application of conventional regression modelling approaches (e.g. multiple regression and logistic regression), and who are fluent in popular descriptive analytical techniques and the statistical tests behind them (e.g. chi-square tests; correlation values). Most participants are likely to benefit from preparatory study or revision of materials which cover generating and interpreting the outputs from conventional regression analyses, such as on coefficient effects and indicators of model fit (e.g. Allison 1999; Tarling 2009). The course will take conventional regression models as its starting point, and build onwards to multilevel models and other related extension topics in statistical modelling.

The course is best suited to participants with at least some previous experience in using statistical software packages for social science data analysis. The course features lab materials spanning several packages (Stata, SPSS, R and MLwiN, with Stata used most often) and using several different social science datasets. Previous exposure to the ‘syntax’ languages of these packages will be an advantage, since the practical materials involve programming in these languages. The course should be accessible to people who have little previous experience in this area, since background materials on the software packages will be made available, but students without some background in syntax programming should expect that extra effort will probably be required during the opening days of the course in order to follow the lab exercises. A note on software has been prepared for the course that discusses and illustrates how to use the packages involved (supplied as an appendix to the coursepack). Participants are asked to read through this note in the opening days of the course, and may need to spend additional study time improving their software skills at this point if relevant.

Software used (and suggested introductory online information):
 Stata (http://www.ats.ucla.edu/stat/stata/)
 SPSS (http://www.spss-tutorials.com/)
 MLwiN (http://www.bristol.ac.uk/cmm/software/mlwin/)
 R (http://www.ats.ucla.edu/stat/r/)

We stress that not every example is available in every package. Stata will be used much more than the other packages. Coverage of R is limited to selective illustrations.

Representative Background Reading

1) Background on modelling social science data:

Allison, P. D. (1999). Multiple Regression: A primer. London: Sage.
Long, J.S., & Freese, J. (2014). Regression Models for Categorical Dependent Variables Using Stata, 3rd Edition. College Station, Tx: Stata Press. (See chpts 2-4).
Menard, S. (2001). Applied Logistic Regression Analysis, Second Edition. Berkley, Ca: Sage.
Tarling, R. (2009). Statistical Modelling for Social Researchers: Principles and practice. London: Routledge.
Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey Bass.

2) Introductions to multilevel models:

Bickel, R. (2007). Multilevel Analysis for Applied Research: It’s Just Regression! New York: The Guilford Press.
Plewis, I. (1998). Multilevel Models. Social Research Update, 23, http://sru.soc.surrey.ac.uk/SRU23.html.
Robson, K., & Pevalin, D. (2016). Multilevel Modeling in Plain Language. London: Sage.
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel Analysis: An introduction to basic and advanced multilevel modelling, 2nd Edition. London: Sage.
Tarling, R. (2009). Statistical Modelling for Social Researchers: Principles and practice. London: Routledge (C.9).

3) Illustrative research articles which use multilevel modelling:

Andersen, R., Yang, M., & Heath, A. F. (2006). Class Politics and Political Context in Britain, 1964-1997: Have Voters Become More Individualised? European Sociological Review, 22(2), 215-228.
Jen, M. H., Jones, K., & Johnston, R. J. (2009). Compositional and contextual approaches to the study of health behaviour and outcomes: Using multi-level modelling to evaluate Wilkinson’s income inequality hypothesis. Health and Place, 15, 198-203.
Maas, I., & Zijdeman, R. L. (2010). Beyond the local marriage market: The influence of modernization on geographical heterogamy. Demographic Research, 23(33), 933-962.
Rasbash, J., Leckie, G., Pillinger, R., & Jenkins, J. (2010). Children’s educational progress: Partitioning family, school and area effects. Journal of the Royal Statistical Society Series A, 173, 657-682.
Verbakel, E. (2013). Leisure values of Europeans from 46 countries. European Sociological Review, 29(3), 669-682.

Required texts

The following text will be included in the coursepack and used throughout the course:

Hox, J. (2010). Multilevel Analysis: Techniques and Applications, Second Edition. London: Routledge.

For Stata users, we also recommend accessing or purchasing the following:

Rabe-Hesketh, S., & Skrondal, A. (2008/2012). Multilevel and Longitudinal Modeling Using Stata, Second Edition/Third Edition (2 volume set). College Station, Tx: Stata Press.

  • Lectures (L) / Computer practicals (P) / Readings (R)

    Day 1 Monday 10 July – Introduction to the idea of multilevel modelling (i)
    L1a: The idea of multilevel modelling
    L1b: Course arrangements and overview
    P1: Getting started with multilevel software
    R: Hox (2010: preface & c1)

    Day 2 Tuesday 11 July – Introduction to the idea of multilevel modelling (ii)
    L2a: From single level models to multilevel models
    L2b: Multilevel data structures and examples
    P2: Exploring and summarising multilevel data and key elements of statistical modelling
    R: Hox (2010: c1 & c2)

    Day 3 Wednesday 12 July – Multilevel applications with linear outcomes (i): Two-level random intercepts models
    L3a: The two-level random intercepts model
    L3b: Interpreting random intercepts and their residuals
    P3: Two-level random intercepts model examples
    R: Hox (2010: c2 & c3)

    Day 4 Thursday 13 July – Multilevel applications with linear outcomes (ii): Random intercepts and slopes
    L4a: The two-level random slopes model
    L4b: Interpreting random intercepts and slopes
    P4: Models for random intercepts and slopes
    R: Hox (2010: c2 & c3)

    Day 5 Friday 14 July – Multilevel applications for binary and other categorical outcomes
    L5a: Multilevel models for binary outcomes
    L5b: Multinomial, ordered and count outcomes
    P5: Categorical outcomes in multilevel models
    R: Hox (2010: c6, c7)

    Day 6 Monday 17 July – Multilevel techniques in context
    L6a: Research example (title tbc)
    L6b: Do we always need multilevel models?
    P6: Alternative statistical treatments for clustered data; testing and comparing hierarchical effects
    R: DiPrete and Forristal (1994)

    Day 7 Tuesday 18 July – Multilevel models with more than two levels
    L7a: Hierarchical effects at three and more levels
    L7b: Cross-classified and multiple membership designs
    P7: Data and models with complex clustering
    R: Hox (2010: c2.4; c9)

    Day 8 Wednesday 19 July – Special cases of multilevel modelling
    L8a: The relationship between multilevel models and structural equation models
    L8b: Special examples of multilevel models for longitudinal data analysis
    P8: Examples in analysing longitudinal data; Using SEMs as multilevel models
    R: Hox (2010: c5; c8; c15)

    Day 9 Thursday 20 July – Multilevel modelling applications
    L9a: Class plenary: Selected hierarchical data and models
    L9b: Option: Review/Questions/Selected recap topics
    P9: Practical applications – extension topics
    R: Hox (2010: c10-16); Browne et al. (2012)

    Day 10 Friday 21 July – Review session
    L10a: Next steps in advanced multilevel analysis
    L10b: The contribution of multilevel modelling
    P10: No new materials: Lab review/recap opportunity

    Stata will be used most frequently. Many, but not all, exercises will also be available in SPSS, R and/or MLwiN. Prior knowledge of these packages is not assumed, but previous exposure to syntax programming in at least one of them will be beneficial, since the software examples in the course use ‘syntax’ modes of operation.

    Recommended References

    • The text by Hox (2010) will be supplied with the coursepack. The text by Rabe-Hesketh and Skrondal (2008/2012) is also recommended for purchase (for Stata users) if possible. There are several other texts with a wide range of materials on multilevel modelling that could be useful before and during the course. Further reading recommendations will also be made during the course.

    Bickel, R. (2007). Multilevel Analysis for Applied Research: It’s Just Regression! New York: The Guilford Press.

    Heck, R. H., Thomas, S. L., & Tabata, L. N. (2013). Multilevel and Longitudinal Modeling with IBM SPSS, Second Edition. London: Routledge.

    Hox, J. (2010). Multilevel Analysis, 2nd Edition. London: Routledge. [ISBN: 9781848728462]

    Luke, D. A. (2004). Multilevel Modeling. London: Sage.

    Rabe-Hesketh, S., & Skrondal, A. (2008/2012). Multilevel and Longitudinal Modeling Using Stata, Second Edition/Third Edition. College Station, Tx: Stata Press [ISBN: 9781597180405, single-volume 2nd ed; 9781597181082, 2-volume set, 3rd ed]

    Robson, K., & Pevalin, D. (2016). Multilevel Modeling in Plain Language. London: Sage.

    Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel Analysis: An introduction to basic and advanced multilevel modelling, 2nd Edition. London: Sage.

    • The texts listed below are good options for preparatory reading. They are either short introductions to multilevel modelling, or useful statements on statistical modelling in general. It is desirable but not essential to read one or more of these before the course.

    Allison, P. D. (1999). Multiple Regression: A primer. London: Sage.

    DiPrete, T. A., & Forristal, J. D. (1994). Multilevel Models – Methods and Substance. Annual Review of Sociology, 20, 331-357.

    Long, J.S., & Freese, J. (2014). Regression Models for Categorical Dependent Variables Using Stata, 3rd Edition. College Station, Tx: Stata Press. (Chpts 2-4).

    Menard, S. (2001). Applied Logistic Regression Analysis, Second Edition. Berkley, Ca: Sage.

    Plewis, I. (1994). Longitudinal Multilevel Models. In A. Dale & R. B. Davies (Eds.), Analysing Social and Political Change: A casebook of methods. London: Sage.

    Plewis, I. (1998). Multilevel Models. Social Research Update, 23, http://sru.soc.surrey.ac.uk/SRU23.html.

    Tarling, R. (2009). Statistical Modelling for Social Researchers: Principles and practice. London: Routledge.

    Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey Bass.

    • There are several online depositories of materials covering multilevel modelling, in particular the LEMMA course at: http://www.bristol.ac.uk/cmm/learning/online-course/index.html