Please note: This course will be taught in hybrid mode. Hybrid delivery includes synchronous live sessions in which on-campus and online students are taught simultaneously.
Paul Lambert is a Professor of Sociology at the University of Stirling, UK, where he teaches courses on research methods and on social stratification. His research covers methodological topics in social survey data analysis and data management (with a particular interest in handling data on occupations and on ethnicity), and substantive studies into processes of social stratification and social inequality. His recent publications include a research monograph – Social Inequalities and Occupational Stratification – that analyses data on social interaction patterns and social inequalities, and an introductory textbook – What is… Quantitative Longitudinal Data Analysis – that focusses upon the secondary analysis of longitudinal survey datasets.
Key objectives
The course seeks to provide participants with a fluent understanding of selected issues in applied social statistics, particularly related to using statistical models productively. It also seeks to ensure participants develop the confidence to implement these techniques using Stata.
Participants should learn:
- how relevant statistical models are formulated and interpreted
- the relative attractions and limitations of different model strategies
- practical skills in handling and analysing complex social data using Stata
Summary
Become fluent in using important tools of applied social statistics including: Multilevel models – Categorical outcomes models – Marginal effects – Panel models for longitudinal data – Selection models
This course is for participants who have some previous training in social statistics and are keen to widen their knowledge and expertise. Across the programme we explore selected statistical analytical topics and debates, using lectures and lab exercises, with emphasis on applications using real-world microdata (e.g. from large-scale social surveys or survey-like sources).
All of our topics are in some way based on regression modelling approaches, but typically with adaptations or extensions tailored to different specialist requirements. The topics that we cover are all supported by fairly well-established software procedures, yet equally they are not routinely deployed by non-specialists.
A full list of the module’s topics:
| Analytical techniques | Debates on procedures and outputs |
| --- | --- |
| Multilevel models | Marginal effects |
| Categorical outcomes models | Sampling design and weighting adjustments |
| Panel models for longitudinal data | Missing data |
| Comparing fixed and random effects models | Estimating and representing uncertainty |
| Selection models | Causal interpretations of model results |
| Measurement models | Learning from simulated data |
| Workflow and documentation considerations | |
Course Prerequisites:
The course can be thought of as an accessible introduction to selected advanced issues. Concepts, basic algebraic formulae, software training, and extension issues and debates will all be introduced in ways that focus on the social science contribution of the method.
It is expected that participants will have had some previous training in social statistics, for example on popular descriptive analytical techniques (e.g. chi-square tests; correlation values) and well-known types of regression models (e.g. multiple regression, logistic regression). Teaching sessions include some recap content, but concentrate on introducing the specialist topics listed above.
The course is best suited to participants with some previous experience in using Stata code or ‘syntax’. Course materials include some introductory resources, but students without any background in using Stata syntax should expect to put in extra effort near the start of the course in order to make good use of the lab exercises.
Background reading
Specific background study prior to attending the module is not required.
- Long, J. S., & Freese, J. (2014). Regression Models for Categorical Dependent Variables Using Stata, Third Edition. College Station, Tx: Stata Press [ISBN: 9781597181112] (will be provided by ESS)
- Rabe-Hesketh, S., & Skrondal, A. (2022). Multilevel and Longitudinal Modeling Using Stata (Volume 1), Fourth Edition. College Station, Tx: Stata Press [ISBN: 9781597181365] (will be provided by ESS)
Before the course begins, participants might benefit from revising any text on basic statistical methods in the social sciences that has coverage of descriptive statistics and multiple regression models. Many alternative sources could be used for this purpose, but as examples we recommend:
- Kohler, U., & Kreuter, F. (2012). Data Analysis Using Stata, 3rd edition. College Station, Tx: Stata Press.
- Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey-Bass. (Chapters 1-7)
Any participants keen to prepare further might benefit from reading introductory-level materials on any of the topics that are covered by the module. Selected recommendations of texts that deliberately take quite an introductory approach include:
- Gayle, V., & Lambert, P. S. (2018). What is Quantitative Longitudinal Data Analysis? London: Bloomsbury.
- Longhi, S., & Nandi, A. (2015). A Practical Guide to Using Panel Data. London: Sage.
- Luke, D. A. (2004). Multilevel Modeling. Sage Quantitative Applications in the Social Sciences, Volume 143. London: Sage.
- Robson, K., & Pevalin, D. (2016). Multilevel Modeling in Plain Language. London: Sage.
Over the course of the summer school, numerous further readings are recommended for study either alongside or after the teaching programme.
Software
All topics are illustrated with multiple examples using Stata. We focus on Stata since it is well-equipped to support the topics and datasets being addressed. Introductory materials are available to participants with limited Stata experience, whilst students completing the course can expect to develop relatively advanced Stata programming skills. Selected materials are also made available using SPSS and R, mainly for the purposes of comparative assessment of different software tools. Also of relevance:
- Whilst practical lab sessions are centred on exercises that use Stata, other lecture and study materials are normally software-independent. Nevertheless, most statistical outputs within lectures will have been generated via Stata, and lecture contents occasionally address issues that are specific to Stata.
- Some participants are likely to be fluent in using Stata already, but extensive prior experience with Stata is not necessarily required, since introductory materials are available when needed. However, previous exposure to syntax programming in at least one statistical software package will be beneficial, since the software examples in the course use ‘syntax’ modes of operation. The course materials include a note on software that discusses and illustrates ways of using syntax effectively.
Background knowledge required:
Mathematics:
Calculus = Elementary
Linear Regression = Moderate
Statistics:
OLS = Moderate
Maximum Likelihood = Elementary
Computer Background:
Stata = Elementary