1E Applied Social Statistics using Stata

Please note: This course will be taught in hybrid mode. Hybrid delivery includes synchronous live sessions in which on-campus and online students are taught simultaneously.

Paul Lambert is a Professor of Sociology at the University of Stirling, UK, where he teaches courses on research methods and on social stratification at undergraduate and postgraduate level. He is the director of three related Masters courses on social research methodology in his institution, and he is a member of the Scottish Graduate School in Social Science postgraduate ‘training network’, delivering advanced training to PhD students across Scottish institutions. His research covers methodological topics in social survey data analysis and data management (with a particular interest in handling data on occupations), and applied research on processes of social stratification and social inequality in selected areas (recent funded projects include analysis of workplace changes and their relationship to changes in health and well-being; and survey data analysis of social inequalities experienced by Lesbian, Gay and Bisexual people in the UK).

Course Description
Become fluent in using important tools of applied social statistics including: Multilevel models – Categorical outcomes models – Marginal effects – Panel models for longitudinal data – Selection models.

This course helps participants to widen their knowledge and experience of applied social statistics. The programme introduces topics related to using regression modelling approaches in relatively sophisticated ways. It features accessible introductions to the theory and procedures involved in implementing selected tools of analysis, with many examples that use real-world social survey (or survey-like) datasets.

Course Content

Multilevel models
Marginal effects
Categorical outcomes models
Sampling design and weighting adjustments
Panel models for longitudinal data
Missing data
Comparing fixed and random effects models
Estimating and representing uncertainty
Selection models
Causal interpretations of model results
Measurement models
Learning from simulated data
Workflow and documentation considerations

Course Prerequisites
The course can be thought of as an accessible introduction to selected advanced issues. Concepts, basic algebraic formulae, software training, and extension issues and debates will all be introduced in ways that focus on the social science contribution of the method.

It is expected that participants will have had some previous training in social statistics, for example on popular descriptive analytical techniques (e.g. chi-square tests; correlation values) and well-known types of regression models (e.g. multiple regression, logistic regression). Teaching sessions include some recap content, but concentrate on introducing the specialist topics listed above.

The course is best suited to participants with some previous experience in using Stata code or ‘syntax’. Course materials include some introductory resources, but students without any background in using Stata syntax should expect to put in extra effort near the start of the course in order to make good use of the lab exercises.

Representative Background Readings
Specific background study prior to attending the module is not required. During the course, texts by Long & Freese and by Rabe-Hesketh & Skrondal are used regularly and made available to participants, and numerous other readings are recommended for further study.

Long, J. S., & Freese, J. (2014). Regression Models for Categorical Dependent Variables Using Stata, 3rd edition. College Station, Tx: Stata Press. [ISBN: 978-1-59718-111-2]
Rabe-Hesketh, S., & Skrondal, A. Multilevel and Longitudinal Modeling Using Stata, 4th edition (2-volume set). College Station, Tx: Stata Press. [ISBN: 978-1-59718-136-5] (provided by ESS)

Before the course begins, participants might benefit from revising any text on basic statistical methods in the social sciences that has coverage of descriptive statistics and multiple regression models. We recommend:

Kohler, U., & Kreuter, F. (2012). Data Analysis Using Stata, 3rd edition. College Station, Tx: Stata Press.
Treiman, D. J. (2009). Quantitative Data Analysis: Doing Social Research to Test Ideas. New York: Jossey-Bass. (Chapters 1-7)

Software
All topics are elaborated upon with multiple illustrative examples using Stata. We focus on Stata since it is well-equipped to support the topics and datasets being addressed. Introductory materials are available to participants with limited Stata experience whilst students completing the course can expect to develop relatively advanced Stata programming skills. Selected materials are also made available using SPSS and R, mainly for the purposes of comparative assessment of different software tools. Also of relevance:

Whilst practical lab sessions are centred on exercises that use Stata, other lecture and study materials are normally software-independent. Nevertheless most statistical outputs within lectures will have been generated via Stata, and lecture contents occasionally address issues that are specific to Stata.
Some participants are likely to be fluent in using Stata already. Extensive prior experience with Stata is not required, but it would be difficult to make good use of course materials without some previous exposure to using Stata ‘syntax’ code.

Background Knowledge
Maths:
Calculus – Elementary
Linear Regression – Elementary

Statistics:
OLS – Elementary

Software:
Stata – Elementary

The module’s teaching approach links theoretical introductions (in lectures) with class-led practical exercises (in ‘lab’ sessions).

Lecture-based introductions concentrate upon understanding the principles behind a particular approach and the practical impact of using it. Algebraic expositions are generally kept to a minimum, with the focus instead on what an approach is conceived to contribute, and how the outputs from an approach can be interpreted. In this style the course lectures can be thought of as providing an accessible introduction to relatively advanced or intermediate issues.

The lab sessions concentrate upon providing illustrative examples using Stata. Participants can expect to develop their Stata programming skills and leave the course readily able to adapt the illustrative examples to their own application areas.

A typical study day involves around two hours of lecture sessions, which are designed cumulatively to introduce, explain and interrogate the topics, ultimately to a relatively advanced level. This is followed by around 1.5 hours of lab exercises in which participants are given illustrated guides to implementing techniques using software and to interpreting the results, and are encouraged to adapt the examples to their own research needs and datasets. Outwith scheduled class times (lectures and labs), further study materials, including optional homework tasks, are available when desired, and options for follow-up queries such as drop-in sessions are offered.

Lectures (L) / Computer practicals (P)

Day 1 – Foundations in applied social statistics (i)
L1a:       Why statistical models can help us undertake social science research
L1b:       Course arrangements and overview
P1:         Using Stata for social science data analysis

Day 2 – Foundations in applied social statistics (ii)
L2a:       Getting to grips with complex datasets and the extension issues they can raise
L2b:       Tricks of the trade in working with statistical models
P2:         Exploring and summarising complex data; key elements of statistical modelling

Day 3 – Introducing and understanding multilevel models (i)
L3a:       Understanding and interpreting the two-level random intercepts model
L3b:       The two-level random slopes model
P3:         Two-level random effects model specifications and interpretations

Day 4 – Introducing and understanding models for categorical outcomes
L4a:       Understanding and implementing non-linear outcome models
L4b:       Using marginal effects constructively
P4:        Implementing and interpreting models for non-linear outcomes

Day 5 – Introducing and understanding multilevel models (ii)
L5a:       Multilevel models for categorical outcomes
L5b:       Random effects models with three or more levels and with cross-classified and multiple membership designs
P5:        Multilevel models for non-linear outcomes and with complex data structures

Day 6 – Panel models for longitudinal data
L6a:       Varieties of panel models and longitudinal data analysis strategies
L6b:       Comparing fixed and random effects models
P6:         Data and models for longitudinal panel datasets

Day 7 – Using models to study multiprocess systems
L7a:       Selection models
L7b:       Measurement models
P7:         Introductory examples of selected multiprocess models including selection models and SEMs

Day 8 – Models and analyses that focus on causal interpretations
L8a:       Reflecting on descriptive and causal analytical strategies
L8b:       Selected models designed to assess causal effects
P8:         Illustrating techniques for causal modelling in Stata

Day 9 – Focus on research applications in statistical modelling
L9a:       Class plenary: Participants’ projects that (may) use statistical models
L9b:       Option: Case study on statistical models for cross-national datasets
L9c:       Option: Review/Questions/Selected recap topics
P9:         Applied research – extension topics

Day 10 – Reflections and next steps
L10a:    Trends and prospects in using statistical models in the social sciences
L10b:    Making progress in applied research with complex quantitative data
P10:      Lab review/recap opportunity
