Cristina Magder is the Data Collections Development Manager at the UK Data Archive at the University of Essex. She oversees the data collections functions and research data management training portfolio for the UK Data Archive (lead organisation for the UK Data Service). Her main teaching interests are data management planning and working with, sharing and archiving data, with a specific focus on data quality assurance and disclosure control.

Dr Alita Nandi is a Senior Research Fellow at the Institute for Social and Economic Research (ISER), University of Essex, and Associate Director (Outreach) of Understanding Society, overseeing its user support and training activities. She conducts empirical research to investigate ethnic and gender differences in subjective and economic wellbeing and to understand the formation of identity, the correlates of harassment and discrimination, and their role in determining life outcomes. She has extensive experience of using panel data methods and large-scale panel surveys, including the BHPS, Understanding Society and NLSY79. She has co-authored a book with Dr Longhi, A Practical Guide to Using Panel Data, which provides practical guidance for conducting panel data analysis using some of the world's popular panel datasets.

Dr Piotr Marzec is a Data Analyst and Training Officer at the Institute for Social and Economic Research (ISER) at the University of Essex, where he is responsible for conducting a wide range of data analysis projects and for developing and teaching courses on Understanding Society and panel data methods. In 2019 he was awarded a PhD in Sociology by the School of Sociology, Politics and International Studies (SPAIS) at the University of Bristol. Before joining ISER he worked as a Research Associate in SPAIS at the University of Bristol. His research interests include class and social stratification, particularly in relation to lifestyle patterns.

Course Content:

Longitudinal data are a powerful resource for socio-economic research. This course provides an introduction to reproducible research using an interactive R Markdown Notebook, with Understanding Society, the largest longitudinal household panel study, as a case study. The course focuses on best practices for data management while conducting reproducible research. It will guide participants through the process of conducting quantitative analysis, covering topics such as finding key survey information for their research, understanding the conditions for accessing data, the basics of good data management practice, techniques for setting up and cleaning data for longitudinal analysis, and finally conducting longitudinal data analysis (pooled regression, fixed and random effects). Students will also learn to record, interpret and present estimation results using visualisation methods. Students will work in groups, facilitated by course instructors, to complete an interactive R Markdown Notebook during the course.

Course Objectives:

The main objective of this course is to learn how to conduct reproducible research using longitudinal data. Students will gain an introductory understanding of data management techniques and will acquire a working knowledge of data cleaning, data restructuring, basic longitudinal analysis methods and visualisation techniques using a variety of packages in RStudio. Working in groups and building on shared knowledge, students will become familiar with R Markdown. By the end of the course, students should be able to address research questions requiring longitudinal data analysis and confidently use R Markdown Notebooks.

Course Prerequisites:

This is an introductory course. However, students are expected to have basic knowledge of cross-sectional linear regression (OLS) and a basic understanding of R, including data structures and functions. Students should have R and RStudio installed on their PCs.

Representative background reading:

Cameron, C. and Trivedi, P.K. (2005) Microeconometrics: Methods and Applications. Cambridge University Press.

Gayle, V. and Lambert, P. (2018) What is Quantitative Longitudinal Data Analysis? Bloomsbury Academic.

Longhi, S. and Nandi, A. (2015) A Practical Guide to Using Panel Data. SAGE.

Mehmetoglu, M. and Jakobsen, T.G. (2016) Applied Statistics Using Stata: A Guide for the Social Sciences. Blackwell Inc.

Wooldridge, J. M. (2010) Econometric Analysis of Cross Section and Panel Data: Second Edition. The MIT Press.

Grolemund, G. (2014). Hands-on programming with R. Available at: [last accessed 9 Feb 2021]

Grolemund, G. and Wickham, H. (2016). R for Data Science. Available at [last accessed 9 Feb 2021]


Course Outline


Day 1 – Introduction to longitudinal and panel surveys

  • Outline of the course
  • Introduction to longitudinal & panel surveys
  • Case study: Overview of Understanding Society and its key features
  • Overview of RStudio and R Markdown Notebooks
  • Exercise: forming work groups and setting up each group’s research question


Day 2 – Basics of data management, access and restrictions

  • FAIR data, licensing, terms and conditions and user responsibilities
  • Case study: accessing Understanding Society via the UK Data Service
  • Understanding the data structure (file and variable naming conventions, file content), and versioning
  • Exercise: importing data and producing basic descriptive statistics


Day 3 – Setting up the data, cleaning and inconsistencies

  • Setting up the data for longitudinal data methods
  • Understanding missing values, reporting errors, inconsistencies and weighting
  • Exercise: creating basic derived variables to measure change


Day 4 – Longitudinal data analysis and visualisation

  • Longitudinal data analysis: pooled regression, fixed effects, random effects
  • Visualisation fundamentals: bar charts, line charts, scatter plots and growth plots
  • Exercise: visualising your estimation results


Day 5 – Research findings and discussion

  • Analysing the data to address your research question
  • Presenting findings