Nicole Baerg is a Senior Lecturer in the Department of Government at the University of Essex. Previously, she was a Senior Researcher in Data Science at the Bank of England. She is also a Senior Visiting Fellow at the Data Science Institute, London School of Economics and Political Science. Her research areas include Comparative Politics & International Relations, Political Institutions, Central Banking, Political Text Analysis, and Computational Social Science.

Course Description
Build a complete foundation in R for data analysis, visualisation, and text analytics. This three-module course takes participants from basic plotting and regression analysis through to advanced techniques for handling text and unstructured data. Each module includes a dedicated lab day to apply learned concepts through guided, practical exercises.

Module 1: Data Visualisation and Communication
Learn to create effective, professional visualisations using ggplot2, Plotly, and spatial mapping tools to communicate data insights clearly and confidently.

Module 2: Statistical Modelling and Data Analysis
Gain confidence applying statistical methods in R, including regression, maximum likelihood estimation, and time series/panel data analysis.

Module 3: Text and Unstructured Data Analysis
Explore tools and techniques for text processing, document similarity, and classification to extract insights from unstructured data sources.

Each week concludes with a practical lab day that reinforces skills through applied analysis and reporting exercises.

Applications open soon!

Course Outline

Module 1: Data Visualisation and Communication
Develop professional data visualisation skills in R using ggplot2 and modern visualisation tools to produce publication-ready graphics and dashboards.

Part 1: Introduction to Plotting in R
This part introduces participants to the principles of effective data visualisation using ggplot2 and the grammar of graphics.
• Introduction to ggplot2 and the grammar of graphics
• Creating basic plots: scatterplots, bar charts, and line graphs
• Understanding aesthetic mappings and geometric objects
• Building layered visualisations
• Best practices for visual communication

Part 2: Advanced Plot Types and Customisation
Expand your plotting skills by learning to design polished, professional visualisations.
• Exploring plot types: boxplots, violin plots, heatmaps, and density plots
• Choosing the right plot for the right message
• Customising plots with themes, scales, and colour palettes
• Typography, layout, and accessibility in visual communication
• Creating consistent visual styles for presentations and reports

Part 3: Interactive and Spatial Visualisations
Learn to make your visualisations more engaging and dynamic.
• Creating interactive plots using Plotly
• Adding interactivity with tooltips and filters
• Introduction to spatial data and coordinate systems
• Creating maps and hex plots for geographical data

Part 4: Practical Lab – Visualisation Project
Apply your skills in a guided visualisation project.
• Design and build an end-to-end visualisation workflow
• Combine static, interactive, and spatial plots
• Create a short visual report communicating key findings

Module 2: Statistical Modelling and Data Analysis
Build a solid understanding of statistical methods in R to model relationships, test hypotheses, and analyse time-based or grouped data.

Part 1: Regression Techniques
Learn how to specify, fit, and interpret linear regression models.
• Simple and multiple linear regression fundamentals
• Model specification and variable selection, including non-linear models
• Interpreting regression outputs and significance tests

Part 2: Maximum Likelihood Estimation (MLE)
Master MLE approaches to model complex data relationships.
• Understanding the likelihood function
• Comparing OLS and MLE estimation
• Binary outcomes: logit (logistic regression) and probit models
• Modelling count data using Poisson and related distributions
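
As a small illustration of the OLS and MLE comparison listed above, the following base-R sketch fits the same linear model both ways on simulated data. The data, the seed, and the helper name neg_loglik are assumptions made for illustration only and are not taken from the course materials. Under normally distributed errors, the two approaches should give essentially the same coefficients.

```r
# Simulated data (an assumption for illustration): y = 1 + 2x + noise
set.seed(42)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# OLS fit via lm()
ols_fit <- lm(y ~ x)
ols_coef <- unname(coef(ols_fit))

# Negative Gaussian log-likelihood; par = (intercept, slope, log(sigma))
neg_loglik <- function(par, x, y) {
  mu <- par[1] + par[2] * x
  sigma <- exp(par[3])  # optimise on the log scale so sigma stays positive
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# MLE via numerical optimisation of the negative log-likelihood
mle_fit <- optim(par = c(0, 0, 0), fn = neg_loglik, x = x, y = y,
                 method = "BFGS")
mle_coef <- mle_fit$par[1:2]

# Side-by-side comparison of intercept and slope
print(round(rbind(OLS = ols_coef, MLE = mle_coef), 3))
```

The same pattern (write a log-likelihood, hand it to an optimiser) carries over to logit, probit, and Poisson models, where no closed-form OLS solution is available.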

Part 3: Time Series and Panel Analysis
Analyse data that varies across both time and groups.
• Understanding panel structure (cross-sections over time)
• Fixed vs. random effects models
• Handling time-varying and time-invariant covariates
• Visualisation techniques: time series plots, trend lines, and heatmaps

Part 4: Practical Lab – Applied Modelling and Analysis
• Fit and interpret regression and MLE models using real datasets
• Build and validate time series models
• Present statistical results with effective visualisations and reports

Module 3: Text and Unstructured Data Analysis
Learn how to process, analyse, and model text data using modern techniques in R.

Part 1: Text Processing and Data Structures
Gain familiarity with text data formats and preprocessing workflows.
• String operations and pattern matching with regular expressions
• Text cleaning and standardisation (case, punctuation, whitespace)
• Working with text data structures (character vectors, data frames)
• Creating document structures and corpora
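
To give a flavour of the cleaning and pattern-matching steps listed above, here is a minimal base-R sketch. The example sentences and the helper name clean_text are hypothetical and chosen only for illustration; the course may well use dedicated text packages instead.

```r
# Hypothetical example documents (not from the course materials)
docs <- c("  The Bank RAISED rates!! ", "Rates were raised,   again...")

clean_text <- function(x) {
  x <- tolower(x)                        # standardise case
  x <- gsub("[[:punct:]]+", " ", x)      # replace punctuation with spaces
  x <- gsub("[[:space:]]+", " ", x)      # collapse repeated whitespace
  trimws(x)                              # drop leading/trailing whitespace
}

cleaned <- clean_text(docs)

# Split each cleaned document into word tokens
tokens <- strsplit(cleaned, " ", fixed = TRUE)

# Pattern matching: which documents mention a form of "raise"?
mentions_raise <- grepl("\\brais(e|ed|ing)\\b", cleaned, perl = TRUE)

print(cleaned)
print(mentions_raise)
```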

Part 2: Introduction to Text Analysis
Apply basic analytical methods to understand and compare documents.
• Document similarity and text matching techniques
• Building document-term matrices
• Introduction to text classification
• Naive Bayes classifiers for text
• Basic sentiment analysis using lexicons
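
The document-term matrix and document similarity topics above can be sketched in a few lines of base R. The three toy documents below are invented for illustration, and cosine similarity is used here as one common similarity measure; the course may cover others.

```r
# Toy documents (invented for illustration), already lower-cased and cleaned
docs <- c(
  d1 = "inflation rose and rates rose",
  d2 = "rates rose as inflation increased",
  d3 = "the committee met on tuesday"
)

tokens <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: rows are documents, columns are terms, cells are counts
dtm <- t(vapply(
  tokens,
  function(tok) as.integer(table(factor(tok, levels = vocab))),
  integer(length(vocab))
))
colnames(dtm) <- vocab

# Cosine similarity between every pair of documents
row_norms <- sqrt(rowSums(dtm^2))
cos_sim <- (dtm %*% t(dtm)) / (row_norms %o% row_norms)

print(dtm)
print(round(cos_sim, 3))
```

Documents d1 and d2 share several terms and so score well above zero, while d3 shares no terms with either and scores zero against both.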

Part 3: Applied Text Classification
Use predictive models to extract meaning from real-world text data.
• Logistic regression for text classification
• Feature extraction and dimensionality reduction
• Model evaluation and interpretation
• Visualising classification outcomes and performance

Part 4: Practical Lab – Text Analytics Project
• Build a complete text analysis workflow
• Combine preprocessing, modelling, and visualisation
• Present and communicate findings in a short applied project