Please note: This course will be taught in hybrid mode. Hybrid delivery includes synchronous live sessions in which on-campus and online students are taught simultaneously.
Rabia Malik is Lecturer/Assistant Professor in the Department of Government at the University of Essex, which she joined in July 2020. Before this, she received her Ph.D. in Political Science from the University of Rochester in 2016, was a Post-Doctoral Associate at New York University Abu Dhabi (2016-2019) and spent a year at the Lahore University of Management Sciences (LUMS). Her research uses both observational and experimental data to study questions related to distributive politics and development, political accountability, clientelism, and gender, particularly in South Asia. She has also taught classes on quantitative methods, authoritarianism, and South Asian politics. Rabia’s research has appeared in The Journal of Politics, The British Journal of Political Science, Comparative Political Studies and Legislative Studies Quarterly.
Course description and goals:
This course introduces participants to the analysis of quantitative data in the free, open-source software R. R is a highly versatile software environment suitable for both introductory and advanced quantitative social science and data analysis. The course gives participants a near-complete foundation for using R in all commonly encountered tasks in social science data analytics.
The course will explore the following topics:
- Introduction to the R language and software architecture
- Integrating R code with document production (R Markdown)
- Workflow, reproducibility, and version control in R
- Data import and data management, including working with “messy” datasets
- Descriptive statistics
- Data visualization with base R and advanced R packages
- Common techniques for statistical inference, including correlations, linear regressions, and logistic regressions
- Diagnostics for linear regression assumptions and violations
- Interpretation of non-linear relationships in OLS
- R packages for advanced statistical methods, including survey and field experiments, conjoint experiments, and regression discontinuity designs
Upon successful completion of the course, participants will be able to use R for most commonly encountered tasks in social science data analysis, including all of the topics listed above. The course is suitable for researchers at the beginning of their quantitative training as well as those with an advanced background in quantitative social science who wish to acquire a new, free, open-source, and highly versatile set of tools. Applications range from classic statistical methods (such as regression) to newer tools (such as conjoint experiments and regression discontinuity designs). Participants will also learn to integrate data analysis with document creation (via R Markdown). A workflow for reproducible data analysis is also a core element of the course.
Participants are advised to have a background in introductory statistics or to be concurrently enrolled in an introductory statistics course. Prior exposure to statistical techniques up to linear regression (at a fundamental level) is helpful but not required. No background in R or computer programming is required. The course introduces R from a beginner’s perspective. At the same time, participants with experience in other tools (e.g., SPSS, Stata, or SAS) will find the course structure helpful for transferring their skills to R.
Representative Background Reading:
Since this is an introductory course, participants are not required to do any prior reading.
The following list can be used as reference readings by interested participants:
- Dalpiaz, David (2019). Applied Statistics with R. Online Resource.
- Huntington-Klein, Nick (2021). The Effect: An Introduction to Research Design and Causality. Online Resource.
- Kellstedt, Paul and Guy Whitten (2018). The Fundamentals of Political Science Research. Cambridge University Press.
- Imai, Kosuke (2018). Quantitative Social Science: An Introduction. Princeton: Princeton University Press.
- Salkind, Neil J. (2017). Statistics for People Who (Think They) Hate Statistics. Sage Publications.
- Wickham, Hadley and Garrett Grolemund (2017). R for Data Science. Online Resource.
Further readings mentioned in the course schedule, including those marked as optional, will be made available to participants during the course.
Required text (will be provided by ESS):
Agresti, Alan (2018). Statistical Methods for the Social Sciences (Fifth Edition). Pearson.
Background knowledge required:
Calculus – Elementary
Linear Regression – Elementary*
OLS – Elementary*
* Some background in these topics is helpful but not required.