Please note: This course will be taught online only. In-person study is not available.
Martijn Schoonvelde is Assistant Professor in European Politics & Society at the University of Groningen. His main research and teaching interests are political communication, political rhetoric, and quantitative text analysis.
Course Content
With the massive availability of text data on the web, social scientists increasingly recognize automated text analysis (or “text as data”) as a promising approach for analyzing various kinds of social and political phenomena. This module introduces participants to a variety of its methods and tools. We discuss the underlying theoretical assumptions, substantive applications of these methods, and their implementation in the R statistical programming language. The meetings – which combine lectures and coding sessions in the RStudio Cloud platform – will be hands-on, dealing with practical issues in each step of the research process.
Course Objectives
Participants will understand fundamental issues in quantitative text analysis research design such as inter-coder agreement, reliability, validation, accuracy, and precision. Participants will learn to convert texts into informative feature matrices and to analyze those matrices using statistical methods. Participants will learn to apply these methods to a text corpus in support of a substantive research question. Furthermore, participants will be able to critically evaluate (social science) research that uses automated text analysis methods.
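To make "feature matrix" concrete, here is a minimal sketch of a document-feature matrix: one row per text, one column per word, with word counts in the cells. The course itself works in R. This illustration uses plain Python with no text-analysis library, and the two example documents and all names in it are invented, not course material.

```python
from collections import Counter

# Toy corpus: hypothetical documents, invented for illustration only.
texts = {
    "doc1": "Taxes must fall and jobs must grow",
    "doc2": "Jobs and growth need public investment",
}

def tokenize(text):
    """Lowercase a text and split it on whitespace."""
    return text.lower().split()

# Count how often each feature (here: each word) occurs in each document.
counts = {name: Counter(tokenize(text)) for name, text in texts.items()}

# The sorted vocabulary defines the columns of the matrix.
features = sorted(set().union(*counts.values()))

# One row per document, one column per feature.
dfm = [[counts[name][f] for f in features] for name in texts]

for name, row in zip(texts, dfm):
    print(name, dict(zip(features, row)))
```

Real applications usually add preprocessing such as removing punctuation and stop words or stemming. Those choices change the matrix, and therefore any statistical analysis built on it.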
Course Prerequisites
Familiarity with basic research design and statistical analysis is expected, and familiarity with the R statistical programming language is strongly encouraged.
Background Reading
Benoit, K. (2020). Text as data: An overview. In L. Curini & R. Franzese (Eds.), Handbook of Research Methods in Political Science and International Relations (pp. 461–497). Thousand Oaks: Sage.
Welbers, K., Van Atteveldt, W., & Benoit, K. (2017). Text analysis in R. Communication Methods and Measures, 11(4), 245–265.
Background Knowledge Required
Computer background: R (elementary)