Please note: This course will be taught online only. In-person study is not available for this course.
Martijn Schoonvelde is Assistant Professor in European Politics & Society at the University of Groningen. His main research and teaching interests are political communication, political rhetoric, and quantitative text analysis.
With the explosion of digital text data, social scientists are increasingly leveraging sophisticated quantitative text analysis techniques to learn from these data. This course provides a comprehensive introduction to cutting-edge methods and tools in the field of “text-as-data”. We discuss substantive applications of these methods, their theoretical assumptions, and their implementation in the R statistical programming language. The meetings, which combine lectures and coding sessions, will be hands-on and will deal with practical issues in each step of a text-as-data project.
Participants will understand fundamental issues in quantitative text analysis research design, such as different types of textual representations, measurement versus prediction, multilingualism, and how to think about validation. Participants will learn to convert texts into informative feature matrices and to analyze those matrices using statistical methods. Participants will learn to apply these methods to a text corpus in support of a substantive research question. Furthermore, participants will be able to critically evaluate (social science) research that uses automated text analysis methods.
Familiarity with basic research design and statistical analysis is expected, and familiarity with the R statistical programming language is encouraged.
Required Text (this will be provided by ESS)
Grimmer, J., Roberts, M.E. and Stewart, B.M., 2022. Text as data: A new framework for machine learning and the social sciences. Princeton University Press. ISBN: 9780691207551
Background knowledge required:
R = elementary
Calculus = elementary