Please note: This course will be taught online only. In-person study is not available for this course.

Johannes Karreth is Associate Professor in the Department of Politics and International Relations at Ursinus College near Philadelphia, Pennsylvania. He earned his Ph.D. at the University of Colorado Boulder and was previously Assistant Professor at the University at Albany, State University of New York. His research studies the impact of international actors and processes on politics, ranging from the macro-level (interstate conflicts, civil wars, trade disputes) to the micro-level (public opinion on globalization and immigration). His main interests in political methodology are Bayesian approaches to multilevel structures and data visualization. He is the author of Incentivizing Peace: How International Organizations Can Help Prevent Civil Wars in Member Countries (with Jaroslav Tir; Oxford University Press 2018). His research has appeared in the Journal of Politics, Journal of Conflict Resolution, Journal of Peace Research, International Interactions, Comparative Political Studies, and other journals.

The course will introduce participants to the open-source statistical software R, with the goal of empowering participants to write and use R code for a wide variety of quantitative applications. R is a highly versatile software environment suitable for introductory and advanced quantitative social science and data analysis. The course offers participants a near-complete foundation for using R in other courses at ESS.

Specifically, the course will explore the following topics:

· Introduction to the R language and software architecture
· Use of the tidyverse suite of R packages
· Integrating R code and document production (R Markdown)
· Workflow, reproducibility, and version control in R
· Data import and data management, including working with “messy” datasets
· Data visualization
· Basic functions

Course Objectives

Upon successful completion of this course, participants will have acquired intermediate R skills aligned with most ESS courses that rely on R. The course is suitable for researchers at the beginning of their quantitative training as well as researchers with an advanced background in quantitative social science who wish to acquire a new, free, open-source, and highly versatile set of tools. A workflow for reproducible data analysis is also a core element of the course.

Course Prerequisites


This is an introductory, highly accessible course. No prior knowledge of statistical or quantitative methods, R, or computer programming is required. Participants with experience in other tools (e.g. SPSS, Stata, or SAS) will find the course structure helpful for transferring their skills to R.

Representative Background Reading


Since this is an introductory course, participants are not required to complete any prior reading.

Required texts

Wickham, Hadley, Mine Çetinkaya-Rundel, and Garrett Grolemund. 2023. R for Data Science, 2nd edition. Sebastopol, CA: O’Reilly. ISBN: 9781492097402

Remote learning setup

Lectures will run on Zoom. Participants are invited to interrupt and ask questions at any time. Much of the course will consist of 1-on-1 or small-group exercises with the teaching staff. These will take place during the core course time. I will also be available for virtual office hours on every day of the workshop, both at fixed times and by appointment.

Software and Preparation

Participants will be asked to install R and RStudio on their personal laptops during the first course meeting. We will go over how to use these programs on the first day of the course, using a detailed tutorial with step-by-step instructions. We will also set aside time on the first day to resolve installation problems.

Course schedule

For each day, the core reading usually provides substantial details for the units discussed on that day. A typical course period will consist of the following:

· Lectures are self-contained mini-units mixing lecture and discussion.
· Labs are guided tutorials with documented scripts available to participants.
· Assignments are problem sets that participants may complete to reinforce the material covered that day.

The following time slots and topics will likely be modified as the course proceeds. The most current version of this document can be found at http://www.jkarreth.net/intror-essex.html.

Day       Unit  Topic                                                      Chapter
Mon 3/7   1     Introduction to the R language and software architecture   1
Mon 3/7   2     Data visualization & workflow                              3-4
Tue 4/7   3     Data transformation                                        5-6
Tue 4/7   4     Exploratory data analysis                                  7
Wed 5/7   5     Data wrangling                                             10-16
Thur 6/7  6     Presentation: R Markdown                                   26, 27, 29, 30
Thur 6/7  7     Presentation: Tables and graphs
Fri 7/7   8     R for statistical modeling
Fri 7/7   9     Best practices and Q&A