The course is now full. New applicants will be added to a waiting list.

Please note: This course will be taught online only. In-person study is not available for this course.

Johannes Karreth is Assistant Professor in the Department of Politics and International Relations at Ursinus College near Philadelphia, Pennsylvania. He earned his Ph.D. at the University of Colorado Boulder and was previously Assistant Professor at the University at Albany-State University of New York. His research studies the impact of international actors and processes on politics, ranging from the macro-level (interstate conflicts, civil wars, trade disputes) to the micro-level (public opinion on globalization and immigration). His main interests in political methodology are Bayesian approaches to multilevel structures, and data visualization. He is the author of Incentivizing Peace: How International Organizations Can Help Prevent Civil Wars in Member Countries (with Jaroslav Tir; Oxford University Press 2018). His research has appeared in the Journal of Politics, Journal of Conflict Resolution, Journal of Peace Research, International Interactions, Comparative Political Studies, and other journals.

Course Content
The course will introduce participants to the open-source statistical software R.

Topics include:

– Introduction to the R language and software architecture
– Use of the tidyverse suite of R packages
– Integrating R code with document production (R Markdown)
– Workflow, reproducibility, and version control in R
– Data import and data management, including working with “messy” datasets
– Data visualization
– Basic functions

Course Objectives
Upon successful completion of this course, participants will have acquired intermediate R skills aligned with most ESS courses that rely on R. The course is suitable for researchers at the beginning of their quantitative training as well as researchers with an advanced background in quantitative social science who wish to acquire a new, free, open-source, and highly versatile set of tools. A workflow for reproducible data analysis is also a core element of the course.

Course Prerequisites
This is an introductory, highly accessible course. No prior knowledge of statistical or quantitative methods, R, or computer programming is required. Participants with experience in other tools (e.g. SPSS, Stata, or SAS) will find the course structure helpful for transferring their skills to R.

Representative Background Reading
Since this is an introductory course, participants are not required to complete any prior reading.

Required texts
Wickham, Hadley and Garrett Grolemund. 2016. R for Data Science. Sebastopol, CA: O’Reilly. Note: this book is available at no cost as an e-book at https://r4ds.had.co.nz/index.html.

Course description and goals

The course will introduce participants to the open-source statistical software R. R is a highly versatile software environment suitable for introductory and advanced quantitative social science and data analysis. The course offers participants a near-complete foundation for using R in other courses at ESS.
Specifically, the course will explore the following topics:

· Introduction to the R language and software architecture
· Use of the tidyverse suite of R packages
· Integrating R code with document production (R Markdown)
· Workflow, reproducibility, and version control in R
· Data import and data management, including working with “messy” datasets
· Data visualization
· Basic functions

Upon successful completion of this course, participants will have acquired intermediate R skills aligned with most ESS courses that rely on R. The course is suitable for researchers at the beginning of their quantitative training as well as researchers with an advanced background in quantitative social science who wish to acquire a new, free, open-source, and highly versatile set of tools. A workflow for reproducible data analysis is also a core element of the course. The course content will be reinforced through regular hands-on exercises and frequent feedback from the instructor.

Remote learning setup

Lectures will run on Zoom. Participants are invited to interrupt and ask questions at any time. Much of the course will consist of one-on-one or small-group exercises with the teaching staff. These will take place during the core course time. I will also be available for virtual office hours on every day of the workshop, both at fixed times and by appointment.

Prerequisites

This is an introductory, highly accessible course. No prior knowledge of statistical or quantitative methods, R, or computer programming is required. Participants with experience in other tools (e.g. SPSS, Stata, or SAS) will find the course structure helpful for transferring their skills to R.

Literature

We will use the following text, which is available at no cost as an e-book at https://r4ds.had.co.nz/index.html:

· Wickham, Hadley and Garrett Grolemund. 2016. R for Data Science. Sebastopol, CA: O’Reilly.

Further readings and materials will be made available to participants during the course.

Software and Preparation

Participants will be asked to install R and RStudio on their personal laptops during the first course meeting. We will go over how to use these programs on the first day of the course, using a detailed tutorial with step-by-step instructions. We will also set aside time that day to resolve any installation problems.

Course schedule

For each day, the core reading usually provides substantial details for the units discussed on that day. A typical course period will consist of the following:

· Lectures are self-contained mini-units mixing lecture and discussion.
· Labs are guided tutorials with documented scripts available to participants.
· Assignments are problem sets that participants may complete to reinforce the material learned in the course on that respective day.

The following time slots and topics will likely be modified as the course proceeds. The most current version of this document can be found at http://www.jkarreth.net/intror-essex.html.

Day      Unit  Topic                                                      Chapter(s)
Mo 7/4   1     Introduction to the R language and software architecture   1
Mo 7/4   2     Data visualization & workflow                              3-4
Tu 7/5   3     Data transformation                                        5-6
Tu 7/5   4     Exploratory data analysis                                  7
We 7/6   5     Data wrangling                                             10-16
Th 7/7   6     Presentation: R Markdown                                   26, 27, 29, 30
Th 7/7   7     Presentation: Tables and graphs
Fr 7/8   8     R for statistical modeling
Fr 7/8   9     Best practices and Q&A