Martin Elff is Professor of Political Sociology at Zeppelin University in Friedrichshafen, Germany. He is a political scientist with research interests in the fields of political behaviour, party competition, and political methodology. He has published in the European Journal of Political Research, Perspectives on Politics, Electoral Studies and Political Analysis and has authored the R packages ‘memisc’, ‘mclogit’, and ‘munfold’, published at the ‘Comprehensive R Archive Network’ (http://cran.r-project.org).

**Course Content**

The module introduces participants to the practical analysis of quantitative social science data using R. Consequently, the module is less a theoretical presentation of concepts such as probability, expectation, regression, and statistical significance than an opportunity for participants to “road-test” such concepts with the help of appropriate software, in particular the open-source software package R.

This module covers at least the following topics: (1) basic concepts of data analysis with R; (2) data management – working with variables and data frames; (3) summarising data using tables and graphics; (4) linear regression – model construction and interpretation; (5) testing statistical hypotheses in R; (6) generalised linear models for categorical responses, counts, and survival times; (7) advanced statistical graphics; (8) multivariate data analysis – principal components, factor analysis, and structural equations. In addition, a few more topics may be covered if time permits and depending on participants’ interests, such as random variables, random numbers, and Monte Carlo simulations; linear algebra with R and regression in matrix form; multilevel models; or programming techniques.

**Course Objectives**

Participants who successfully complete this module will have a solid understanding of the general principles of data analysis and how to put them into practice. They will also have an understanding of the issues and main techniques of multivariate statistical analysis. While a two-week course can hardly cover everything in depth, successful participants will at least be able to identify which of these techniques are appropriate for their research. Furthermore, they will be able to graph their data and conduct their data analysis with the free statistical software system R.

**Course Prerequisites**

The module introduces participants to a variety of data analysis techniques and therefore has only a few prerequisites. To be able to follow the course, participants should have a solid understanding of descriptive statistics and regression. They should also have a certain level of “computer literacy”, that is, they should not be afraid of command-line oriented (as opposed to menu-driven) software and of writing short command scripts. The ability to do so is not presupposed, but the motivation to learn such things is.

**Representative Background Reading**

Dalgaard, Peter 2002. Introductory Statistics with R. New York: Springer.

Fox, John 2008. Applied Regression Analysis and Generalized Linear Models. (2nd ed.) Thousand Oaks: Sage.

Fox, John 2002. An R and S-Plus Companion to Applied Regression. Thousand Oaks: Sage.

Gill, Jeff 2006. Essential Mathematics for Political and Social Research. Cambridge: Cambridge University Press.

Venables, W.N., and Ripley, B.D. 2002. Modern Applied Statistics with S. (4th ed.) New York: Springer.

**Background knowledge required**

*Statistics*

OLS = m

Maximum Likelihood = e

e = elementary, m = moderate, s = strong