The prerequisites for taking a course are:
– Having taken one or more courses in the fundamentals of regression analysis (e.g. the least squares method), regardless of academic affiliation or university
– Possessing a basic understanding of the statistical language R (e.g. manipulating data frames, reading and writing files, creating functions)
The courses are aimed at graduate students, but are also open to undergraduate students.
The courses will be taught in English. The tuition fee is £200 per course.
Upon completion of the course, the University of Essex will issue a formal certificate. It may be possible to transfer the course’s credits (2 credits) to your own university. We advise you to consult your department and university in advance if you wish to have the credits transferred. The School of Political Science and Economics at Waseda University will transfer the credits.
Course 1: Scaling Methods and Ideal Point Estimation (12-22 September 2022), by Dr. Royce Carroll
Course 2: Introduction to Programming for Big Data and Machine Learning in Social Science (12-22 September 2022), by Dr. Akitaka Matsuo
Please note that you cannot take both courses, because they run at the same time.
Application Deadline: August 26, 2022 (Fri)
Location: Waseda campus, Waseda University, Japan
Fee for internal Waseda applicants: £200
• Students need to make lodging and travel arrangements on their own;
• Waseda cannot sponsor a student visa.
Scaling Methods and Ideal Point Estimation
Royce Carroll is a Professor in Comparative Politics at the University of Essex. His research focuses on representation and legislative politics, as well as methods to analyse survey and voting data, attitudes, preferences and ideology. He has previously taught at Rice University. He is co-author of the scaling method textbook Analyzing Spatial Models of Choice and Judgment (2nd Ed. 2020), as well as many articles on related topics.
This course focuses on methods to discover, understand and visualize latent patterns in data and is especially suited to students with projects using survey data and other forms of relational data used in political science, sociology, economics, business, marketing, and psychology. The course introduces students to measurement theory and scaling techniques, integrating Multidimensional Scaling, Item Response Theory, and Ideal Point Estimation. The first part of the course will provide an overview of the foundations of these techniques and introduce students to the most common methods for scaling and “spatial” analysis and the visualization of latent patterns in survey and behavior data. The course will demonstrate how to interpret, measure, and visualize latent dimensions of data via a variety of scaling methods using the open-source programming language R. The course will also discuss a range of applications of these methods to social science studies of relational and perception data derived from elite behaviour and surveys, especially for identifying latent preferences of political, economic and social actors. The course concludes with discussions of the most recent advances in the field, including applications for text analysis, and with practical advice for those seeking to use such methods in social science research, with an emphasis on topics relevant to the students enrolled. The course first covers how to analyse data from scales found in surveys (such as Likert-type scales), focusing on surveys that ask respondents to place themselves and/or stimuli on issue or attribute scales.
The course begins with approaches to scaling that generate bias-adjusted and latent spatial data from survey responses, such as Aldrich-McKelvey scaling and ‘Basic Space’ scaling, together with Anchoring Vignettes, as methods for addressing perceptual bias in the form of “Differential Item Functioning.” The course next examines similarities and dissimilarities data and covers multidimensional scaling (MDS) with a focus on the SMACOF optimization method implemented in R as well as Bayesian applications to Metric Multidimensional Scaling. Next, the course covers unfolding analysis of rating scale data from surveys, such as favorability scales for stimuli such as politicians or social groups. Finally, the course provides an extensive overview of IRT and ideal point estimation, generally focused on binary choice data, which includes the data used in ‘roll call voting’ analysis of elite behavior in parliaments and courts. Here we will cover Poole and Rosenthal’s W-NOMINATE and Poole’s Optimal Classification unfolding method, as well as a variety of Bayesian analysis techniques for binary and ordinal choice data using Item Response Theory (IRT). An extensive range of Bayesian techniques is discussed, including Bayesian Aldrich-McKelvey Scaling, Ordinal and Dynamic Item Response Theory (IRT), Bayesian Multidimensional Scaling (MDS), and Bayesian Unfolding. The final section will discuss recent methods for scaling a variety of data types, including social media and text data, and the latest computational innovations for applying scaling methods to ‘big data’.
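As a rough illustration of what recovering a latent dimension from dissimilarities data involves, the sketch below implements classical (Torgerson) metric MDS for a single dimension. It is written in Python purely for illustration; the course itself works in R, where the base function cmdscale() performs classical scaling, and the SMACOF method is a different, stress-minimising algorithm that is not shown here. The function name classical_mds_1d, the power-iteration shortcut and the example positions are assumptions made for this sketch, not course material.

```python
import math


def classical_mds_1d(dist, iters=200):
    """Recover one-dimensional coordinates from a symmetric distance matrix.

    Classical (Torgerson) scaling: double-centre the squared distances,
    then scale the leading eigenvector by the square root of its eigenvalue.
    """
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centring: B = -1/2 * J * D^2 * J (row means equal column means
    # because the matrix is assumed symmetric).
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenvector of B. The start vector is
    # an arbitrary choice; it only must not be orthogonal to the solution.
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    return [x * math.sqrt(eigenvalue) for x in v]


# Hypothetical "true" positions of four stimuli on one latent dimension.
positions = [0.0, 1.0, 3.0, 6.0]
dist = [[abs(a - b) for b in positions] for a in positions]
coords = classical_mds_1d(dist)
print([round(c, 3) for c in coords])
```

The recovered coordinates are centred on zero and identified only up to reflection, so they reproduce the distances between stimuli rather than the original positions themselves. The same identification issue arises in the ideal point methods covered in the course.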
This course will enable students to derive latent spatial preference information and/or a dimensional structure from various types of survey and behavior data, which is applicable to a wide range of social science applications, academic and non-academic alike. Consumers of research based on these methods will also benefit from a deeper understanding of this type of methodology, its potential and its limitations.
Students will learn to use various computational methods to generate measures of ideology and preferences and understand the latent dimensional properties of social science data, including surveys and legislative data. Students will understand the theories behind these methods and the relationships between Item Response Theory, Ideal Point Estimation and other scaling methods. As these techniques are fundamental to much recent work in social science, students will be able both to understand and to produce research that measures concepts in this way.
The course is designed to be accessible to social science graduate students of all backgrounds. However, students familiar with the R programming environment will find it easier to adapt to course content and assignments, so we recommend becoming familiar with the basic structure of R/RStudio, for example via the 1-day introduction to R offered the Sunday before the first day of class. In addition, the course assumes basic familiarity with general statistics (OLS and MLE).
Armstrong, David A., Ryan Bakker, Royce Carroll, Christopher Hare, Keith T. Poole, and Howard Rosenthal (2014). Analyzing Spatial Models of Choice and Judgment with R. Chapman and Hall/CRC. ISBN: 9781138715332 (an electronic copy will be provided by the instructor)
Background knowledge required
OLS = elementary
Maximum Likelihood = elementary
R = elementary
Introduction to Programming for Big Data and Machine Learning in Social Science
Akitaka Matsuo is a postdoctoral fellow in the Institute for Analytics and Data Science at the University of Essex. His research interests lie in data science and politics, in particular in the statistical methodology for scaling survey responses and legislative behavior and natural language processing of political texts (e.g. social media texts, open-ended survey answers, and parliamentary speeches).
The course is intended to provide social scientists with knowledge of how to carry out data science projects with the statistical language R.
For social scientists with a general knowledge of statistical analysis, there are essentially two hurdles to implementing a data science project. One is the handling of large data sets: as the scale of data increases, the necessary tools change. The other is the shift in mindset from statistics as practised in social science, which focuses on inference and explanation, to machine learning, which focuses on prediction. Overcoming these two hurdles is the main goal of this course.
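The difference between the two mindsets can be made concrete with a small example. The sketch below, in Python for illustration only (the course uses R), fits a least-squares line on one part of a simulated data set and then scores it on held-out observations. The estimated slope is the quantity an explanatory analysis would report, while the held-out error is what a predictive analysis would judge the model by. The simulated data, the 150/50 split and the helper names fit_ols and rmse are assumptions made for this sketch.

```python
import math
import random


def fit_ols(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the line's predictions on (xs, ys)."""
    sq_err = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq_err / len(xs))


# Simulated data (an assumption for illustration): y = 1 + 2x + noise.
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [1.0 + 2.0 * x + rng.gauss(0, 1) for x in xs]

# Hold out the last 50 observations; the model never sees them when fitting.
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

intercept, slope = fit_ols(train_x, train_y)
train_error = rmse(train_x, train_y, intercept, slope)
test_error = rmse(test_x, test_y, intercept, slope)

print(f"Inference view: estimated slope = {slope:.3f}")
print(f"Prediction view: training RMSE = {train_error:.3f}, held-out RMSE = {test_error:.3f}")
```

With a model this simple the two error figures will usually be close; the gap between them tends to grow as models become more flexible, which is why held-out evaluation is central to machine learning practice.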
In addition, the course will touch on how to use version control systems to facilitate research accountability, which is nowadays essential in any data-based research project.
By taking this course, participants will have a foundation for conducting data science projects using R. In particular, they will learn:
- Advanced data management using R (tidyverse, data.table, and databases)
- How to acquire data from the Internet (web scraping and accessing APIs)
- Fundamentals of machine learning
- Parallel computing in R
- How to manage data projects using version control systems (e.g. Git with GitHub)
- R for Data Science (R4DS), by Hadley Wickham (O’Reilly, available at: https://r4ds.had.co.nz/ )
- An Introduction to Statistical Learning with Applications in R (ISL), Second Edition, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (Springer, pdf available at the book website: https://www.statlearning.com/ )
- The caret package, by Max Kuhn (available at: https://topepo.github.io/caret/)
- Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining, by Simon Munzert, Christian Rubba, Peter Meißner, and Dominic Nyhuis (Wiley)
- Learning SQL (3rd ed), by Alan Beaulieu (O’Reilly), or any introductory SQL book (or free online materials).
This course assumes that the participant has some knowledge of data analysis. The knowledge required is:
- Regression analysis (linear regression, logistic regression, etc.)
- Basic knowledge of the statistical language R
For regression analysis, it is recommended that students have taken an undergraduate or graduate course for social scientists; for R, advanced knowledge is not required, but the ability to read and write files, manipulate data frames, and perform some statistical analysis (e.g., mean difference test, linear regression, etc.) is desirable. Please consult with the instructor if you are unsure about any of the prerequisite knowledge.