Waseda University is a large, private university with a main campus located in Shinjuku, Tokyo, Japan. First established in 1882 as the Tōkyō Senmon Gakkō (Tōkyō College) by Ōkuma Shigenobu, the school obtained university accreditation and was formally renamed Waseda University in 1902. The university consists of 13 undergraduate schools and 23 graduate schools. Waseda is one of a select group of 13 top universities assigned additional funding under the Japanese Ministry of Education, Culture, Sports, Science and Technology's "Top Global Universities" Project. Waseda consistently ranks among the most academically selective and well-regarded universities in Japanese university rankings.

**Courses**

**Course 1: Maximum Likelihood Estimation** (22 hours)

Instructor: Dr Daina Chiba

Location: Waseda campus, Waseda University, Japan

Tuition Fees:

£200 for internal Waseda applicants

£300 for applicants from external institutions

Application deadline: 23 August.

**Course 2: Quantitative Text Analysis** (22 hours)

Instructor: Nicole Baerg

Location: Waseda campus, Waseda University, Japan

Tuition Fees:

£200 for internal Waseda applicants

£300 for applicants from external institutions

Application deadline: 23 August.

Please Note:

• Students need to make lodging and travel arrangements on their own;

• Waseda cannot sponsor a student visa.

**COURSE DESCRIPTIONS**

**Maximum Likelihood Estimation**

**Instructor**

Daina Chiba is a Senior Lecturer in the Department of Government at the University of Essex. A graduate of Rice University, he completed his postdoctoral fellowship at Duke University. His research interests encompass the areas of militarized conflict, international institutions, and political methodology. His work has appeared in Political Analysis, American Journal of Political Science, Journal of Politics, Political Science Research and Methods, Journal of Conflict Resolution, and Journal of Peace Research.

**Course Content**

In this course, students will learn how to build statistical models to explain the variation of a categorical (binary, ordinal, nominal) dependent variable by properly specifying a likelihood function appropriate to their theory and data. They will then learn how to estimate the unknown parameters of these models using maximum likelihood estimation and how to produce measures of uncertainty (standard errors). Next, they will learn how to use the parameter estimates to interpret a model's substantive implications, mainly by calculating substantive effects of the form "my estimates suggest an additional year of education would increase an individual's chance of turning out to vote by 3%." Finally, students will learn how to use simulation techniques to put confidence intervals around these substantive effects, yielding statements of the form "my estimates suggest an additional year of education would increase an individual's chance of turning out to vote by 3%, plus or minus 1%." Throughout the course there will be an emphasis on how best to describe and explain the models students build and how to communicate their substantive implications to a broad academic audience.

The foundation of building a statistical model is proper development of a likelihood function, and that requires an understanding of probability distributions. Thus, we will start with a brief introduction to probability theory at a level appropriate for students with no background in it. The specific models we will subsequently cover are the Bernoulli-logistic model (logit), the normal-linear model (regression), ordered logit, multinomial logit, and event count models (e.g., Poisson, negative binomial).
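The workflow described in the two paragraphs above (specify a likelihood, maximize it, obtain standard errors from the information matrix, then simulate to attach a confidence interval to a substantive effect) can be sketched in a few dozen lines. The course itself uses R; the sketch below is a minimal Python illustration on invented simulated data, and every number in it (the education range, the "true" coefficients) is hypothetical, not course material.

```python
import math
import random

random.seed(42)

# Hypothetical simulated data: does education (in years) predict
# turning out to vote (1) or not (0)?  The "true" coefficients are
# invented purely for illustration.
n = 500
true_b0, true_b1 = -2.0, 0.15
data = []
for _ in range(n):
    educ = random.uniform(8, 20)
    p = 1 / (1 + math.exp(-(true_b0 + true_b1 * educ)))
    data.append((educ, 1 if random.random() < p else 0))

# Maximize the Bernoulli-logistic log-likelihood by Newton-Raphson.
# Gradient: sum (y - p) x;  information matrix: sum p(1-p) x x'.
b0 = b1 = 0.0
for _ in range(25):
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in data:
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        g0 += y - p
        g1 += (y - p) * x
        w = p * (1 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    det = h00 * h11 - h01 * h01
    b0 += (h11 * g0 - h01 * g1) / det  # Newton step: beta += H^-1 g
    b1 += (h00 * g1 - h01 * g0) / det

# Standard errors come from the inverse information matrix.
cov00, cov01, cov11 = h11 / det, -h01 / det, h00 / det
se_b1 = math.sqrt(cov11)

def prob(b0_, b1_, x):
    return 1 / (1 + math.exp(-(b0_ + b1_ * x)))

# Substantive effect: change in Pr(vote) for one extra year of
# education, evaluated at 12 years.
effect = prob(b0, b1, 13) - prob(b0, b1, 12)

# Simulation: draw coefficients from their approximate sampling
# distribution (via a 2x2 Cholesky factor of the covariance matrix)
# and recompute the effect each time.
l11 = math.sqrt(cov00)
l21 = cov01 / l11
l22 = math.sqrt(cov11 - l21 * l21)
sims = []
for _ in range(1000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    s0, s1 = b0 + l11 * z1, b1 + l21 * z1 + l22 * z2
    sims.append(prob(s0, s1, 13) - prob(s0, s1, 12))
sims.sort()
ci_lo, ci_hi = sims[25], sims[974]  # 95% interval by percentiles
```

The final two quantities are exactly the kind of statement the course description promises: an estimated effect of one more year of education on turnout, plus a simulated confidence interval around it.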

**Objectives**

After finishing this course students should be able to use a wide variety of statistical models in their own work, understand the underlying assumptions of these models, be able to explain the ways in which the models are appropriate or not for the theory and data at hand, and to develop and interpret the substantive implications of the statistical estimates produced by these models.

**Prerequisites**

The course should be taken subsequent to a course on linear regression using OLS. Knowledge of basic calculus will be useful, though not strictly essential. No matrix algebra will be required. That said, statistical models are mathematical models, so we will use a lot of basic algebra and mathematical notation in order to formalize our theoretical intuitions into mathematical (statistical) models. Students should be ready to consume and produce models presented in this way.

**Representative Background Reading**

Gary King. 1998. Unifying Political Methodology: The Likelihood Theory of Statistical Inference. Ann Arbor: University of Michigan Press.

**Statistical Software**

R

**Quantitative Text Analysis**

**Instructor**

Nicole Baerg is a Lecturer (Assistant Professor) in the Department of Government, University of Essex. She obtained her PhD in Politics at Emory University and specializes in methodology and political economy. Her work has appeared in Political Science Research and Methods, Comparative Political Studies, and Economics and Politics. She is the author of Crafting Consensus: Why Central Bankers Change Their Speech and How Speech Changes the Economy (Oxford University Press, forthcoming).

**Course content**

The course surveys methods for systematically extracting quantitative information from text for social scientific purposes, ranging from classical content analysis and dictionary-based methods to classification methods, scaling methods, topic models, and word embeddings. The course will focus on introducing students to a typical machine-learning workflow.
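To make the dictionary-based end of that spectrum concrete, the sketch below scores a sentence against two hypothetical word lists. The word lists and the normalized scoring rule are invented for illustration; they are not the dictionaries used in the course.

```python
# Hypothetical dictionary-based tone score: count matches against
# invented positive/negative word lists, normalized by document length.
positive = {"growth", "strong", "gain", "improve"}
negative = {"crisis", "weak", "loss", "decline"}

def tone(text):
    words = text.lower().split()
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    return (pos - neg) / max(len(words), 1)

score = tone("Strong growth despite a weak quarter")  # (2 - 1) / 6
```

Dictionary methods like this require no training data, which is why the survey starts there before moving to the supervised and unsupervised methods that dominate the rest of the course.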

One important problem with big data is finding meaningful associations in texts while automating as much of the process as possible. Working with texts consists of assigning textual documents to one or more categories based on the content of each document. In this course, we will examine three phases of text (or document) classification:

1) Text annotation: how to annotate important features of text and how to use this information for document classification;

2) Training: supervised, semi-supervised, and unsupervised approaches to text and document classification;

3) Prediction (or classification): classifying your texts and validating performance and accuracy.
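The three phases just described can be illustrated end to end with a toy supervised example. The sketch below uses an invented four-document training corpus and a multinomial Naive Bayes classifier, which is just one of many supervised approaches the course covers, and it is written in Python rather than the R used in the labs.

```python
import math
from collections import Counter, defaultdict

# Phase 1, annotation: each (invented) document carries a
# hand-assigned topic label, and its features are lowercase words.
train_docs = [
    ("the economy is growing and markets are strong", "economy"),
    ("inflation and interest rates worry investors", "economy"),
    ("the election campaign and voters decide parliament", "politics"),
    ("parties debate policy before the vote", "politics"),
]

def tokenize(text):
    return text.lower().split()

# Phase 2, training: multinomial Naive Bayes with Laplace smoothing.
class_counts = Counter()
word_counts = defaultdict(Counter)
vocab = set()
for text, label in train_docs:
    class_counts[label] += 1
    for w in tokenize(text):
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    # Phase 3, prediction: pick the class with the highest log posterior.
    best, best_lp = None, -math.inf
    n_docs = sum(class_counts.values())
    for label in class_counts:
        lp = math.log(class_counts[label] / n_docs)
        total = sum(word_counts[label].values())
        for w in tokenize(text):
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Validation: accuracy on two held-out (also invented) documents.
test_docs = [("markets and inflation rise", "economy"),
             ("voters choose a new parliament", "politics")]
accuracy = sum(predict(t) == y for t, y in test_docs) / len(test_docs)
```

The held-out accuracy computed at the end is the simplest instance of the validation step; the course treats performance measurement in much more depth.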

**Objectives**

The course also covers many fundamental issues in machine learning and quantitative text analysis, such as data management, labeling, inter-coder agreement, reliability, validation, accuracy, and precision.

**Prerequisites**

Students in this course should have prior knowledge in the following areas:

1) A basic understanding of probability and statistics. Understanding of regression analysis is presumed. Some basic understanding of maximum likelihood would be useful.

2) Basic familiarity with the R statistical language. The lab sessions will be designed to use R coupled with various R packages.

**Representative Background Reading**

James, G., Witten, D., Hastie, T., & Tibshirani, R. 2013. An Introduction to Statistical Learning: With Applications in R. New York: Springer.

Salganik, Matthew J. 2017. Bit by Bit: Social Research in the Digital Age. Princeton, NJ: Princeton University Press. Open review edition.

**Background Knowledge Required**

| Area | Topic | Level |
| --- | --- | --- |
| Statistics | OLS | m |
| Statistics | Maximum Likelihood | e |
| Computing | R | e |

Key: e = elementary, m = moderate, s = strong