Dr. Robert W. Walker is Associate Professor of Quantitative Methods in the Atkinson Graduate School of Management at Willamette University (2012-). He earned a Ph.D. in political science from the University of Rochester in 2005 and has previously held teaching positions at Dartmouth College, Rice University, Texas A&M University, and Washington University in Saint Louis. His current research develops and applies semi-Markov processes to time-series cross-section data in international relations and international/comparative political economy. He teaches courses in quantitative methods/applied statistics and microeconomic strategy and previously taught four iterations in the U.S. National Science Foundation funded Empirical Implications of Theoretical Models sequence at Washington University in Saint Louis.

Course Content
This course is designed for students who already have training in basic statistics and knowledge of linear regression analysis. The course deals with problems arising from combining the time and space dimensions in statistical data analysis. We will work with aggregated time-series cross-sectional data, e.g., countries, firms, or individuals observed over time. This data structure has the advantage of allowing tests of highly general theories with a wide scope, but it complicates data analysis because one has to consider the time-series aspects (dynamics) and the cross-sectional aspects (spatial correlation and unit heterogeneity) at the same time. The course examines the problems arising from this complex data structure and provides techniques to control and account for specific complications. We will start by discussing the characteristics and types of pooled data and the underlying assumptions of basic statistical models for panel data. We then address specification problems such as complex error structures, different kinds of heterogeneity (e.g., unit and slope), dynamic specification issues (lag structures), missing data, spatial heterogeneity and dependency, and time-invariant and rarely changing variables in panel data analysis with correlated unit-specific effects, among others. Furthermore, we will look at different data-generating processes and adequate estimation procedures for limited dependent variables. The course combines a more theoretical introduction with practical analysis of diverse data sets using Stata. Students are encouraged to bring their own data sets and present their research puzzles as motivating examples.

Course Objectives
The course requires knowledge of inferential statistics and considerable linear algebra (matrices) and is designed to further develop the understanding of statistical problems arising from the complex structure of pooled data. The course mostly deals with questions of specification and model choice; it is a practical course that enables students to link empirical models more tightly with their theoretical arguments and to make model choices that are adequate for the data structure at hand. The course materials are designed to help participants solve their own estimation problems and increase the reliability and efficiency of their statistical results. The course is targeted at social scientists and business academics with average (or better) statistical skills and a strong interest in applied empirical research and data analysis. The focus lies on practical problems of macro panel data analysis.

Course Prerequisites
The course requires average or better skills and knowledge in inferential statistics, including a basic understanding of maximum likelihood and generalized linear estimation methods. In addition, participants should have an understanding of matrix algebra and a basic familiarity with Stata. The course is designed to build on a good working knowledge of cross-section multiple regression models and basic multivariate time-series models. This includes knowledge of the underlying assumptions of basic linear models (principally stationarity) and how to deal with violations (heteroskedasticity, autocorrelation) of the Gauss-Markov assumptions. Participants should be able to interpret regression coefficients, standard errors and significance tests.

Representative Articles and Texts
Beck, Nathaniel 2001: Time-Series-Cross-Section Data: What Have We Learned in the Past Few Years? Annual Review of Political Science 4: 271-293.

Beck, Nathaniel and Jonathan N. Katz 1995: What to Do (and Not to Do) with Time-Series-Cross-Section Data in Comparative Politics. American Political Science Review 89: 634-647.

Beck, Nathaniel and Jonathan N. Katz 2007: Random Coefficient Models for Time-Series-Cross-Section Data: Monte Carlo Experiments. Political Analysis 15: 182-195.

Plümper, Thomas and Vera E. Troeger 2007: Efficient Estimation of Time-Invariant and Rarely Changing Variables in Finite Sample Panel Analyses with Unit Fixed Effects. Political Analysis 15: 124-139.

Plümper, Thomas, Vera E. Troeger and Philip Manow 2005: Panel Data Analysis in Comparative Politics: Linking Method to Theory. European Journal of Political Research 44: 327-354.

Wawro, Gregory 2002: Estimating Dynamic Panel Data Models in Political Science. Political Analysis 10: 25-48.

Wilson, Sven E. and Daniel M. Butler 2007: A Lot More to Do: The Sensitivity of Time-Series Cross-Section Analyses to Simple Alternative Specifications. Political Analysis 15: 101-123.

Wooldridge, Jeffrey M. 2002: Econometric Analysis of Cross Section and Panel Data, MIT Press, Cambridge.

Notes on Readings
Most basic and introductory econometrics textbooks include a chapter on panel data and pooled models. Wooldridge (2002), Hsiao (2003) and Baltagi (2008) are more advanced and technical treatments, the latter two with exclusive emphasis on panel data. Beck and Katz (1995) develop the workhorse model for political scientists. Beck (2001) summarizes the discussion of pooled analysis in political science. Specification and conceptual issues are discussed in Plümper et al. (2005). Wawro (2002) and Wilson and Butler (2007) introduce issues arising from the dynamic dimension of panel data and compare different models and specifications for dynamic panel data. Beck and Katz (2007) discuss slope heterogeneity and the application of random coefficient models. Plümper and Troeger (2007) raise the issue of time-invariant and rarely changing variables in panel data analysis with correlated unit-specific effects and suggest a solution to the problem.

Background knowledge required
Maximum Likelihood = e

Computer Background
Stata = e

e = elementary, m = moderate, s = strong

Daily Schedule and Readings

Week 1

Day 1
Regression Overview, Introduction to Pooling/Time Series:
Berk and Freedman (2003) and Hsiao (N.d.) (optional: Hicks (1994)).
Key Issue: T = B + W
Exercise: Summarizing data in Stata.
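
A minimal sketch of the Day 1 key issue, assuming that T = B + W denotes the split of total variation into between-unit and within-unit parts. The syllabus does not spell this out, and the course itself uses Stata; Python with NumPy is used here only for illustration. The function name and the toy panel are hypothetical.

```python
import numpy as np

def decompose_variation(x, unit):
    """Split the total sum of squares of x into between-unit and within-unit parts."""
    x = np.asarray(x, dtype=float)
    unit = np.asarray(unit)
    grand_mean = x.mean()
    total = np.sum((x - grand_mean) ** 2)
    between = 0.0
    within = 0.0
    for u in np.unique(unit):
        xi = x[unit == u]
        # Between: each unit's mean against the grand mean, weighted by its size.
        between += xi.size * (xi.mean() - grand_mean) ** 2
        # Within: each observation against its own unit's mean.
        within += np.sum((xi - xi.mean()) ** 2)
    return total, between, within

# Hypothetical toy panel: 3 units observed for 4 periods each.
unit = np.repeat(["A", "B", "C"], 4)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 6.0, 8.0, 2.0, 2.0, 3.0, 1.0])

total, between, within = decompose_variation(x, unit)
print(f"T = {total:.1f}, B = {between:.1f}, W = {within:.1f}")
```

In this toy panel most of the variation is between units, which is the kind of pattern that summarizing panel data is meant to reveal before any model is chosen.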

Day 2
Easing Us In: Models for TSCS/CSTS Data:
Wooldridge, ch. 10; Stimson (1985); Beck (2001); Beck and Katz (1995)
Key Issue: Separating Dimensions and Effects
Exercise: Estimate the basic models in Stata for replication data.
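
As a hedged illustration of why separating dimensions matters, the sketch below compares a pooled OLS slope with a within (unit fixed effects) slope on simulated data. Everything here is an assumption made for the example: the data-generating process, the true slope of 2.0, and the helper names are hypothetical and are not taken from the course's replication data, and Python stands in for the Stata used in the lab.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 50, 10
unit = np.repeat(np.arange(n_units), n_periods)

# Simulated panel (hypothetical): the unit effects alpha are correlated with x.
alpha = rng.normal(size=n_units)
x = alpha[unit] + rng.normal(size=unit.size)
y = 2.0 * x + 3.0 * alpha[unit] + rng.normal(size=unit.size)

def slope(xv, yv):
    """OLS slope of yv on xv in a regression that includes an intercept."""
    xc = xv - xv.mean()
    return float(xc @ (yv - yv.mean()) / (xc @ xc))

def demean_by_unit(v, unit):
    """Subtract each unit's own mean (the within transformation)."""
    means = np.bincount(unit, weights=v) / np.bincount(unit)
    return v - means[unit]

pooled = slope(x, y)  # ignores unit effects, so it absorbs their correlation with x
within = slope(demean_by_unit(x, unit), demean_by_unit(y, unit))  # uses within-unit variation only
print(f"pooled: {pooled:.2f}, within: {within:.2f}")
```

Under these assumptions the pooled slope is pulled away from 2.0 because it mixes between-unit and within-unit variation, while the within slope stays close to it.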

Day 3
Unit Heterogeneity and Slope Heterogeneity:
Hsiao, ch. 6; Mundlak (1978); Hausman (1978); Beck and Katz (2007) [optional: Troeger (N.d.)].
Key Issue: What models do we compare and how?
Exercise: Implementation: What model would you choose? Why? Monte Carlo Simulation and Bootstraps.

Day 4
Recapping and Getting Practical
Wilson and Butler (2007); Plümper, Troeger and Manow (2005)
Key Issue: Work backward from substance.
Exercise: Mixtures and Comparison

Day 5
A Pre-Weekend Think: Dynamics and Thinking about Time:
Enders, ch. 2; Beck and Katz (N.d.); Whitten and Williams (2012)
Key Issue: What do dynamics mean WITH heterogeneity?
Exercise: The range and diversity of dynamic models and interpretation.

Week 2

Day 6
Dynamic Panel Data Estimators (With a little IV)
Cameron and Trivedi, ch. 22; Plümper and Troeger (2007); Wawro (2002)
Key Issue: Valid Instruments and Instrumentation in two dimensions.
Exercise: Estimating DPDs and FEVD.

Day 7
Exploring Missing Data and Missingness
Honaker and King (2010) and Horton and Kleinman (2007)
Key Issue: Missing Data are nasty but 2-D gives leverage.
Exercise: Imputation and Combination

Day 8
To Generic Data, Part I
Baltagi, ch. 11; (Dirty Pool controversy *).
Key Issue: Information and Fixed Effects
Exercise: Replicate the Dirty Pool (DP) analysis.

Day 9
To Generic Data, Part II
Arellano, Appendix; Beck, Katz and Tucker (1998); Carter and Signorino (2010); Beck et al. (N.d.).
Key Issue: Dynamics are interesting with limited outcomes.
Exercise: Comparing non-nested models?

Day 10
Causation in a Panel Setting and Course Review
Hood, Kidd and Morris (2008); Blackwell and Glynn (N.d.)
Key Issue: Causation and order in time are a crucial source of potential leverage.
Exercise: Applications in the lab

*: Skim the International Organization debate, including Green, Kim and Yoon (2001), Oneal and Russett (2001), Beck and Katz (2001), and King (2001).


Beck, Nathaniel. 2001. “Time-Series-Cross-Section Data: What Have We Learned in the Past Few Years?” Annual Review of Political Science 4:271–93.
URL: http://www.nyu.edu/gsas/dept/politics/seminars/beck_tscs.pdf

Beck, Nathaniel, David Epstein, Simon Jackman and Sharyn O’Halloran. N.d. “Alternative Models of Dynamics in Binary Time-Series-Cross-Section Models: The Example of State Failure.” Paper presented at the 2001 Annual Meeting of the Society for Political Methodology, Emory University (Draft: July 12, 2002).
URL: http://www.nyu.edu/gsas/dept/politics/faculty/beck/emory.pdf

Beck, Nathaniel and Jonathan N. Katz. 2001. “Throwing out the Baby with the Bath Water: A Comment on Green, Kim, and Yoon.” International Organization 55(2):487–495.
URL: http://www.jstor.org/stable/3078640

Beck, Nathaniel, Jonathan N. Katz and Richard Tucker. 1998. “Taking time seriously: Time-series-cross-section analysis with a binary dependent variable.” American Journal of Political Science 42(4):1260–1288.
URL: http://www.jstor.org/stable/2991857

Beck, Nathaniel L. and Jonathan Katz. N.d. “Modeling Dynamics in Time-Series-Cross-Section Political Economy Data.” California Institute of Technology Social Science Working Paper 1304 (June 2009).

Beck, Nathaniel L. and Jonathan N. Katz. 1995. “What to Do (and Not to Do) with Time-Series-Cross-Section Data in Comparative Politics.” American Political Science Review 89(3):634–647.
URL: http://www.jstor.org/stable/2082979

Beck, Nathaniel L. and Jonathan N. Katz. 2007. “Random Coefficient Models for Time-Series-Cross-Section Data: Monte Carlo Experiments.” Political Analysis 15(2):182–95.
URL: http://pan.oxfordjournals.org/cgi/reprint/15/2/182

Berk, R. A. and D. A. Freedman. 2003. Statistical Assumptions as Empirical Commitments. In Law, Punishment, and Social Control: Essays in Honor of Sheldon Messinger, ed. T. G. Blomberg and S. Cohen. Second ed. Aldine de Gruyter chapter 10, pp. 235–54.
URL: http://stat-www.berkeley.edu/~census/berk2.pdf

Blackwell, Matthew and Adam Glynn. N.d. “How to Make Causal Inferences with Time-Series Cross-Sectional Data.” version: July 12, 2013.

Carter, David B. and Curtis S. Signorino. 2010. “Back to the Future: Modelling Time Dependence in Binary Data.” Political Analysis 18(3):271–292.
URL: http://pan.oxfordjournals.org/content/18/3/271.abstract

Green, Donald P., Soo Yeon Kim and David H. Yoon. 2001. “Dirty Pool.” International Organization 55(2):441–68.
URL: http://www.jstor.org/stable/3078638

Hausman, J. A. 1978. “Specification Tests in Econometrics.” Econometrica 46(6):1251–71.
URL: http://www.jstor.org/stable/1913827

Hicks, Alexander M. 1994. Introduction to Pooling. In The Comparative Political Economy of the Welfare State, ed. Thomas Janoski and Alexander M. Hicks. Cambridge University Press.

Honaker, James and Gary King. 2010. “What to do About Missing Values in Time Series Cross-Section Data.” American Journal of Political Science 54:561–581.
URL: http://gking.harvard.edu/files/abs/pr-abs.shtml

Hood, M. V., Q. Kidd and I. L. Morris. 2008. “Two Sides of the Same Coin: Employing Granger Causality Testing in a Time Series Cross-Section Framework.” Political Analysis 16(3):324–44.

Horton, Nicholas J. and Ken P. Kleinman. 2007. “Much ado about nothing: A comparison of missing data methods and software to fit incomplete data regression models.” The American Statistician 61(1):79–90.

Hsiao, C. N.d. “Why Panel Data?” Institute for Economic Policy Research 05.33.
URL: http://ideas.repec.org/p/scp/wpaper/05-33.html

King, Gary. 2001. “Proper Nouns and Methodological Propriety: Pooling Dyads in International Relations Data.” International Organization 55(2):497–507.
URL: http://www.jstor.org/stable/3078641

Mundlak, Yair. 1978. “On the Pooling of Time Series and Cross Section Data.” Econometrica 46(1):69–85.
URL: http://www.jstor.org/stable/1913646

Oneal, John R. and Bruce Russett. 2001. “Clear and Clean: The Fixed Effects of the Liberal Peace.” International Organization 55(2):469–485.
URL: http://www.jstor.org/stable/3078639

Plümper, Thomas and Vera Troeger. 2007. “Efficient Estimation of Time-Invariant and Rarely Changing Variables in Finite Sample Panel Analyses with Unit Fixed Effects.” Political Analysis 15(2):124–139.
URL: http://pan.oxfordjournals.org/cgi/reprint/15/2/124

Plümper, Thomas, Vera Troeger and Philip Manow. 2005. “Panel data analysis in comparative politics: Linking method to theory.” European Journal of Political Research 44:327–54.

Stimson, James A. 1985. “Regression in Space and Time: A Statistical Essay.” American Journal of Political Science 29(4):914–47.
URL: http://www.jstor.org/stable/2111187

Troeger, Vera. N.d. “Problematic Choices.” Paper presented at the Annual Meetings of the American Political Science Association, Toronto, ON.

Wawro, Gregory. 2002. “Estimating Dynamic Panel Data Models in Political Science.” Political Analysis 10(1):25–48.
URL: http://pan.oxfordjournals.org/cgi/reprint/10/1/25

Whitten, Guy B. and Laron D. Williams. 2012. “But Wait, There’s More: Maximising Substantive Inferences from TSCS Models.” Journal of Politics 74(3):685–93.

Wilson, Sven E. and Daniel M. Butler. 2007. “A Lot More to Do: The Sensitivity of Time-Series Cross-Section Analyses to Simple Alternative Specifications.” Political Analysis 15(2):101–23.
URL: http://pan.oxfordjournals.org/cgi/reprint/15/2/101