Gina Yannitell Reinhardt is a Senior Lecturer in the Government Department at the University of Essex. She came to the University in 2015 from Texas A&M University, where she taught for 10 years in the Bush School of Government and Public Service. She studies disaster resilience and international development, and is founding the Global South Academic Network to help build capabilities in developing countries.

Course Content:
This course provides the tools needed to manage and work with large databases through Stata programming. It is designed for new and intermediate Stata users who want to acquire advanced skills in data management and programming, with a focus on skills relevant to social science data analysis.

Course Objectives
Those completing this course should be able to:
1. Perform database management and estimation tasks using Stata.
2. Understand and use Stata programming routines and user-contributed .ado files.
3. Interpret Stata output.
4. Program new commands in Stata, from simple procedural commands to more complex estimation commands.
5. Install and use packages and produce graphics.

Course Prerequisites:
This is a course designed for new and intermediate Stata users who have a basic understanding of econometric analysis. Although participants’ familiarity with Stata may be introductory, it is expected that participants will be familiar with fundamental statistical concepts and vocabulary such as OLS, heteroskedasticity, and t-tests.

Representative Background Reading:
You are expected to have knowledge of descriptive statistics, statistical significance, correlation, and sampling and estimation. An introductory statistics book from your field will cover these issues. One example would be:

Sirkin, R. Mark. Statistics for the Social Sciences. Sage Publications, 2005.

Required texts:
Baum, Christopher F. An Introduction to Stata Programming. Vol. 2. College Station: Stata Press, 2009.

Background knowledge required
OLS = s
Maximum Likelihood = e

Computer Background
Stata = e

e = elementary, m = moderate, s = strong

Classes will meet for ten sessions. Each session will include demonstrations and exercises. The exercises are based on practical examples that may or may not be particularly meaningful to your specific research interests. I therefore encourage you to raise examples based on your own data and research needs. I am flexible and happy to move at the pace you desire. If there is anything specific you wish to know, or material for which you would like greater detail, I will do my best to accommodate these requests.

Computer Software
Computer-based exercises will feature prominently in the course, especially in the lab sessions. The use of all software tools will be explained in the sessions, including how to download plug-ins and add-ons. We will be working in Stata. It is recommended that you purchase and install a perpetual license for the most recent version of Stata SE on your personal laptop. If you would like to consider a different version, other options are available. It will be possible to conduct all work for this class on University lab computers. If you do purchase your own license, please make sure Stata is installed on your laptop before the first class.

Recommended Texts and Potentially Helpful Resources
• Cox, Nicholas J. and H. Joseph Newton. One Hundred Nineteen Stata Tips, Third Edition
• Long, J. Scott. The Workflow of Data Analysis Using Stata. College Station: Stata Press, 2009.
• Getting Started with Stata for Windows [or Mac, or Unix]. (freely available online)
• Acock, Alan C. A Gentle Introduction to Stata, 4th edition, 2014.
• The UCLA Stata guide (online)
• Baum, Christopher F. An Introduction to Modern Econometrics Using Stata. College Station: Stata Press, 2006.

Detailed Course Schedule

Day 1: Getting Started with Stata
• Course goals and logistics; topics overview
• Working with Stata: the Stata environment
• Help files, online PDF documentation
• Data import and description
• Do-files

Required Reading:
• Baum
o Chapter 1 (entire);
o Chapter 2 (entire)
Recommended Reading:
• Long, Chapter 3 and Appendix A
• Getting Started with Stata for Windows/Mac/Unix
Lab session:
• Exercise 1: Getting started with Stata and do-files

Day 2: Database Manipulation and Basic Do-File Programming
• Describe, summarize, tabulate
• Cleaning data (missing values, changing variable type, naming and labelling variables)
• Logical expressions
• Generating new variables, dropping variables and observations
• Dummy variables
• Recoding, sorting, and combining datasets
• Beginning macros
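
As a rough illustration of the Day 2 topics, the sketch below assumes Stata's bundled auto dataset; the variable names (price, mpg, rep78, trunk) come from that dataset, and the new names highmpg and rep3 are hypothetical. It is not taken from the course exercises.

```stata
* Load the example dataset shipped with Stata (assumed here for illustration)
sysuse auto, clear

* Describe, summarize, tabulate
describe
summarize price mpg
tabulate rep78, missing

* Dummy variable from a logical expression; missing mpg stays missing
generate byte highmpg = (mpg > 25) if !missing(mpg)
label variable highmpg "MPG above 25"

* Recode into a new variable, then sort
recode rep78 (1 2 = 1) (3 = 2) (4 5 = 3), generate(rep3)
sort price

* Drop a variable, then drop observations with a missing repair record
drop trunk
drop if missing(rep78)
```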

Required Reading:
• Baum
o Chapter 3 (entire);
o Chapter 5, Sections 5.6-5.9
Recommended Reading:
• Long, Chapters 5-6
• Acock, Chapter 3
Lab session:
• Exercise 2: Preparing your data

Day 3: Basic Statistical Routines, Regression and Post-Regression Analysis
• Mean, standard deviation, median, mode, correlation
• T-tests
• Cross-tabulation, Chi-squared test
• OLS, Logit
• Saving results

Required Reading:
• Baum, Chapter 4 (entire)
Recommended Reading:
• Acock, Chapters 5-8, 10-11
Lab session:
• Exercise 3: Basic Analysis and Description

Day 4: Advanced Estimation Methods and Commands
• Time Series (tsset; date and time; time series operators)
• Panel Data (wide v. long; reshape; xtset and xtdes)
• ‘by’ and ‘collapse’ commands
• 2SLS
• Begin presenting results

Required Reading:
• Baum
o Chapter 5, Sections 5.1-5.5
o Chapters 6 and 8 (entire)
Recommended Reading:
• Baum 2006, Chapters 7-10
Lab session:
• Exercise 4: Time, Panels, and Nested Models

Day 5: Presenting Results
• Tables
• Regression results
• Graphics (scatter, line, CDF, histogram, combining graphs)

Required Reading:
• Baum
o Chapter 2, Sections 2.4-2.6;
o Chapter 3, Section 3.5
Recommended Reading:
• Long, Section 7.7
• Acock, Chapters 5-6
• A Visual Guide to Stata Graphics, 3rd edition by Michael N. Mitchell, 2012
Lab session:
• Exercise 5: Generating Tables and Graphs

Day 6: Programming Basics: Prefixes, Loops, and Lists
• Comments
• More with Macros
• Loops (foreach; forvalues)
• ‘if’ condition
• Combining loops and macros
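
To show how the Day 6 topics fit together, here is a minimal sketch, again assuming Stata's bundled auto dataset rather than any course data. It stores a variable list in a local macro, loops over it with foreach, and combines forvalues with an if condition.

```stata
* Load the example dataset shipped with Stata (assumed here for illustration)
sysuse auto, clear

* Store a list of variable names in a local macro
local vars price mpg weight

* foreach: summarize each variable and display its mean
foreach v of local vars {
    quietly summarize `v'
    display "`v': mean = " %9.2f r(mean)
}

* forvalues with an if condition: report only non-empty repair-record levels
forvalues i = 1/5 {
    quietly count if rep78 == `i'
    if r(N) > 0 {
        display "rep78 = `i': " r(N) " cars"
    }
}
```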

Required Reading:
• Baum, Chapter 7 (entire)
Recommended Reading:
• Long, Chapters 4, 7
• Baum 2006, Appendix B
Lab session:
• Exercise 6: Looping Exercise

Day 7: Extended Do-File Programming
• Generating datasets
• Exporting variable contents
• Creating flexible output files
• Automating standard-format tables and graphs
• Marginal Effects

Required Reading:
• Baum
o Chapters 9-10
Lab session:
• Exercise 7: Generating and Plotting Marginal Effects

Days 8-9: Writing Stata Programs (ado-files)
• Creating/defining a program
• Implementing program options
• Documenting and Certifying programs
• Macro shift (where number of loops is variable)
• Naming and debugging programs
• Comments and long lines
• Arguments
• Programs with return values and other options
• Help files and publishing programs
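
For orientation, the sketch below shows what a small program with options and return values might look like. The program name meanrange and its format() option are hypothetical, the version number is an assumption to adjust for your Stata release, and the example again relies on the bundled auto dataset. In practice the program definition would be saved in its own file, meanrange.ado.

```stata
* Hypothetical r-class program; the name "meanrange" is illustrative only
capture program drop meanrange
program define meanrange, rclass
    version 12
    * Parse one numeric variable, optional if/in, and an optional display format
    syntax varname(numeric) [if] [in] [, Format(string)]
    marksample touse
    if "`format'" == "" local format %9.2f

    quietly summarize `varlist' if `touse'
    display as text "Mean of `varlist': " as result `format' r(mean)

    * Return values so the caller can use them afterwards
    return scalar mean  = r(mean)
    return scalar range = r(max) - r(min)
end

* Example call, followed by a listing of the returned results
sysuse auto, clear
meanrange price if foreign == 1, format(%9.1f)
return list
```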

Required Reading:
• Baum, Chapters 11-12
Lab session:
• Exercise 8: Welcome to the ado!
• Exercise 9: More work with ado-files

Day 10: Choosing the Right Tools for your Needs
• TBA – topics of interest highlighted during Week 1

Required Reading:
• TBA – topics of interest highlighted during Week 1
Lab session:
• Exercise 10: TBA – topics of interest highlighted during Week 1