Gina Yannitell Reinhardt is a Senior Lecturer in the Government Department at the University of Essex. She joined the University in 2015 after 10 years in the Bush School of Government and Public Service at Texas A&M University. She studies disaster resilience and international development, and is beginning work on a new €5 million project gauging resilience among elderly and isolated populations in the UK and France. She has founded the GLOBAL SOUTH ACADEMIC NETWORK and DISASTER AND EMERGENCY RESEARCH NETWORK to foster collaborative research on resilience and adaptation between scholars and organisations in the developing and developed worlds.
The purpose of this course is to teach participants to manage and work with large databases using Stata's programming tools. The course is designed for new and intermediate Stata users who want to acquire advanced skills in data management and programming in Stata. The course focuses on skills relevant to social science data analysis.
Those completing this course should be able to:
1. Perform database management and estimation tasks using Stata.
2. Understand and use Stata programming routines and user-contributed .ado files.
3. Interpret Stata output.
4. Program new commands in Stata, from simple procedural commands to more complex estimation commands.
5. Install and use packages and produce graphics.
This is a course designed for new and intermediate Stata users who have a basic understanding of econometric analysis. Although participants’ familiarity with Stata may be introductory, it is expected that participants will be familiar with fundamental statistical concepts and vocabulary such as OLS, heteroskedasticity, and t-tests.
Representative Background Reading
You are expected to have knowledge of descriptive statistics, statistical significance, correlation, and sampling and estimation. An introductory statistics book from your field will cover these issues. One example would be:
Sirkin, R. Mark. Statistics for the Social Sciences. Sage Publications, 2005.
The following text will be provided by the Summer School as part of your course material and used throughout the course:
Baum, Christopher F. An Introduction to Stata Programming. Vol. 2. College Station: Stata Press, 2009.
Background knowledge required
OLS = s
Maximum Likelihood = e
Stata = e
e = elementary, m = moderate, s = strong
Classes will meet for ten sessions. Each session will include demonstrations and exercises. The exercises are based on practical examples that may or may not be particularly meaningful to your specific research interests. I therefore encourage you to raise examples based on your own data and research needs. I am flexible and happy to move at the pace you desire. If there is anything specific you wish to know, or material for which you would like greater detail, I will do my best to accommodate these requests.
Computer-based exercises will feature prominently in the course, especially in the lab sessions. The use of all software tools will be explained in the sessions, including how to download plug-ins and add-ons. We will be working in Stata. It is recommended that you purchase and install a perpetual license for the most recent version of Stata SE on your personal laptop. If you would like to consider a different version, see options here. It will be possible to conduct all work for this class on University lab computers. If you do purchase your own Stata license, please make sure Stata is installed on your laptop before the first class.
• Baum, Christopher F. An Introduction to Stata Programming. Vol. 2. College Station: Stata Press, 2009.
Recommended Texts and Potentially Helpful Resources
• Cox, Nicholas J. and H. Joseph Newton. One Hundred Nineteen Stata Tips, Third Edition
• Long, J. Scott. The Workflow of Data Analysis Using Stata. College Station: Stata Press, 2009.
• Getting Started with Stata for Windows [or Mac, or Unix]. (freely available online)
• Acock, Alan C. A Gentle Introduction to Stata, 4th edition, 2014.
• The UCLA Stata guide (online)
• Baum, Christopher F. An Introduction to Modern Econometrics Using Stata. College Station: Stata Press, 2006.
Detailed Course Schedule
Day 1: Getting Started with Stata
• Course goals and logistics; topics overview
• Working with Stata: the Stata environment
• Help files, online PDF documentation
• Data import and description
• Baum:
o Chapter 1 (entire)
o Chapter 2 (entire)
• Long, Chapter 3 and Appendix A
• Getting Started with Stata for Windows/Mac/Unix
• Exercise 1: Getting started with Stata and do-files
Day 2: Database Manipulation and Basic Do-File Programming
• Describe, summarize, tabulate
• Cleaning data (missing values, changing variable type, naming and labelling variables)
• Logical expressions
• Generating new variables, dropping variables and observations
• Dummy variables
• Recoding, sorting, and combining datasets
• Beginning macros
• Baum:
o Chapter 3 (entire)
o Chapter 5, Sections 5.6-5.9
• Long, Chapters 5-6
• Acock, Chapter 3
• Exercise 2: Preparing your data
Day 3: Basic Statistical Routines, Regression and Post-Regression Analysis
• Mean, standard deviation, median, mode, correlation
• Cross-tabulation, Chi-squared test
• OLS, Logit
• Saving results
• Baum, Chapter 4 (entire)
• Acock, Chapters 5-7, 8, 10-11
• Exercise 3: Basic Analysis and Description
Day 4: Advanced Estimation Methods and Commands
• Time Series (tsset; date and time; time series operators)
• Panel Data (wide v. long; reshape; xtset and xtdes)
• ‘by’ and ‘collapse’ commands
• Begin presenting results
• Baum:
o Chapter 5, Sections 5.1-5.5
o Chapters 6 and 8 (entire)
• Baum 2006, Chapters 7-10
• Exercise 4: Time, Panels, and Nested Models
Day 5: Presenting Results
• Regression results
• Graphics (scatter, line, CDF, histogram, combining graphs)
• Baum:
o Chapter 2, Sections 2.4-2.6
o Chapter 3, Section 3.5
• Long, 7.7
• Acock, Chapters 5-6
• A Visual Guide to Stata Graphics, 3rd edition by Michael N. Mitchell, 2012
• Exercise 5: Generating Tables and Graphs
Day 6: Programming Basics: Prefixes, Loops, and Lists
• More with Macros
• Loops (foreach; forvalues)
• ‘if’ condition
• Combining loops and macros
• Baum, Chapter 7 (entire)
• Long, Chapters 4, 7
• Baum 2006, Appendix B
• Exercise 6: Looping Exercise
Day 7: Extended Do-File Programming
• Generating datasets
• Exporting variable contents
• Creating flexible output files
• Automating standard-format tables and graphs
• Marginal Effects
• Baum:
o Chapters 9-10
• Exercise 7: Generating and Plotting Marginal Effects
Days 8-9: Writing Stata Programs (ado-files)
• Creating/defining a program
• Implementing program options
• Documenting and Certifying programs
• Macro shift (where number of loops is variable)
• Naming and debugging programs
• Comments and long lines
• Programs with return values and other options
• Help files and publishing programs
• Baum, Chapters 11-12
• Exercise 8: Welcome to the ado!
• Exercise 9: More work with ado-files
Day 10: Choosing the Right Tools for your Needs
• TBA – topics of interest highlighted during Week 1
• TBA – topics of interest highlighted during Week 1
• Exercise 10: TBA – topics of interest highlighted during Week 1