*We are no longer accepting applications for this course*

Arnaud Vaganay is a methodologist and meta-researcher. He currently heads Meta-Lab, a London-based organisation specialising in meta-research (or research-on-research) and open science training. He is also a visiting lecturer at the London School of Economics and Sciences Po. His academic research meta-analyses the effect of policy commitments, research sponsorship and skills on the transparency and credibility of social policy evaluations. His publications cover topics including sampling bias, reporting bias, sponsorship bias and time preferences for evidence.

Dr Thomas J. Leeper is an Assistant Professor in Political Behaviour in the Department of Government at the London School of Economics and Political Science. His research, which primarily focuses on the role of information in politics, has been published in American Political Science Review, American Journal of Political Science, Public Opinion Quarterly, and other journals. He has developed over 30 published R software packages as part of the rOpenSci, rOpenGov, and cloudyr development projects. He received his PhD from Northwestern University and was previously a postdoc at Aarhus University.

Course Content

Reproducibility is the ability of an entire study to be duplicated, either by the same researcher or by someone else working independently. As such, reproducibility is one of the main principles of the scientific method.

Although most researchers are committed to the principle of reproducibility, few actually achieve it. By some accounts, only half of studies published in social science journals are reproducible.

This course offers a set of methods to make research more traceable, for the benefit of both:
 Authors, through more efficient and better documented workflows; and
 Research users, including editors, citing authors and knowledge brokers, through a better understanding of what the authors did and why they did it.

The course follows the research cycle through four key stages:
 Literature review;
 Research protocol;
 Data collection and analysis;
 Reporting.

At each stage, participants will:
 Discuss the ‘gold standard’ of reproducible research;
 Discuss the main risks and obstacles to reproducible research;
 Engage with applied examples of open (and less open) empirical studies;
 Test different apps and software tools such as Git, Knitr, OSF and Dataverse with the aim of streamlining their own workflow.
Examples will be drawn from across the social sciences, and students will have the opportunity to work with their preferred statistical software (with a strong preference shown for R or Stata).

Course Objectives

Upon completion of the course, students will be able to:
1. Recall and discuss the causes and consequences of irreproducible research;
2. Assess the reproducibility of a given empirical study;
3. Implement transparent and reproducible practices in their own workflows;
4. Apply these skills through the use of open science software and apps.

Course Prerequisites

Our course has been developed to address the specific needs of researchers:
 With a completed BSc (minimum);
 With no or limited prior experience of open science tools/methods;
 Committed to adopting these tools/methods soon after the activity;
 With demonstrated experience in conducting empirical research and analysing quantitative data;
 With a good grasp of the social science literature (economics, political science, psychology, sociology, social work, etc.) or the public health literature;
 Reasonably familiar with Stata or R.

Representative Background Reading

Here are some studies illustrating the issue of irreproducibility in a few disciplines:

 Ioannidis JP (2005) Why most published research findings are false. PLoS Med. 2(8): e124. doi:10.1371/journal.pmed.0020124

In psychology:
 Open Science Collaboration (2015) Estimating the reproducibility of psychological science. Science, 28 Aug 2015: Vol. 349, Issue 6251, DOI: 10.1126/science.aac4716

In economics:
 Bailey DH, Borwein JM, Lopez de Prado M, Zhu QJ (2014) Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance. Notices of the American Mathematical Society, 61(5), May 2014, pp.458-471.
 Chang AC, Li P (2015) Is Economics Research Replicable? Sixty Published Papers from Thirteen Journals Say “Usually Not”. Finance and Economics Discussion Series 2015-083. Washington: Board of Governors of the Federal Reserve System, http://dx.doi.org/10.17016/FEDS.2015.083.

In political science:
 Esarey J, Wu A (2016) Measuring the effects of publication bias in political science. Research & Politics 3(3). https://doi.org/10.1177/2053168016665856

In health/medical research:
 Iqbal SA, Wallach JD, Khoury MJ, Schully SD, Ioannidis JPA (2016) Reproducible Research Practices and Transparency across the Biomedical Literature. PLoS Biol 14(1): e1002333. doi:10.1371/journal.pbio.1002333
 Begley CG, Ellis LM (2012) Drug development: Raise standards for preclinical cancer research. Nature 483: 531–533.
 Prinz F, Schlange T, Asadullah K (2011) Believe it or not: how much can we rely on published data on potential drug targets? Nat Rev Drug Discov 10: 712.

In social policy research:
 Vaganay A (2016) Outcome Reporting Bias in Government-Sponsored Policy Evaluations: A Qualitative Content Analysis of 13 Studies. PLoS ONE 11(9): e0163702. doi:10.1371/journal.pone.0163702

Required texts

Suggested texts include:
 Manual of Best Practices in Transparent Social Science Research, by Garret Christensen (BITSS, 2016). Available at: http://www.bitss.org/education/manual-of-best-practices/
 The Workflow of Data Analysis Using Stata by J. Scott Long (Stata Press, 2008)
 Reproducible Research with R & RStudio by Christopher Gandrud (Chapman & Hall/CRC, 2013)
 Implementing Reproducible Research edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng (Chapman & Hall/CRC, 2014).
 The Practice of Reproducible Research edited by Justin Kitzes, Daniel Turek, and Fatma Imamoglu (under review at Oxford and UC Press).

Learning objectives

In light of the ongoing ‘replication crisis’ across the social sciences, there is increased pressure on scholars to engage in research practices that are ‘open’, meaning transparent, verifiable, and reproducible. Our course aims to develop the perspectives, knowledge, and skills researchers need to make their research more open. Broadly, we will cover the what, why, and how of open science practices. The course follows the research cycle through four key stages:
 Open Mind;
 Open Protocols;
 Open Workflows;
 Open Reports.

Course participants will read materials and hear lectures on the theories and philosophy of openness, engage with applied examples of open (and less open) research, and work hands-on with software tools to develop and apply open research practices to their own scientific workflow. Examples will be drawn from across the social sciences, and students will have the opportunity to work with their preferred statistical software (with a strong preference shown for R or Stata).

Upon completion of the course, students will be able to:
– Define open science and evaluate the openness of current research;
– Discuss the main drivers and obstacles to openness and critically assess the proposed solutions;
– Implement fundamental open science practices in their own workflows;
– Apply these skills through the use of open science software and apps.

Day 1 Introduction
Lecture content – Duration 1.5hr

– Philosophical underpinnings: Falsifiability [1]; Research norms [2].
– Problem: “Most published research findings are false” [3].
– Professional, economic and social implications of non-transparent/non-replicable research.
– Definitions, scope and objectives.
– Course outline.

Tutorial activities – Duration 2 hrs
– The structure of an empirical study
– The structure of a research folder
– Setting up a project on the Open Science Framework.

Day 2 Open Mind
Lecture content – Duration 1.5hr

– Risk: Being unaware of systemic and personal biases when formulating research questions/hypotheses.
– Prevalence/examples: [4,5].
– Possible solutions: Reflexivity, conflict of interest statements, systematic literature reviews.
– What we know/assume about their positive and adverse effects: [6].

Tutorial activities – Duration 2 hrs
Reflecting on and disclosing conflicts of interest.

Methods and tools for systematic literature reviews:
– Expert searches with Web of Science, Scopus and PubMed.
– Excel for Literature Reviews.
– Flow charts.

Day 3 Open Protocol
Lecture content – Duration 1.5hr

– Risk: Adjusting research protocols to findings rather than the opposite.
– Prevalence/examples: [7]
– Possible solutions: Protocols, pre-analysis plans, registries.
– What we know/assume about their effect: [8].

Tutorial activities – Duration 2 hrs
– Pre-Analysis Plans vs. Protocols and when to produce them;
– Structure;
– Conducting power analysis.
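
As a rough illustration of the power-analysis step, the sketch below estimates the sample size per group for a two-sided comparison of two means. It is not taken from the course materials: it assumes a standardised effect size (Cohen's d), equal group sizes, and the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) / d)^2. The function name `sample_size_per_group` and the default values are illustrative choices. Dedicated routines in R or Stata use the t distribution and typically return slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation, so results are slightly smaller than
    those from t-based calculations. effect_size is Cohen's d (assumed > 0).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value, two-sided test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power

    # Round up: a fractional participant cannot be recruited.
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    # Conventional small, medium and large effect sizes.
    for d in (0.2, 0.5, 0.8):
        print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Recording the assumed effect size, alpha and power alongside the resulting sample size in a pre-analysis plan lets readers check the calculation later.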

Pre-registration websites:
– Registering a protocol on the Open Science Framework, RIDIE, ClinicalTrials.gov, etc.

Day 4 Open Workflow
Lecture content – Duration 1.5hr

– Risk: Data tampering, “disappearing data”, p-hacking, poor record-keeping.
– Prevalence/examples: [9,10].
– Possible solutions: Publishing code, publishing workflows.
– What we know/assume about their effect.

Tutorial activities – Duration 2 hrs
Open code:
– Version Control with GitHub
Open data:
– Harvard Dataverse

Day 5 Open Reports
Lecture content – Duration 1.5hr

– Risk: Misreporting essential decisions and findings.
– Prevalence/examples.
– Possible solutions: Transparency badges, reporting guidelines, checklists for graphical excellence and integrity.
– What we know/assume about their effect.

Tutorial activities – Duration 2 hrs
Reporting Guidelines

Data visualization or Dynamic Documents with R Markdown

References

1. Popper KR. The Logic of Scientific Discovery. Psychology Press; 2002.

2. Merton RK. The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press; 1973.

3. Ioannidis JPA. Why most published research findings are false. PLoS Med. 2005;2: e124. doi:10.1371/journal.pmed.0020124

4. Littell JH. Evidence-based or biased? The quality of published reviews of evidence-based practices. Child Youth Serv Rev. 2008;30: 1299–1317. doi:10.1016/j.childyouth.2008.04.001

5. Bero L, Anglemyer A, Vesterinen H, Krauth D. The relationship between study sponsorship, risks of bias, and research outcomes in atrazine exposure studies conducted in non-human animals: Systematic review and meta-analysis. Environ Int. 2015; doi:10.1016/j.envint.2015.10.011

6. Eisner M, Humphreys DK, Wilson P, Gardner F. Disclosure of Financial Conflicts of Interests in Interventions to Improve Child Psychosocial Health: A Cross-Sectional Study. PLoS One. 2015;10: e0142803. doi:10.1371/journal.pone.0142803

7. Hannink G, Gooszen HG, Rovers MM. Comparison of Registered and Published Primary Outcomes in Randomized Clinical Trials of Surgical Interventions. Ann Surg. 2013;257: 818–823. doi:10.1097/SLA.0b013e3182864fa3

8. Mathieu S, Boutron I, Moher D, Altman DG, Ravaud P. Comparison of Registered and Published Primary Outcomes in Randomized Controlled Trials. JAMA. 2009;302: 977–984.

9. Vines TH, Albert AYK, Andrew RL, Débarre F, Bock DG, Franklin MT, et al. The availability of research data declines rapidly with article age. Curr Biol. 2014;24: 94–97. doi:10.1016/j.cub.2013.11.014

10. Dafoe A. Science Deserves Better: The Imperative to Share Complete Replication Files. PS Polit Sci Polit. 2014;47: 60–66. doi:10.1017/S104909651300173X