This lesson is still being designed and assembled (Pre-Alpha version)

Core concepts in reproducibility

Overview

Teaching: 20 min
Exercises: 10 min
Questions
  • What is the reproducibility crisis?

  • How can working Openly and reproducibly help me?

  • What are the key reasons for irreproducible research?

  • How can we solve these key problems in our research?

Objectives
  • Have an overview of issues surrounding the reproducibility crisis and current ways to avoid those pitfalls.

  • Be aware of the tools and practices that are used in your area of research and how they may contribute to the reproducibility crisis.

Motivation

Today we’re going to build towards a reproducible and open workflow for research. I’d like to begin by explaining why this matters and exactly which problems the workflow is trying to solve.

It’s important that we can trust the results of research, because we rely on those results to inform us: we make policies and laws, and choose which avenues of our own research to explore, based on them. If research is untrustworthy, the consequences range from wasted time and money, to distress for a PhD student, to lost lives (by influencing policy or drug availability, for example).

Aside from the important benefits for society, we also want others to trust our research as it’s connected to our reputation, and our success in our future careers.

What is reproducibility?

The most basic definition of reproducibility is: a research result is reproducible if you can get the same result when you do the research again. Sometimes it’s useful to go one step further and distinguish between different types of reproducible research.

The matrix below (from The Turing Way) gives a useful set of definitions: a result is reproducible (same data, same analysis), replicable (different data, same analysis), robust (same data, different analysis), or generalisable (different data, different analysis).

Discussion

In pairs discuss the following.

How often do you think:

  1. research in your area would be reproducible, if you had access to the exact data and analyses?
  2. research in your area would be replicable, if you collected new data but used the same analyses?
  3. research in your area provides access to the exact data and analyses? (3 minutes total)

The reproducibility crisis

The reproducibility crisis is a recent development in science: scientific results have been found to replicate much less often than scientists had assumed or hoped they would.

Fewer than 40% of replications of well-known psychology studies were found to reproduce (that is, to yield statistically significant results). P-values in the original studies were constrained between 0 and just above 0.05, but in the replications p-values ranged from 0 to 1, with fewer than 40% falling below 0.05. Effect sizes were noticeably lower, too.

Similar results were found in cancer biology, where only 11% of results replicated (Begley, C. Glenn, and Lee M. Ellis. “Drug development: Raise standards for preclinical cancer research.” Nature 483 (2012): 531–533).

Not only that, but most researchers agree that there is a problem, and most have failed to reproduce a result.

Most researchers have failed to reproduce a result

Of the 1,576 scientists surveyed, over 70% had tried and failed to reproduce another scientist’s results, and over 50% had failed to reproduce their own.

Reminder:

  • P-values measure the probability of getting a result at least as extreme as the one observed, by chance, assuming that there is no true effect - they range between 0 and 1. The commonly used threshold for a “statistically significant” result is p < 0.05.
  • Effect sizes measure the strength of a relationship between two variables. There are different types of effect size, for example Cohen’s d or the Pearson correlation coefficient.
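As a concrete illustration of an effect size, here is a minimal sketch of Cohen’s d for two independent groups (the difference in means divided by the pooled standard deviation). The example numbers are made up for illustration.

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen's d: standardised difference between two group means."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance, weighting each group's variance by its degrees of freedom
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2 +
                  (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Hypothetical measurements for two groups
treatment = [5.1, 4.9, 5.6, 5.3, 5.8]
control = [4.2, 4.5, 4.1, 4.8, 4.4]
print(round(cohens_d(treatment, control), 2))  # prints 2.91
```

Unlike a p-value, this number doesn’t shrink just because you collect more data: it describes how big the difference actually is.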

Why did it take us so long to notice?

Un”FAIR” data and analyses

It used to be harder to share and access data and analyses. The FAIR principles describe how data and analyses need to be stored in order for them to be usable by others:

  • Findable: deposited somewhere searchable, with metadata and a persistent identifier

  • Accessible: retrievable via standard, open protocols

  • Interoperable: in standard, open file formats that other tools can read

  • Reusable: clearly licensed, and documented well enough for others to use

If data is messy, unlabelled, in a strange file format, or only on your hard drive, then no one can check if the result is correct.

Why is it happening?

There are lots of points in the process of doing research where something can go wrong, but lots of high profile examples (leading to retractions) come down to:

  1. Data storage mistakes, e.g. accidentally deleting columns of excel files, or rewriting important values.
  2. Data analysis implementation mistakes, e.g. accidentally clicking through a GUI in the wrong order, or code not doing what you think it does.
  3. Questionable research practices, e.g. p-hacking and HARKing

P-hacking

P-hacking is a catch-all term for making p-values appear smaller than they really are. It includes practices like:

  • collecting more data until the result becomes significant, then stopping

  • trying many different analyses and only reporting the ones that give p < 0.05

  • selectively excluding “outliers” until the result is significant

  • measuring many variables and only reporting the significant comparisons

P-curves (the distribution of reported p-values across a body of published studies) show evidence that this is occurring in research: reported p-values bunch up suspiciously just below the 0.05 threshold.

HARKing

HARKing stands for “Hypothesising After Results are Known”.

The xkcd “Significant” jellybean comic illustrates it well: scientists test 20 colours of jelly bean for a link to acne, one colour comes out significant at p < 0.05 purely by chance, and only that finding makes the headline, as if it had been the hypothesis all along.

Discussion: Barriers to reproducibility in your field.

We’ve mentioned some barriers to reproducibility so far. In small groups, first write down the issues you remember on the paper in front of you in one colour. Then discuss how you think these reproducibility problems mentioned might affect your fields. (10 minutes total)

Solutions

Make data available

By making your data available, you help ensure that you and others will be able to reproduce and build on your work. This can involve:

  • depositing the data in a public repository

  • using open, standard file formats (e.g. CSV rather than a proprietary format)

  • adding a licence that tells others how they may reuse the data

  • where data cannot be fully open (e.g. sensitive personal data), sharing metadata and access procedures instead

Whatever you make available, it must be well labelled and described if it’s going to be useful for future you, or for others.
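A minimal sketch of what “well labelled and described” can mean in practice: save the data in an open format, and save a human-readable data dictionary right next to it. The file names, column names, and values here are all hypothetical.

```python
import csv

# Hypothetical dataset: one row per participant
rows = [
    {"participant_id": 1, "reaction_time_ms": 312, "condition": "control"},
    {"participant_id": 2, "reaction_time_ms": 287, "condition": "treatment"},
]

# Open, standard format: plain CSV with a header row
with open("reaction_times.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=rows[0].keys())
    writer.writeheader()
    writer.writerows(rows)

# Data dictionary: what each column means, in plain text
with open("reaction_times_README.txt", "w") as f:
    f.write(
        "reaction_times.csv: one row per participant\n"
        "participant_id   integer, anonymised identifier\n"
        "reaction_time_ms integer, mean reaction time in milliseconds\n"
        "condition        'control' or 'treatment'\n"
    )
```

Future you (or a stranger) can now open both files in any tool and know exactly what the numbers mean.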

Make analysis available

Make your analysis available by writing scripts. Scripts:

  • record every step of the analysis exactly, with nothing hidden in un-recorded clicks

  • can be re-run from scratch, on the original data or on corrected or new data

  • can be shared alongside the results, so others can check and reuse them
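The idea can be sketched in a few lines: instead of computing a summary by hand in a spreadsheet, put the whole path from raw file to reported number in a script anyone can re-run. The file name and column are hypothetical.

```python
import csv
from statistics import mean

def mean_reaction_time(path):
    """Return the mean of the reaction_time_ms column of a CSV file."""
    with open(path, newline="") as f:
        times = [float(row["reaction_time_ms"]) for row in csv.DictReader(f)]
    return mean(times)

# Demo on a small sample file (made-up data, written here so the
# example is self-contained)
with open("sample.csv", "w", newline="") as f:
    f.write("participant_id,reaction_time_ms\n1,312\n2,287\n3,301\n")

print(f"Mean reaction time: {mean_reaction_time('sample.csv'):.1f} ms")
# prints: Mean reaction time: 300.0 ms
```

If a value in the raw data turns out to be wrong, you fix the file and re-run the script; the reported number updates with no manual steps to get wrong.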

Literate programming = scripts + describing what’s happening.

Ways to do it include Jupyter Notebooks and RMarkdown documents, which mix narrative text, code, and the code’s output in a single document.

Much of the time the narrative text is written in Markdown: a lightweight markup language that turns simple plain-text formatting into HTML (to be shared on the web). It’s the language of RMarkdown and GitHub.
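As a small illustration, a few characters of Markdown are enough for most documents:

```markdown
# A heading

Some *italic* and **bold** text, a [link](https://example.com),
and a bullet list:

- item one
- item two

Inline code looks like `mean(times)`.
```

The same plain-text file renders as formatted HTML on GitHub, in RMarkdown output, and in most notebook tools.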

Version Control

Have you ever ended up with a folder full of files like thesis_final_v2_REALLY_FINAL.docx, or accidentally overwritten the one version of your analysis that worked? Version control is a system for avoiding these problems: it keeps a complete, annotated history of your files, so you can see what changed, when, and why, and recover any earlier version. Git (often used together with GitHub) is the most widely used version control system.

Pre-registration

“The first principle is that you must not fool yourself - and you are the easiest person to fool.” - Richard Feynman

If you are doing hypothesis-confirming research, consider pre-registering your analysis. Pre-registering means stating, in advance, exactly what analysis you are going to do, including:

  • what your hypotheses are

  • how much data you will collect, and how

  • which variables you will analyse, and which statistical tests you will run

This prevents you from accidentally trying out more hypotheses than you meant to, and shows other people that you didn’t.

Registered reports

Registered reports are a new model for publishing papers. Instead of applying to journals once you have your results, you apply after you have made your plans for research and analysis, but before you have collected your data.

This model is extra great for research(ers) because:

  • peer reviewers can improve your study design before the data are collected, when their feedback can still be acted on

  • acceptance is based on the question and the method, not the result, so publication does not depend on getting a “positive” finding

  • this in turn reduces publication bias, and the temptation to p-hack or HARK

Discussion: how do these core concepts of reproducibility relate to our workflow today?

In small groups, looking at our workflow for today, try to list which of the following reproducibility problems are being prevented during each step of the workflow:

  • Data or code being unFAIR (findable, accessible, interoperable, and reusable - which of these is it?)

  • Data storage mistakes (e.g. deleting rows of spreadsheets)

  • Data analysis mistakes (e.g. you didn’t do what you meant to do)

  • Questionable research practices (e.g. p-hacking)

Suggestions

  1. FAIR (Accessible and Reusable) data and code (well-organised)
  2. Prevents data analysis mistakes
  3. FAIR (Findable) code
  4. Prevents questionable research practices
  5. FAIR (Accessible and Reusable) code
  6. Prevents data analysis mistakes
  7 & 8. FAIR (Findable and Accessible) results, prevents data analysis mistakes.

Five selfish reasons to work reproducibly

There are plenty of selfless reasons to work reproducibly: we want trustworthy science, good work, and not to waste people’s time. But there are also good selfish reasons to do so:

  1. Reproducibility helps to avoid disaster
  2. Reproducibility makes it easier to write papers/your thesis
  3. Reproducibility helps reviewers see it your way
  4. Reproducibility enables continuity of your work
  5. Reproducibility helps to build your reputation

These reasons are taken from Florian Markowetz’s paper, “Five selfish reasons to work reproducibly” (Genome Biology, 2015).

Additional benefits of working “Openly”

In addition to making sure that science is working (being reproducible), this workflow is about working Openly. What I mean by that in this context is sharing your plans, code, results, etc. as you go through the process, not just at the end.

So, on top of the selfish benefits of working reproducibly, working Openly is also beneficial for:

  • getting feedback on your plans and code while you can still act on it

  • finding collaborators and making your work more visible

  • letting others reuse and build on your work sooner

Key Points

  • The results of published research don’t always reproduce.

  • Make your data and analysis FAIR (Findable, Accessible, Interoperable and Reusable); as open as possible, and as closed as necessary.

  • Ask for help if you need statistics or software support, for example from research software engineers or your statistics department.