Reading “Error Tight” by Julia Strand
Today in our lab meeting we read “Error Tight: Exercises for Lab Groups to Prevent Research Mistakes” by Julia Strand. It’s a great paper and led to some really fun discussions!
The paper makes the point that errors are a necessary feature of human activity, including science, and that there are well-established methods from human factors research to reduce the frequency and impact of errors, which she refers to as “safety culture.” If you wonder why airplanes don’t crash very often, safety culture is an important part of that. Julia lays out a number of lessons from this work that we can bring to bear on making our science better. I think that our lab has already incorporated a lot of these ideas into our everyday workflows; in particular, we have tried to make the lab a “blame-free zone” for reporting errors, as we discussed previously. Here are a few points that came up in our discussion.
There is an inherent tradeoff between agility and documentation. One wants to document everything important and nothing more, since documentation takes time and effort; the question is what counts as important to document. It wasn’t clear that the proposed log in Table 1 was at exactly the right level. I like the analogy to code commenting: one shouldn’t comment on things that are clear from the code itself. Rather, one should comment on the intention and rationale of the code, telling the reader everything the writer would want them to know that isn’t self-evident from reading the code. It’s a difficult line to draw, but in general we liked the ideas that were laid out in the paper.
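To make the commenting analogy concrete, here is a toy Python sketch (hypothetical, not from the paper or from our own code) contrasting a comment that merely restates the code with one that records the intention and rationale a reader couldn’t recover on their own:

```python
import numpy as np

# Hypothetical data: per-trial reaction times (ms) for one subject.
data = np.array([412.0, 388.0, 455.0, 430.0, 401.0])

# A comment that just restates the code (adds nothing):
# subtract the mean and divide by the standard deviation
zscores = (data - data.mean()) / data.std()

# A comment that records intention and rationale (the part that is
# not self-evident from the code):
# Standardize within subject so later analyses aren't dominated by
# between-subject differences in overall speed; the population SD
# (numpy's default, ddof=0) is used for consistency with the rest
# of this hypothetical preprocessing pipeline.
zscores = (data - data.mean()) / data.std()
```

The same logic applies to a lab log: the entries worth writing down are the decisions and reasons that a future reader couldn’t reconstruct from the materials themselves.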
A second long discussion centered on the idea of a “blame-free” zone - which has been the explicit policy in our lab for a while. There was some concern that this might lead to moral hazard, where people don’t worry about making mistakes because they know there will not be any adverse consequences. My feeling is that we need to separate the system aspects from the individual aspects. From a system level, we want to know about errors as early as possible, which is incentivized by the blame-free system. However, we need to simultaneously build the expectation that people should try to minimize errors, and an understanding that repeated errors are a signal of a problem that may have consequences if it’s not remediated.
Another point raised was that there is almost necessarily a tradeoff between speed and accuracy in science, just as there is in human performance more generally. We therefore need to accept that reducing the likelihood of errors will result in slower science. In some cases (e.g., having multiple people analyze each dataset independently), some in the group felt that the cost might outweigh the benefit. We definitely need to think about how to calibrate the level of effort and time spent to the likelihood and costs of any potential errors in each domain.
We had a lively discussion around the distinction between errors and fraud. If someone were to commit fraud then we would almost certainly not want to take a blame-free approach: we must have zero tolerance for misconduct. But what counts as misconduct? Are QRPs (questionable research practices) like p-hacking different from intentional falsification? And does that depend on whether one actually knows that the QRPs are bad? There was also some disagreement in the group about whether intention actually matters from the standpoint of the system. Does a QRP go from a regrettable mistake to intentional misconduct once one knows that it’s a questionable practice? Can’t say we came up with any great answers to these questions.
I’ll add here that I have generally tried to avoid discussing fraud in the context of reproducibility. While I know that intentional fraud (specifically, falsification of data or results) happens, I am not sure that there is anything that we could do about it that wouldn’t end up making science worse, and I’m pretty sure that someone who wants to commit fraud will always find a way around whatever safeguards we put in place anyway. I have always worked on the assumption that my fellow scientists are well-intentioned, in part because I don’t think I could go on doing science if I believed otherwise. The goal of our work has always been to provide people with the tools and knowledge they need to do the best science they can, and to work to align the incentives so that the best science is rewarded (e.g. through the HELIOS initiative).
Overall we thought this was a great paper and would strongly recommend it for other labs to read!