Many of you may have heard of the recent psychology replication study, published in Science, in which researchers attempted to replicate 100 hand-picked psychology studies, and were only able to successfully replicate 39 of them.
I am a huge fan of this study, among other things because it encourages other scientists to attempt replication (which, by most accounts, is not done enough in the sciences). The result also opens up a bunch of cool interpretive questions about the scientific method and statistical analysis.
One obvious mistake to avoid is concluding from this study that only 39% of the original studies were "correct," in some sense of the word. Just as some of the initial 100 studies probably really were flawed and gave misleading results (which I believe can be thought of as "false positives" without being too misleading), the same is probably true of some of the failed replications (which can analogously be thought of as "false negatives").
But can we get more precise with the implications, even to a first approximation? I have an amateur interest in philosophy of science, but am wholly ignorant of experimental design and statistical analysis. So I could use a hand (hence this post).
Can we use Bayesian theory to get some clarity? Of course, we are going to have to choose some semi-arbitrary numbers, like the probability that each of the initial findings is a false positive, the probability that each of the initial findings is a false negative, and the probabilities that each of the attempted replications is a false positive or a false negative.
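To make the question concrete, here is a rough sketch of what such a Bayesian calculation might look like. Everything in it is an assumption for illustration: the function names are made up, and the numbers (a 50% prior that a tested effect is real, a 5% false positive rate, 80% power for both the original study and the replication) are exactly the kind of semi-arbitrary inputs mentioned above, not figures from the Science paper. It also treats each study as a simple positive-or-negative test and ignores publication bias.

```python
def bayes_update(prior, p_obs_if_real, p_obs_if_null):
    """Posterior probability that the effect is real after one observation."""
    numerator = prior * p_obs_if_real
    return numerator / (numerator + (1 - prior) * p_obs_if_null)


def posterior_real(prior, alpha, power_orig, power_rep, replicated):
    """P(effect is real | original study was positive, replication outcome).

    alpha      -- assumed false positive rate of each study
    power_orig -- assumed chance the original study detects a real effect
    power_rep  -- assumed chance the replication detects a real effect
    """
    # Step 1: update on the original positive finding.
    after_original = bayes_update(prior, power_orig, alpha)
    # Step 2: update on the replication attempt.
    if replicated:
        return bayes_update(after_original, power_rep, alpha)
    # A failed replication is a miss if the effect is real (a false negative),
    # or a correct rejection if it is not.
    return bayes_update(after_original, 1 - power_rep, 1 - alpha)


def expected_replication_rate(prior, alpha, power_orig, power_rep):
    """Fraction of positive original findings we would expect to replicate."""
    after_original = bayes_update(prior, power_orig, alpha)
    return after_original * power_rep + (1 - after_original) * alpha


if __name__ == "__main__":
    # Illustrative, assumed inputs; change them to see how much they matter.
    prior, alpha, power = 0.5, 0.05, 0.8
    print(posterior_real(prior, alpha, power, power, replicated=True))
    print(posterior_real(prior, alpha, power, power, replicated=False))
    print(expected_replication_rate(prior, alpha, power, power))
```

With these particular made-up inputs, a finding whose replication failed still comes out at roughly a 77% chance of being real, which illustrates why "39 of 100 replicated" cannot be read as "39% were correct." The same inputs predict that about three quarters of findings should replicate, well above what was observed, so at least one of the assumed numbers is presumably too optimistic. The sketch cannot say which one.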
Apart from the general probability of false positives and false negatives with both the initial findings and the attempted replications, there are more particular factors to consider. One is the expertise of the experimenters; replication may be difficult, because specialized skills and practice may be necessary to successfully create the controlled conditions which will show the initial experimental result. There is also the obvious question of confirmation bias among the researchers who authored the initial studies. And so on.