In a much-discussed New York Times article, psychologist Lisa Feldman Barrett claimed, “Psychology is not in crisis.” She was responding to the results of a large-scale initiative called the Reproducibility Project, published in Science magazine, which appeared to show that the results from over 60% of a sample of 100 psychology studies did not hold up when independent labs attempted to replicate them. In this talk, I address three major issues:
What did the Reproducibility Project really show, and in what specific sense can the follow-up studies meaningfully be described as “failures to replicate” the original findings? I argue that, contrary to what many are suggesting, very little can be learned about the validity of an original study from a single (apparent) failure to replicate it. Instead, multiple replications (of sufficient quality) of each contested experiment would be needed before any strong conclusions could be drawn about the appropriate degree of confidence to be placed in the original findings. To make this point, I draw on debates over falsification in philosophy of science, paying special attention to the role of auxiliary assumptions in falsifying claims or theories.
Is psychology in crisis or not? And if so, what kind of crisis? I tease apart two senses of “crisis” here. The first sense is “crisis of confidence,” a descriptive or sociological claim: many people, within the profession and without, are as a matter of fact experiencing a profound and, in some ways, unprecedented lack of confidence in the validity of the published literature. This is true not only in psychology but in other fields as well, such as medicine. Whether these people are justified in feeling this way is a separate but related question, and the answer depends on a number of factors, to be discussed. The second sense is “crisis of process,” that is, the notion that psychological science is “fundamentally broken,” or perhaps not even a “true” science at all, due in large part to apparent failures to replicate a substantial portion of previously published findings. This notion rests on the assumption that, for a discipline to be a “true” science, or not to be in a state of “crisis” in this second sense, most or perhaps even all of the findings in its professionally published literature should “hold up” when they are replicated.
But this assumption, I will argue, is erroneous: failures of various sorts in science, including bona fide failures to replicate published results, are often the wellspring of important discoveries and other innovations. Therefore, (apparent) replication failure, even on a wide scale, is not in itself evidence that science in general, or psychology in particular, is broken.
Nevertheless, this does not mean that there is no room for serious, even radical, improvements in the conduct of psychological science. In fact, the opposite is true. Even setting the Reproducibility Project findings aside, there was already substantial, and more direct, evidence that current research and publication practices in psychology and other disciplines were and are systematically flawed, and that the published literature had and has a high likelihood of containing a large proportion of false “findings” and erroneous conclusions. Problems that urgently need to be addressed include publication bias against “negative” results, the related “file drawer” problem, sloppy statistics and a lack of adequate statistical training among many scientists, small sample sizes, and inefficient and arbitrary peer review.