“Promise, peril, and perspective: Addressing concerns about reproducibility in social–personality psychology”
Journal of Experimental Social Psychology 66 (2016) 148–152
a.k.a. The Swan Song of Social Psychology During the Golden Age
Disclaimer: I wrote this piece because James Pennebaker recommended writing as therapy to deal with trauma. However, in his defense, he didn’t propose publishing the therapeutic writings.
You might think an article with reproducibility in the title would have something to say about the replicability crisis in social psychology. However, this article has very little to say about the causes of the replication crisis in social psychology or about possible solutions to improve replicability. Instead, it appears to be a perfect example of repressive coping to avoid the traumatic realization that decades of work were fun, yet futile.
The authors start with a very sensible suggestion. “We propose that the goal of achieving sound scientific insights and useful applications will be better facilitated over the long run by promoting good scientific practice rather than by stressing the need to prevent any and all mistakes.” (p. 149). The only open questions are how many mistakes we consider tolerable and what the actual error rates are, which we do not know. Rosenthal pointed out that the rate could be 100%, which even the authors might consider to be a little bit too high.
2. Improving research practice
In this section, the authors suggest that “if there is anything on which all researchers might agree, it is the call for improving our research practices and techniques.” (p. 149). If this were the case, we would not see articles in 2016 that make statistical mistakes that have been known for decades, like pooling data from a heterogeneous set of studies or computing difference scores and using one of the component variables as a predictor of the difference score.
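The second mistake is worth spelling out. A minimal simulation (hypothetical variable names; two completely unrelated measures) shows that regressing a difference score on one of its own components produces a substantial correlation by construction, with no real relationship anywhere in the data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two independent measures: by design, no true relationship between them.
x = rng.normal(size=n)
y = rng.normal(size=n)

# Difference score built from x, then "predicted" by x itself.
d = x - y
r = np.corrcoef(x, d)[0, 1]

# With equal variances, corr(x, x - y) = 1/sqrt(2) ~ .71 by construction.
print(round(r, 2))
```

Any “effect” of x on the difference score here is pure arithmetic, which is exactly why the practice has been criticized for decades.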
It is also puzzling to read “the contemporary literature indicates just how central methodological innovation has been to advancing the field” (p. 149), when the key problem of low power has been known since 1962 and there is still no sign of improvement.
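The scale of the low-power problem is easy to check by simulation. A sketch under assumed but historically typical design parameters (a true effect of d = 0.4, n = 20 per cell, two-sided alpha = .05):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
d, n, sims = 0.4, 20, 5_000

# Simulate two-group experiments in which a real effect of d = 0.4 exists.
hits = 0
for _ in range(sims):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < .05:
        hits += 1

power = hits / sims
print(round(power, 2))  # around .2: most real effects are missed
```

With power in this range, a literature in which 95% of published tests succeed cannot be an unbiased record of the studies that were run.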
The authors are also not exactly in favor of adopting better methods when these methods might reveal major problems in older studies. For example, a meta-analysis in 2010 might not have examined publication bias and produced an effect size of more than half a standard deviation, while a new method that controls for publication bias finds that it is impossible to reject the null-hypothesis. No, these new methods are not welcome. “In our view, they will stifle progress and innovation if they are seen primarily through the lens of maladaptive perfectionism; namely as ways of rectifying flaws and shortcomings in prior work.” (p. 149). So, what is the solution? Let’s pretend that subliminal priming made people walk slower in 1996, but stopped working in 2011?
This ends the section on improving research practices. Yes, that is the way to deal with a crisis. When the city is bankrupt, cut back on the Christmas decorations. Problem solved.
3. How to think about replications
Let’s start with a trivial statement that is as meaningless as saying we would welcome more funding. “Replications are valuable.” (p. 149). Let’s also not mention that social psychologists have been the leaders in demanding replication studies. No single-study article shall be published in a social psychology journal. A minimum of three studies with conceptual replications of the key finding is needed to show that the results are robust and always produce significant results with p < .05 (or at least p < .10). Yes, no other science has cherished replications as much as social psychology.
And eminent social psychologists Crandall and Sherman explain why: “to be a cumulative and self-correcting enterprise, replications, be their results supportive, qualifying, or contradictory, must occur.” Indeed, but what explains the 95% success rate of published replications in social psychology? No need for self-correction, if the predictions are always confirmed.
Surprisingly, however, since 2011 a number of replication studies have been published in obscure journals that fail to replicate original results. This has never happened before and raises some concerns. What is going on here? Why can these researchers not replicate the original results? The answer is clear. They are doing it wrong. “We concur with several authors (Crandall and Sherman, Stroebe) that conceptual replications offer the greatest potential to our field… Much of the current debate, however, is focused narrowly on direct or exact replications.” (p. 149). As philosophers know, you cannot step into the same river twice, and so you cannot replicate the same study again. To get a significant result, you need to do a similar, but not an identical, replication study.
Another problem with failed replication studies is that these researchers assume that they are doing an exact replication study, but do not test this assumption. “In this light, Fabrigar’s insistence that researchers take more care to demonstrate psychometric invariance is well-placed” (p. 149). Once more, the superiority of conceptual replication studies is self-evident. When you do a conceptual replication study, psychometric invariance is guaranteed and does not have to be demonstrated. Just one more reason why conceptual replication studies in social psychology journals produce a 95% success rate, whereas misguided exact replication attempts have failure rates of over 50%.
It is also important to consider the expertise of researchers. Social psychologists have often demonstrated their expertise by publishing dozens of successful conceptual replications. In contrast, failed replications are often produced by novices with no track record of ever producing a successful study. These vast differences in previous success rates need to be taken into account in the evaluation of replication studies. “Errors caused by low expertise or inadvertent changes are often catastrophic, in the sense of causing a study to fail completely, as Stroebe nicely illustrates.”
It would be a shame if psychology started rewarding these replication studies. Already limited research funds would be diverted to studies that are easy to do, yet too difficult for inexperienced researchers to do correctly, and away from senior researchers who conduct difficult novel studies that always work and produced groundbreaking new insights into social phenomena during the “golden age” (p. 150) of social psychology.
The authors also point out that failed studies are rarely failed studies. When these studies are properly combined with successful studies in a meta-analysis, the results nearly always show the predicted effect, and it was wrong to doubt original studies simply because replication studies failed to show the effect. “Deeper consideration of the terms “failed” and “underpowered” may reveal just how limited the field is by dichotomous thinking. “Failed” implies that a result at p = .06 is somehow inferior to one at p = .05, a conclusion that scarcely merits disputation.” (p. 150).
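To be fair, the quoted point about dichotomous thinking is numerically sound: the test statistics behind p = .05 and p = .06 are nearly identical, as a two-line check shows:

```python
from scipy.stats import norm

# z statistics corresponding to two-sided p-values of .05 and .06.
z_05 = norm.ppf(1 - .05 / 2)  # about 1.96
z_06 = norm.ppf(1 - .06 / 2)  # about 1.88
print(round(z_05, 2), round(z_06, 2))
```

The problem is not the continuity of evidence; it is that a literature filtered at exactly p < .05 cannot be repaired by noting that .06 is almost .05.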
In conclusion, we learn nothing from replication studies. They are a waste of time and resources and can only impede the further development of social psychology, which proceeds by means of conceptual replication studies that build on the foundations laid during the “golden age” of social psychology.
4. Differential demands of different research topics
Some studies are easier to replicate than others, and replication failures might be “limited to studies that presented methodological challenges (i.e., that had protocols that were considered difficult to carry out) and that provided opportunities for experimenter bias” (p. 150). It is therefore better not to replicate difficult studies, or to let original authors with a track record of success conduct conceptual replication studies.
Moreover, some people have argued that the high success rate of original studies is inflated by publication bias (not writing up failed studies) and the use of questionable research practices (running more participants until p < .05). To ensure that reported successes are real successes, some initiatives call for data sharing, pre-registration of data analysis plans, and a priori power analysis. Although these may appear to be reasonable suggestions, the authors disagree. “We worry that reifying any of the various proposals as a “best practice” for research integrity may marginalize researchers and research areas that study phenomena or use methods that have a harder time meeting these requirements.” (p. 150).
The concern appears to be that researchers who do not preregister data analysis plans or do not share data may be stigmatized. “If not, such principles, no matter how well-intentioned, invite the possibility of discrimination, not only within the field but also by decision-makers who are not privy to these realities.” (p. 150).
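For what it is worth, the questionable practice mentioned above (running more participants until p < .05) has consequences that are easy to demonstrate. A minimal sketch of optional stopping when the true effect is zero, with assumed but plausible sample sizes and peeking intervals:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
sims, start_n, max_n = 2_000, 10, 50

# Under a true null effect, peek after every 5 added participants
# per group and stop as soon as p < .05.
false_pos = 0
for _ in range(sims):
    a = list(rng.normal(size=start_n))
    b = list(rng.normal(size=start_n))
    while True:
        if stats.ttest_ind(a, b).pvalue < .05:
            false_pos += 1
            break
        if len(a) >= max_n:
            break
        a.extend(rng.normal(size=5))
        b.extend(rng.normal(size=5))

rate = false_pos / sims
print(round(rate, 2))  # well above the nominal .05 Type I error rate
```

Pre-registration of the stopping rule is precisely what removes this inflation, which makes the authors’ worry about “discrimination” an odd objection.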
5. Considering broader implications
These are confusing times. In the old days, the goal of research was clearly defined. Conduct at least three, loosely related, successful studies and write them up with a good story. During these times, it was not acceptable to publish failed studies, which would spoil the 95% success rate. This made it hard for researchers who did not understand the rules of publishing only significant results. “Recently, a colleague of ours relayed his frustrating experience of submitting a manuscript that included one null-result study among several studies with statistically significant findings. He was met with rejection after rejection, all the while being told that the null finding weakened the results or confused the manuscript” (p. 151).
It is not clear what researchers should be doing now. Should they now report all of their studies, the good, the bad, and the ugly, or should they continue to present only the successful studies? What if some researchers continue to publish in the good old-fashioned way that evolved during the golden age of social psychology, while others try to publish results more in accordance with what actually happened in their lab? “There is currently, a disconnect between what is good for scientists and what is good for science,” and nobody is going to change while researchers who report only significant results get rewarded with publications in top journals.
There may also be little need to make major changes. “We agree with Crandall and Sherman, and also Stroebe, that social psychology is, like all sciences, a self-correcting enterprise” (p. 151). And if social psychology is already self-correcting, it does not need new guidelines on how to do research or new replication studies. Rather than instituting new policies, it might be better to make social psychology great again. Rather than publishing means and standard deviations or test statistics that allow data detectives to check results, it might be better to report only whether a result was significant, p < .05, and because 95% of studies are significant and the others are failed studies, we might simply not report any numbers. False results will be corrected eventually because they will no longer be reported in journals, and the old results might have been true even if they fail to replicate today. The best approach is to fund researchers with a good track record of success and let them publish in the top journals.
Most likely, the replication crisis only exists in the imagination of overly self-critical psychologists. “Social psychologists are often reputed to be among the most severe critics of work within their own discipline” (p. 151). A healthier attitude is to realize that “we already know a lot; with these practices, we can learn even more” (p. 151).
So, let’s get back to doing research and forget this whole thing that was briefly mentioned in the title called “concerns about reproducibility.” Who cares that only 25% of social psychology studies from 2008 could be replicated in 2014? In the meantime, thousands of new discoveries were made and it is time to make more new discoveries. “We should not get so caught up in perfectionistic concerns that they impede the rapid accumulation and dissemination of research findings” (p. 151).
There you have it, folks. Don’t worry about recent failed replications. This is just a normal part of science, especially a science that studies fragile, contextually sensitive phenomena. The results from 2008 do not necessarily replicate in 2014, and the results from 2014 may not replicate in 2018. What we need are fewer replications. We need permanent research, because many effects may disappear the moment they are discovered. This is what makes social psychology so exciting. If you want to study stable phenomena that replicate decade after decade, you might as well become a personality psychologist.