The decline effect predicts that effects become weaker over time. It has been proposed as a viable explanation for the replication crisis (Lehrer, 2010). However, evidence for the decline effect has been elusive (Schooler, 2011). One major problem, at least in psychology, is that researchers rarely conduct exact replications of original studies. In recent years, however, psychologists have started to conduct Registered Replication Reports (RRRs), in which several labs replicate an original study as closely as possible. This makes it possible to examine the decline effect, which predicts that original studies have larger effect sizes than replication studies.
One problem is that studies often have small samples and large sampling error. This makes it difficult to interpret observed effect sizes. One solution to this problem is to focus on how extreme the effect size of the original study is relative to the effect sizes in the replication studies. According to the decline effect, effect sizes in original studies should be higher than effect sizes in replication studies. In the most extreme case, the original study would have the largest effect size. If there were 20 studies with identical population effect sizes, the probability that the original study reported the strongest effect would be only 1/20 = .05.
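The 1/20 figure can be checked with a small simulation. This is only a sketch under stated assumptions: all studies are given the same population effect size and the same sampling error, and the values 0.3 and 0.2, the seed, and the number of simulations are arbitrary choices for illustration, not taken from any RRR.

```r
# Sketch: if all studies share one population effect size, how often
# does the "original" study show the largest observed effect size?
set.seed(1)            # arbitrary seed, for reproducibility only
n_studies <- 20        # number of studies, as in the example above
n_sims    <- 20000     # number of simulated sets of studies (assumption)

original_is_largest <- replicate(n_sims, {
  # Observed effect sizes = common true effect + sampling error.
  # mean = 0.3 and sd = 0.2 are illustrative assumptions.
  d <- rnorm(n_studies, mean = 0.3, sd = 0.2)
  which.max(d) == 1    # study 1 plays the role of the original study
})

prop_largest <- mean(original_is_largest)
prop_largest           # should be close to 1 / n_studies = .05
```

The result does not depend on the assumed mean or standard deviation, because every study is equally likely to come out on top when all studies are drawn from the same distribution.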
I sorted the effect sizes of the original study and the replication studies in decreasing order. I then recorded the rank of the original study. R-Code: which(order(d, decreasing=TRUE) == 1) # 1 = index of the original study in d.
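The ranking step can be sketched in R as follows. The effect sizes in d are made-up values for illustration, not data from any RRR, and the original study is assumed to be stored as the first element.

```r
# Hypothetical effect sizes; element 1 is the original study.
d <- c(0.45, 0.10, 0.60, 0.30, 0.52, -0.05)

# order() returns the study indices sorted from largest to smallest
# effect size; which() finds the position of study 1 in that ordering.
rank_of_original <- which(order(d, decreasing = TRUE) == 1)
rank_of_original   # 3: two studies have larger effect sizes
```

With these values the original study ranks third, which corresponds to the situation reported for the second verbal overshadowing RRR.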
The results are shown in Table 1. For 5 out of 6 RRRs, the original study reported the largest effect size. In all of these RRRs, all of the replication studies failed to replicate a significant effect. Only the second verbal overshadowing RRR produced conclusive evidence for an effect. Yet, the effect size reported in the original study was still the third largest out of 24 studies. These results provide strong support for the decline effect.
To examine whether this pattern of results could have occurred by chance, I computed the probability of this outcome under the null-hypothesis that all studies have the same population effect size. The chance of drawing the original study on the first draw is 1/n, where n is the number of studies. The probabilities are very low. For the verbal overshadowing RRR2, the probability of drawing the original study by the third draw is .12 (1 - 23*22*21/(24*23*22) = 3/24). A meta-analysis of the six probabilities with Stouffer’s method provides strong evidence against the null-hypothesis, z = 3.8, p < .0001.
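Table 1. Rank of the original study's effect size among all studies in each RRR.

| RRR | Rank of original study | p |
|---|---|---|
| Verbal Overshadowing RRR1 | 1 out of 33 | p = .03 |
| Verbal Overshadowing RRR2 | 3 out of 24 | p = .12 |
| Ego-depletion | 1 out of 24 | p = .04 |
| Imperfect Action | 1 out of 13 | p = .08 |
| Commitment Forgiveness | 1 out of 17 | p = .06 |
| Facial Feedback | 1 out of 18 | p = .06 |
| Combined | 1 out of 14,1222 | p = 0.00007 |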
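The probabilities in Table 1 and the combined test can be reproduced with a short R sketch. The ranks and numbers of studies are taken from the table; the variable names are my own, and the one-sided form of Stouffer's method used here is an assumption about how the reported z-score was obtained.

```r
# Rank of the original study and number of studies, from Table 1.
rrr <- data.frame(
  name = c("Verbal Overshadowing RRR1", "Verbal Overshadowing RRR2",
           "Ego-depletion", "Imperfect Action",
           "Commitment Forgiveness", "Facial Feedback"),
  rank = c(1, 3, 1, 1, 1, 1),
  n    = c(33, 24, 24, 13, 17, 18)
)

# Under the null hypothesis every rank is equally likely, so the
# probability that the original study ranks this high or higher is
# rank / n (for RRR2: 3/24, the same as 1 - 23*22*21/(24*23*22)).
rrr$p <- rrr$rank / rrr$n

# Stouffer's method (one-sided): convert p-values to z-scores,
# sum them, and divide by the square root of the number of tests.
z_i        <- qnorm(1 - rrr$p)
z_combined <- sum(z_i) / sqrt(length(z_i))
p_combined <- 1 - pnorm(z_combined)

round(z_combined, 1)   # 3.8
p_combined             # below .0001
```

Working from the exact fractions rather than the rounded p-values changes the combined z-score only in the second decimal.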
A test of the decline effect with the data from all Registered Replication Reports provides strong evidence for the hypothesis that effect sizes of original studies are larger and decrease over time.
The same holds for ego-depletion. Initially, performing a difficult task led to a reduction in effort on a second task. But collective consciousness about this effect means that participants are now aware of it and compensate by working harder. This theory is consistent with the fact that the decline effect is pervasive in social psychology, but not in other sciences. For example, the effect of eating cheesecake on weight gain has unfortunately not decreased, as the obesity epidemic shows. Also, computers are getting faster, not slower. Thus, not all cause-effect relationships decline over time.
It is only for cause-effect relationships involving mental processes that collective consciousness can moderate the strength of the relationship. Thus, the collective consciousness hypothesis suggests that the replication crisis in psychology is not a replication crisis, but actually a real phenomenon. The original studies did make a real discovery, but ironically the discovery made the effect disappear.
This study has a number of limitations, and there are alternative explanations for the finding that seminal articles report stronger effect sizes. One possibility is regression to the mean (Fiedler). Regression to the mean implies that an extreme effect size observed in a small sample will not replicate with the same effect size. The next study is more likely to produce a result that is closer to the mean. The problem with this hypothesis is that it does not explain why the mean of replication studies is often very close to zero. Thus, it fails to explain the mysterious disappearance of effects and the elusive nature of findings in social psychology that makes the decline effect so interesting.
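The regression-to-the-mean argument can be illustrated with a small simulation. This is a sketch with made-up numbers: the population effect size of 0.2, the sample size of 20 per group, the cutoff of 0.8, and the seed are all assumptions, and the standard error uses the common approximation sqrt(2/n) for two equal groups.

```r
# Sketch: original studies selected for extreme observed effect sizes
# are followed by replications that fall back toward the population mean.
set.seed(2)              # arbitrary seed, for reproducibility only
true_d  <- 0.2           # assumed population effect size (illustrative)
n       <- 20            # assumed sample size per group
se      <- sqrt(2 / n)   # approximate standard error of d
n_pairs <- 10000         # number of original/replication pairs

d_original    <- rnorm(n_pairs, mean = true_d, sd = se)
d_replication <- rnorm(n_pairs, mean = true_d, sd = se)

# Keep only pairs in which the original study looked very strong.
extreme <- d_original > 0.8

mean_orig_extreme <- mean(d_original[extreme])     # well above 0.8
mean_rep_extreme  <- mean(d_replication[extreme])  # close to true_d
```

In this sketch the replications regress to the assumed population mean of 0.2, not to zero, which is the limitation of the regression account described above.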
Another possible explanation is publication bias. Maybe researchers simply publish results that are consistent with their theories and do not publish disconfirming evidence (Sterling, 1959). However, this explanation does not account for the fact that other studies reported successful results at the time of the original studies. In fact, many of the RRR studies were taken from articles that reported several successful studies. The failure to replicate the effect occurred only several years later, when there had been sufficient time for collective consciousness to make the effect disappear.
Finally, Schooler (personal communication, 2012) proposed an interesting theory. Astrophysicists have calculated that it is very likely that other intelligent life evolved in other parts of the universe long before human evolution. Like humans now, these intelligent life forms became increasingly bored with their limited reality and started building artificially simulated virtual worlds to entertain themselves. At some point, agents in these games were given the illusion of self-consciousness, so that they believe they are real agents with their own goals, feelings, and thoughts. According to this theory, we are not real agents, but virtual agents in a computer game of a much more intelligent life form. Although the simulation software works very well, there are some bugs and glitches that make the simulation behave in strange ways. Often the simulated agents do not notice this, but clever experiments by parapsychologists (Bem, 2011) can sometimes reveal these inconsistencies. Many of the discoveries in social psychology are also caused by these glitches. The effects can be observed for some time, but then a software update makes them disappear. This theory would also explain why original results disappeared in replication studies.
It is difficult to distinguish empirically between the collective consciousness hypothesis and the simulated-world hypothesis. However, the two theories make different predictions about findings that do not enter collective consciousness. A researcher could conduct a study, leave the data unanalyzed, and replicate the study 10 years later. Only then are the results of the two studies analyzed. The collective consciousness hypothesis predicts that there will be no decline effect. The simulated-world hypothesis predicts that the decline effect will emerge. Of course, a single original study is most likely to show no effect because it is very difficult to find original effects that are subject to the decline effect. Thus, the test requires many studies, most of which will not show any effect, but when original studies do show an effect, it will be very interesting to see whether they replicate. If they do not replicate, this provides evidence for the simulated-world hypothesis that we are just simulated agents in a computer game of a life-form much more intelligent than we think we are. So, I propose that social psychologists conduct a series of carefully planned time-lagged replication studies to answer the most fundamental question of humanity. Do we really exist because we think we do, or is it all a big illusion?