SPSP recently published a statement on scientific progress that began: “Science advances largely by correcting errors, and scientific progress involves learning from mistakes. By eliminating errors in methods and theories, we provide a stronger evidentiary basis for science that allows us to better describe events, predict what will happen, and solve problems” (SPSP Board of Directors, 2016). [Quoted from Crandall et al., 2018, PSPB editorial.]
A popular Twitter account called “Shit Academics Say” often posts funny commentaries on academia (see example above).
I borrow the phrase “Shit Academics Say” for this post about the shit social psychologists say with an air of authority and superiority. Social psychologists see themselves as psychological “scientists” who study people and therefore believe that they know people better than you or me. However, their claims are often not based on credible scientific evidence and are merely personal opinions disguised as science.
For example, a popular undergraduate psychology textbook claims that “Hitler had high self-esteem,” citing an article that has been cited over 500 times in the journal Psychological Science in the Public Interest (although the title suggests the journal is written for the general public, it is mostly read by psychologists; the title mainly creates the illusion that they are doing important work that serves the public interest).
At the end of that article, titled “Does High Self-Esteem Cause Better Performance, Interpersonal Success, Happiness, or Healthier Lifestyles?”, the authors write:
“High self-esteem feels good and fosters initiative. It may still prove a useful tool to promote success and virtue, but it should be clearly and explicitly linked to desirable behavior. After all, Hitler had very high self-esteem and plenty of initiative, too, but those were hardly guarantees of ethical behavior.”
In the textbook, this quote is linked to boys who engage in sex at an “inappropriately young age,” which is not further specified (in Canada this would be 14, according to recent statistics).
“High self-esteem does have some benefits—it fosters initiative, resilience, and pleasant feelings (Baumeister & others, 2003). Yet teen males who engage in sexual activity at an “inappropriately young age” tend to have higher than average self-esteem. So do teen gang leaders, extreme ethnocentrists, terrorists, and men in prison for committing violent crimes (Bushman & Baumeister, 2002; Dawes, 1994, 1998). “Hitler had very high self-esteem,” note Baumeister and co-authors (2003).” (Myers, 2011, Social Psychology, 12th edition)
Undergraduate students pay (if they pay; hopefully they do not) $200 to be informed that people with high self-esteem are like sexual deviants, terrorists, violent criminals, and Hitler (maybe we should add scientists with flair to the list).
The problem is that this is not even true. Students who worked with me on fact-checking the textbook found this quote in the original article:
“There was no [!] significant difference in self-esteem scores between violent offenders and non-offenders, Ms = 28.90 and 28.89, respectively, t(7653) = 0.02, p > .9, d = 0.0001.”
[Technical detail you can skip: although the df of the t-test look impressive, the study compared 63 violent offenders to 7,590 unmatched, mostly undergraduate participants (gender not specified, probably mostly female). So the sampling error of this comparison is high, and the theoretical importance of comparing these two groups is questionable.]
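To see why the impressive degrees of freedom are misleading, here is a minimal Python sketch. The group sizes are taken from the comparison quoted above; the standard deviation is set to 1 purely for illustration, so the numbers are multipliers of the SD rather than estimates from the actual data.

```python
import math

# Group sizes from the cited comparison: 63 violent offenders vs. 7590 students
n_offenders, n_students = 63, 7590

# Standard error of a mean difference in SD units: SE = SD * sqrt(1/n1 + 1/n2)
se_actual = math.sqrt(1 / n_offenders + 1 / n_students)   # ~0.127
se_small_group_alone = math.sqrt(1 / n_offenders)         # ~0.126
se_balanced = math.sqrt(1 / 3826 + 1 / 3827)              # same total N, balanced, ~0.023

print(f"SE with 63 vs 7590:        {se_actual:.3f}")
print(f"SE driven by 63 alone:     {se_small_group_alone:.3f}")
print(f"SE with a balanced design: {se_balanced:.3f}")
```

Precision is almost entirely determined by the 63 offenders; the 7,590 comparison participants add very little. The enormous df of the t-test therefore say little about the precision of the comparison.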
How Many Correct Citations Could Be False Positives?
Of course, the example above is an exception. Most of the time, a cited reference contains an empirical finding that is consistent with the textbook claim. However, this does not mean that textbook findings are based on credible and replicable evidence. Even a Nobel Laureate was conned by flashy findings from small samples that could not be replicated (Train Wreck: Fast & Slow).
Until recently it was common to assume that statistical significance ensures that most published results are true positives (i.e., not false-positive random findings). However, this is only the case if all results are reported. It has been known since 1959 that this is not the case in psychology (Sterling, 1959). Psychologists selectively publish only results that support their theories. This practice disables the significance filter that is supposed to keep false positives out of the literature. The claim that results published in social psychology journals were obtained with rigorous research (Crandall et al., 2018) is as bogus as Volkswagen’s diesel tests, and the future of social psychology may be as bleak as the future of diesel engines.
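A quick simulation shows why selective publication disables the significance filter. The numbers below are illustrative assumptions, not estimates from the literature: a true null effect is tested in five independent attempts (e.g., five small studies), and only a significant attempt is reported.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative assumptions: true null hypotheses, 5 independent attempts each,
# and only significant attempts get reported.
n_hypotheses, attempts, alpha = 100_000, 5, 0.05

# Under the null hypothesis, p-values are uniformly distributed between 0 and 1.
p = rng.random((n_hypotheses, attempts))

full_reporting = np.mean(p[:, 0] < alpha)               # one pre-registered test each
selective_reporting = np.mean((p < alpha).any(axis=1))  # report whichever attempt "worked"

print(f"False positives with full reporting:      {full_reporting:.3f}")      # ~0.05
print(f"False positives with selective reporting: {selective_reporting:.3f}") # ~0.23
```

With full reporting, the nominal 5% error rate holds. With selective reporting of the best of five attempts, more than 20% of null effects enter the literature as “significant” findings.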
Jerry Brunner and I developed a statistical tool that can be used to clean up the existing literature. Rather than actually redoing 50 years of research, we use the statistical results reported in original studies to apply a significance filter post-hoc. Our tool is called zcurve. Below I used zcurve to examine the replicability of studies that were used in the chapter that also included the comparison of sexually active teenagers with violent criminals, terrorists, and Hitler.
More detailed information about the interpretation of the graph above is provided elsewhere (link). In short, for each citation in the textbook chapter that is used as evidence for a claim, a team of undergraduate students retrieved the cited article and extracted the main statistical result that matches the textbook claim. These statistical results are then converted into a z-score that reflects the strength of evidence for a claim. Only significant results are important because non-significant results cannot support an empirical claim (although sometimes non-significant results are falsely used to support claims that there is no effect).
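The exact coding procedure is documented elsewhere, but the standard conversion is straightforward: a reported two-sided p-value is transformed into the absolute z-score that would produce it. A minimal sketch (the t-test numbers below are made up for illustration):

```python
from scipy import stats

def p_to_z(p_two_sided: float) -> float:
    """Convert a two-sided p-value into the absolute z-score that would produce it."""
    return stats.norm.isf(p_two_sided / 2)

# Hypothetical example: a reported t(38) = 2.50
p = 2 * stats.t.sf(2.50, df=38)   # two-sided p-value, ~.017
z = p_to_z(p)                     # strength of evidence, ~2.39
print(f"p = {p:.3f}, z = {z:.2f}, significant (z > 1.96): {z > 1.96}")
```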
Zcurve fits a model to the density distribution of significant z-scores (z-scores > 1.96). The shape of this distribution provides information about the probability that a randomly drawn study from the set would replicate (i.e., reproduce a significant result). The grey line shows the distribution predicted by zcurve; it matches the observed density (dark blue) well, and simulation studies show good performance of zcurve. Zcurve estimates that the average replicability of studies in this chapter is 56%. This number would be reassuring if all studies had 56% power: all studies would then be true positives, and roughly every other replication attempt would be successful.
However, reality does not match this rosy scenario; studies vary in replicability. Studies with z-scores greater than 5 have 99% replicability (see the numbers below the x-axis), whereas just-significant results (z < 2.5) have only 21% replicability. As you can see, there are a lot more studies with z < 2.5 than with z > 5, so studies with low replicability outnumber studies with high replicability.
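The gap between just-significant and highly significant results is easy to see with a naive calculation that treats the observed z-score as the true strength of evidence. The sketch below ignores selection for significance, which zcurve corrects for; that correction is why just-significant results get an estimate of only 21% rather than the roughly 60% this formula suggests, while results with z > 5 stay near 99% either way.

```python
from scipy import stats

def naive_replication_probability(z_observed: float, alpha: float = 0.05) -> float:
    """Chance of a significant replication IF the observed z equaled the true
    strength of evidence (no selection bias, same sample size)."""
    z_crit = stats.norm.isf(alpha / 2)          # 1.96 for alpha = .05
    return stats.norm.sf(z_crit - z_observed)   # power of an exact replication

for z in (2.2, 3.0, 5.0):
    print(f"z = {z:.1f}: naive replication probability = "
          f"{naive_replication_probability(z):.3f}")
# z = 2.2 -> ~0.60, z = 3.0 -> ~0.85, z = 5.0 -> ~0.999
```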
The next plot shows model fit (higher numbers = worse fit) for zcurve models in which the proportion of false positives is fixed at a given value. The more inconsistent the data are with a fixed proportion of false positives, the worse the fit (higher numbers).
The graph shows that models with 100%, 90%, or 80% false positives clearly fit the data worse than models with fewer false positives. This shows that some textbook claims are based on solid empirical evidence. However, model fit for models with 0% to 60% false positives looks very similar. Thus, it is possible that the majority of claims in this textbook’s chapter on the self are false positives.
It is even more problematic that textbook claims are often based on a single study with a student sample at one university. Social psychologists have warned repeatedly that their findings are very sensitive to minute variations in studies, which makes it difficult to replicate these effects even under very similar conditions (Van Bavel et al., 2016), and that it is impossible to reproduce exactly the same experimental conditions (Stroebe and Strack, 2014). Thus, the zcurve estimate of 56% replicability is a wildly optimistic estimate of replicability in actual replication studies. In fact, the average replicability of studies in social psychology is only 25% (Open Science Collaboration, 2015).
While social psychologists are currently outraged about a psychologist with too many self-citations, they are silent about the crimes against science committed by social psychologists who produced pseudo-scientific comparisons of sexually active teenagers with Hitler and questionable claims that high self-esteem is a sign of pathology. Maybe social psychologists should spend less time criticizing others and more time reflecting on their own errors.
In official statements and editorials, social psychologists are talking the talk.
However, they are still not walking the walk. Seven years ago, Simmons et al. (2011) published an article called “False-Positive Psychology” that shocked psychologists and raised concerns about the credibility of textbook findings. One year later, Nobel Laureate Daniel Kahneman wrote an open letter to star social psychologist John Bargh, urging him to clean up social psychology. Nothing happened. Instead, John Bargh published a popular book in 2017 that does not mention any of the concerns about the replicability of social psychology in general or his work in particular. Denial is no longer acceptable. It is time to walk the walk and to get rid of pseudo-science in journals and in textbooks.
Hey, it’s spring. What better time to get started on a major house cleaning?