William A. Cunningham, Kristopher J. Preacher, and Mahzarin R. Banaji. (2001).
Implicit Attitude Measures: Consistency, Stability, and Convergent Validity, Psychological Science, 12(2), 163-170.
In recent years, several techniques have been developed to measure implicit social cognition. Despite their increased use, little attention has been devoted to their reliability and validity. This article undertakes a direct assessment of the interitem consistency, stability, and convergent validity of some implicit attitude measures. Attitudes toward blacks and whites were measured on four separate occasions, each 2 weeks apart, using three relatively implicit measures (response window evaluative priming, the Implicit Association Test, and the response-window Implicit Association Test) and one explicit measure (Modern Racism Scale). After correcting for interitem inconsistency with latent variable analyses, we found that (a) stability indices improved and (b) implicit measures were substantially correlated with each other, forming a single latent factor. The psychometric properties of response-latency implicit measures have greater integrity than recently suggested.
Critique of Original Article
This article has been cited 362 times (Web of Science, January 2017). It remains one of the most rigorous evaluations of the psychometric properties of the race Implicit Association Test (IAT). As noted in the abstract, the strength of the study is the use of several implicit measures and the repeated measurement of attitudes on four separate occasions. This design makes it possible to separate several variance components in the race IAT. First, it is possible to examine how much variance is explained by causal factors that are stable over time and shared by implicit and explicit attitude measures. Second, it is possible to measure the amount of variance that is unique to the IAT. As this component is not shared with other implicit measures, this variance can be attributed to systematic measurement error that is stable over time. A third variance component is variance that is shared only with other implicit measures and that is stable over time. This variance component could reflect stable implicit racial attitudes. Finally, it is possible to identify occasion-specific variance in attitudes. This component would reveal systematic changes in implicit attitudes.
The original article presents a structural equation model that makes it possible to identify some of these variance components. However, the model is not ideal for this purpose, and the authors do not test some of these variance components. For example, the model does not include any occasion-specific variation in attitudes. This could be because attitudes do not vary over the one-month interval of the study, or it could mean that the model failed to specify this variance component.
This reanalysis also challenges the claim by the original authors that they provided evidence for a dissociation of implicit and explicit attitudes. “We found a dissociation between implicit and explicit measures of race attitude: Participants simultaneously self-reported nonprejudiced explicit attitudes toward black Americans while showing an implicit difficulty in associating black with positive attributes” (p. 169). The main problem is that the design does not support this claim because the study included only a single explicit racism measure. Consequently, it is impossible to determine whether unique variance in the explicit measure reflects systematic measurement error in explicit attitude measures (socially desirable responding, acquiescence response styles) or whether this variance reflects consciously accessible attitudes that are distinct from implicit attitudes. In this regard, the authors’ claim that “a single-factor solution does not fit the data” (p. 170) is inconsistent with their own structural equation model, which shows a single second-order factor that explains the covariance among the three implicit measures and the explicit measure.
The authors caution that a single IAT measure is not very reliable, but their statement about reliability is vague. “Our analyses of implicit attitude measures suggest that the degree of measurement error in response-latency measures can be substantial; estimates of Cronbach’s alpha indicated that, on average, more than 30% of the variance associated with the measurements was random error.” (p. 160). “More than 30%” random measurement error is compatible with reliability anywhere between 0% and 70%. The respective parameter estimates for the IAT in Figure 4 are .53^2 = .28, .65^2 = .42, .74^2 = .55, and .38^2 = .14. These reliability estimates vary considerably due to the small sample size, but the loading of the first IAT would suggest that only 28% of the variance in a single IAT is reliable. As reliability is the upper limit for validity, this would imply that no more than about 28% of the variance in a single IAT captures variation in implicit racial attitudes.
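The step from a standardized loading to a share of reliable variance is just squaring. A minimal sketch of that arithmetic, using the four IAT loadings quoted above (the function name `reliable_share` is illustrative, not from the original article):

```python
# Standardized loadings of the four IAT occasions, as quoted in the text
loadings = [0.53, 0.65, 0.74, 0.38]


def reliable_share(loading):
    """Share of variance explained by the factor for a standardized loading."""
    return loading ** 2


# Squared loadings, rounded to two decimals as in the text
shares = [round(reliable_share(value), 2) for value in loadings]

for occasion, (value, share) in enumerate(zip(loadings, shares), start=1):
    print(f"IAT occasion {occasion}: loading {value:.2f}, reliable variance {share:.2f}")
```

The spread of these four values illustrates how unstable the individual estimates are in a small sample.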
The authors caution readers about the use of a single IAT to measure implicit attitudes. “When using latency-based measures as indices of individual differences, it may be essential to employ analytic techniques, such as covariance structure modeling, that can separate measurement error from a measure of individual differences. Without such analyses, estimates of relationships involving implicit measures may produce misleading null results” (p. 169). However, the authors fail to mention that the low reliability of a single IAT also has important implications for the use of the IAT to assess implicit prejudice. Given this low estimate of validity, users of the Harvard website that provides information about individuals’ performance on the IAT should be warned that the feedback is neither reliable nor valid by conventional standards for psychological tests.
Reanalysis of Published Correlation Matrix
The Table below reproduces the correlation matrix. The standard deviations in the last row are rescaled to avoid rounding problems. This has no effect on the results.
.78 .82 1
.76 .77 .86 1
.21 .15 .15 .14 1
.13 .14 .10 .08 .31 1
.16 .26 .23 .20 .42 .50 1
.14 .17 .16 .13 .16 .33 .17 1
.20 .16 .19 .26 .33 .11 .23 .07 1
.26 .29 .18 .19 .20 .27 .36 .29 .26 1
.35 .33 .34 .25 .28 .29 .34 .33 .36 .39 1
.19 .17 .08 .07 .12 .25 .30 .14 .01 .17 .24 1
.00 .11 .07 .04 .27 .18 .19 .02 .03 .01 .02 .07 1
.16 .08 .04 .08 .26 .27 .24 .22 .14 .32 .32 .17 .13 1
.12 .01 .02 .07 .13 .19 .18 .00 .02 .00 .11 .04 .17 .30 1
.33 .18 .26 .31 .14 .24 .31 .15 .22 .20 .27 .04 .01 .48 .42 1
SD 0.84 0.82 0.88 0.86 2.2066 1.2951 1.0130 0.9076 1.2 1.0 1.1 1.0 0.7 0.8 0.8 0.9
Variables 1-4 = Modern Racism Scale (occasions 1-4); 5-8 = Implicit Association Test (occasions 1-4); 9-12 = Response-Window IAT (occasions 1-4); 13-16 = Response-Window Evaluative Priming (occasions 1-4)
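Structural equation software usually works from a covariance matrix, which can be rebuilt from correlations and standard deviations. The following is a minimal sketch of that conversion, not the code used for the reanalysis. It uses only the IAT block (variables 5-8), whose correlations and standard deviations are read from the table as printed; the helper name `corr_to_cov` is an assumption made for illustration.

```python
# Correlations among the four IAT occasions (variables 5-8 in the table)
iat_corr = [
    [1.00, 0.31, 0.42, 0.16],
    [0.31, 1.00, 0.50, 0.33],
    [0.42, 0.50, 1.00, 0.17],
    [0.16, 0.33, 0.17, 1.00],
]

# Rescaled standard deviations for the same four variables (SD row of the table)
iat_sd = [2.2066, 1.2951, 1.0130, 0.9076]


def corr_to_cov(corr, sd):
    """Convert a correlation matrix to a covariance matrix: cov_ij = r_ij * sd_i * sd_j."""
    n = len(sd)
    return [[corr[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]


iat_cov = corr_to_cov(iat_corr, iat_sd)

# Average correlation between IAT scores from different occasions
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
mean_retest_r = sum(iat_corr[i][j] for i, j in pairs) / len(pairs)

print(f"Mean retest correlation of the IAT: {mean_retest_r:.3f}")
```

The diagonal of `iat_cov` holds the variances (squared standard deviations), and the off-diagonal entries are the covariances that a model would be fitted to.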
Fitting the data to the original model reproduced the original results. I then fitted the data to a model with a single attitude factor (see Figure 1). The model also allowed for measure-specific variances. An initial model showed no significant measure-specific variances for the two versions of the IAT. Hence, these method factors were not included in the final model. To control for variance that is clearly consciously accessible, I modeled the relationship between the explicit factor and the attitude factor as a path from the explicit factor to the attitude factor. This path should not be interpreted as a causal relationship in this case. Rather, the path can be used to estimate how much of the variance in the attitude factor is explained by consciously accessible information that influences the explicit measure. In this model, the residual variance is variation that is shared among implicit measures, but not with the explicit measure.
The model had good fit to the data. I then imposed constraints on factor loadings. The constrained model had better fit than the unconstrained model (delta AIC = 4.60, delta BIC = 43.53). The main finding is that the standard IAT had a loading of .55 on the attitude factor. The indirect path from the implicit attitude factor to a single IAT measure is only slightly smaller, .55*.92 = .51. The 95% CI for this parameter ranged from .41 to .60. The upper bound of the 95% CI would imply that at most 36% of the variance in a single IAT reflects implicit racial attitudes. However, it is important to note that the model in Figure 1 assumes that the Modern Racism Scale is a perfectly valid measure of consciously accessible attitudes. Any systematic measurement error in the Modern Racism Scale would reduce the amount of variance in the attitude factor that reflects unconscious factors. Again, the lack of multiple explicit measures makes it impossible to separate systematic measurement error from valid variance in explicit measures. Thus, the amount of variance in a single IAT that reflects unconscious racial attitudes can range from 0 to 36%.
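The path arithmetic in this paragraph can be retraced in a few lines. This is a sketch under the assumption that all coefficients are standardized, so that multiplying paths gives the indirect effect and squaring it gives a share of variance. The variable names are illustrative, and because the inputs are rounded to two decimals, the results can differ from the reported percentages by about a point.

```python
# Values quoted in the text (standardized coefficients, rounded to two decimals)
loading_on_attitude = 0.55        # single IAT on the attitude factor
implicit_to_attitude = 0.92       # implicit (residual) component to the attitude factor
ci_indirect = (0.41, 0.60)        # reported 95% CI for the indirect path

# Indirect path from the implicit component to a single IAT
indirect_path = loading_on_attitude * implicit_to_attitude

# Squaring a standardized path gives the share of variance it accounts for
variance_share = indirect_path ** 2
ci_variance_share = tuple(bound ** 2 for bound in ci_indirect)

print(f"Indirect path: {indirect_path:.2f}")
print(f"Variance in a single IAT attributed to implicit attitudes: {variance_share:.0%}")
print(f"Implied range: {ci_variance_share[0]:.0%} to {ci_variance_share[1]:.0%}")
```

The upper bound of the squared interval is what yields the "at most 36%" figure.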
How Variable are Implicit Racial Attitudes?
The design included repeated measurement of implicit attitudes on four occasions. If recent experiences influence implicit attitudes, we would expect that implicit measures of attitudes on the same occasion are more highly correlated with each other than implicit measures taken on different occasions. Given the low validity of implicit attitude measures, I examined this question with constrained parameters. By estimating a single parameter, the model has more power to reveal a consistent relationship between implicit measures that were obtained during the same testing session. Neither the two IATs nor the IAT and the evaluative priming task (EP) showed significant occasion-specific variance. Although this result may be due to low power to detect occasion-specific variation, it suggests that most of the variance in an IAT is due to stable variation and random measurement error.
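The logic of this test can be illustrated descriptively with the published correlations. The sketch below is not the constrained structural equation model described above. It only compares the average same-occasion correlation between the IAT and the response-window IAT with the average different-occasion correlation, using values read from the table as printed.

```python
# Correlations between the response-window IAT (rows, occasions 1-4; variables 9-12)
# and the standard IAT (columns, occasions 1-4; variables 5-8), read from the table
cross_corr = [
    [0.33, 0.11, 0.23, 0.07],
    [0.20, 0.27, 0.36, 0.29],
    [0.28, 0.29, 0.34, 0.33],
    [0.12, 0.25, 0.30, 0.14],
]

n_occasions = len(cross_corr)

# Same-occasion correlations sit on the diagonal of the cross-measure block
same_occasion = [cross_corr[t][t] for t in range(n_occasions)]

# Different-occasion correlations are all off-diagonal entries
different_occasion = [
    cross_corr[i][j]
    for i in range(n_occasions)
    for j in range(n_occasions)
    if i != j
]

mean_same = sum(same_occasion) / len(same_occasion)
mean_different = sum(different_occasion) / len(different_occasion)

print(f"Mean same-occasion r: {mean_same:.3f}")
print(f"Mean different-occasion r: {mean_different:.3f}")
```

Occasion-specific variance would show up as a clearly higher same-occasion average. A simple comparison of means like this carries no significance test, so it only conveys the direction and rough size of the difference.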
Cunningham et al. (2001) conducted a rigorous psychometric study of the Implicit Association Test. The original article reported results that could be reproduced. The authors correctly interpret their results as evidence that a single IAT has low reliability. However, they falsely imply that their results provide evidence that the IAT and other implicit measures are valid measures of an implicit form of racism that is not consciously accessible. My new analysis shows that their results are consistent with this hypothesis only if one assumes that the Modern Racism Scale is a perfectly valid measure of consciously accessible racial attitudes. Under this assumption, about 25% (95% CI: 16% to 36%) of the variance in a single IAT would reflect implicit attitudes. However, it is rather unlikely that the Modern Racism Scale is a perfect measure of explicit racial attitudes, and the amount of variance in performance on the IAT that reflects unconscious racism is likely to be smaller. Another important finding that was implicit, but not explicitly mentioned, in the original model is that there is no evidence for situation-specific variation in implicit attitudes. At least over the one-month period of the study, racial attitudes remained stable and did not vary as a function of naturally occurring events that might influence racial attitudes (e.g., positive or negative intergroup contact). This finding may explain why experimental manipulations of implicit attitudes also often produce very small effects (Joy-Gaba & Nosek, 2010).
One surprising finding was that the IAT showed no systematic measurement error in this model. This would imply that repeated measures of the IAT could be used to measure racial attitudes with high validity. Unfortunately, most studies with the IAT rely on a single testing situation and ignore that most of the variance in a single IAT is measurement error. To improve research on racial attitudes and prejudice, social psychologists should use multiple explicit and implicit measures and use structural equation models to examine which variance components of a measurement model of racial attitudes predict actual behavior.