Abstract
Mirroring clinical guidelines, recent Performance Validity Test (PVT) research emphasizes using ≥ 2 criterion PVTs to optimally identify validity groups when validating/cross-validating PVTs; however, even with multiple measures, the effect of which specific PVTs are used as criterion measures remains incompletely explored. This study investigated the accuracy of varying two-PVT combinations for establishing validity status and how adding a third PVT or applying more liberal failure cut-scores affects overall false-positive (FP)/-negative (FN) rates. Clinically referred veterans (N = 114; 30% clinically identified as invalid) who completed a six-PVT protocol during their evaluation were included. Concordance rates were calculated across all possible two- and three-PVT combinations at conservative and liberal cutoffs. Two-PVT combinations classified 72–91% of valid (0–4% FPs) and 17–74% of invalid (0–40% FNs) cases, and three-PVT combinations classified 67–86% of valid (0–6% FPs) and 57–97% of invalid (0–24% FNs) cases at conservative cutoffs. Liberal cutoffs classified 53–86% of valid (0–15% FPs) and 39–82% of invalid (0–30% FNs) cases for two-PVT combinations and 46–75% of valid (3–27% FPs) and 60–97% of invalid (0–17% FNs) cases for three-PVT combinations. Irrespective of whether a two- or three-PVT combination or conservative/liberal cutoffs were used, many valid and invalid cases failed only one PVT (3–68%). Two-PVT combinations produced high FNs and were less accurate than three-PVT combinations for detecting invalid cases, though accuracy varied within both types of combinations based on the specific PVTs included. Thus, both PVT quantity and quality are important for accurate validity classification in research studies to ensure reliability and replicability of findings. Applying more liberal cutoffs yielded increased sensitivity, but with generally higher FP rates yielding problematic specificity, particularly for three-PVT combinations.
Notes
One reviewer remarked that a single PVT failure using liberal cutoffs is not equivocal, but rather should be considered valid. While this position is tenable in a context in which multiple PVTs are administered, we maintain that when only two PVTs are given, one failure is, by definition, equivocal in that other extra-test data would ultimately need to be considered to establish validity status. Our objective data clearly demonstrated that even when liberal cutoffs were applied, for two-PVT combinations, mean failure rates of 1/2 PVTs were 35% for invalid cases and 28% for valid cases (Tables 4 and 5). Thus, nearly 30% of invalid cases would be misclassified if cases with one failure were automatically classified as valid. Additionally, in response to the significant increase in false positives for the three-PVT combinations using liberal cutoffs (i.e., > 10% on 7/19 combinations), the reviewer suggested that the solution to improve specificity (i.e., ≥ 90%; Boone, 2012) was to raise the invalidity threshold to 3/3 PVT failures. However, this approach is not consistent with current practice standards, in which ≥ 2 failures is the generally accepted benchmark for identifying probable invalidity (Larrabee, 2014; Meyers & Volbrecht, 2003), and would result in an unacceptable decrease in overall mean sensitivity from 79 to 35% for identifying invalid cases, whereas using conservative cutoffs and retaining the well-established ≥ 2 failures benchmark yielded 72% mean sensitivity while maintaining 97% mean specificity and 0/19 combinations with a false-positive rate above 6% (see Tables 6 and 7).
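The threshold trade-off described above can be illustrated with a minimal sketch. The data, counts, and the `classification_rates` helper below are entirely hypothetical (they are not the study's data); the sketch only shows the general mechanism by which raising the failure threshold (e.g., from ≥ 2/3 to 3/3 PVT failures) increases specificity at the cost of sensitivity:

```python
# Hypothetical illustration (not the study's data): each case is a tuple of
# (truly_invalid, number_of_PVTs_failed_out_of_3).
cases = [
    (True, 3), (True, 2), (True, 2), (True, 1),    # clinically invalid cases
    (False, 0), (False, 0), (False, 1), (False, 2) # clinically valid cases
]

def classification_rates(cases, threshold):
    """Classify a case as invalid when failures >= threshold;
    return (sensitivity, specificity)."""
    tp = sum(1 for inv, f in cases if inv and f >= threshold)
    fn = sum(1 for inv, f in cases if inv and f < threshold)
    tn = sum(1 for inv, f in cases if not inv and f < threshold)
    fp = sum(1 for inv, f in cases if not inv and f >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

for t in (1, 2, 3):
    sens, spec = classification_rates(cases, t)
    print(f">= {t}/3 failures: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up cases, the ≥ 1/3 rule is perfectly sensitive but misclassifies half of the valid cases, the 3/3 rule is perfectly specific but misses most invalid cases, and the conventional ≥ 2/3 benchmark balances the two — mirroring, in schematic form, the pattern the Notes describe.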
References
Alverson, W. A., O’Rourke, J. J. F., & Soble, J. R. (2019). The word memory test genuine memory impairment profile discriminates genuine memory impairment from invalid performance in a mixed clinical sample with cognitive impairment. The Clinical Neuropsychologist.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders - fifth edition (DSM-5). Washington, DC: American Psychiatric Publishing.
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: a comparison of failure rates across tests and cutoffs. The Clinical Neuropsychologist, 31, 193–206.
Armistead-Jehle, P., Cooper, D. B., & Vanderploeg, R. D. (2016). The role of performance validity tests in the assessment of cognitive functioning after military concussion: a replication and extension. Applied Neuropsychology: Adult, 23(4), 264–273.
Armistead-Jehle, P., Soble, J. R., Cooper, D. C., & Belanger, H. G. (2017). Unique aspects of TBI in military populations. [Special issue]. Physical Medicine & Rehabilitation Clinics of North America, 28, 323–337.
Bailey, K. C., Soble, J. R., Bain, K. M., & Fullen, C. (2018a). Embedded performance validity tests in the Hopkins verbal learning test – revised and the brief visuospatial memory test – revised: a replication study. Archives of Clinical Neuropsychology, 33, 895–900.
Bailey, K. C., Soble, J. R., & O’Rourke, J. J. F. (2018b). Clinical utility of the Rey 15-Item Test, recognition trial, and error scores for detecting noncredible neuropsychological performance validity in a mixed-clinical sample of veterans. The Clinical Neuropsychologist, 32, 119–131.
Bain, K. M., & Soble, J. R. (2019). Validation of the advanced clinical solutions word choice test (WCT) in a mixed clinical sample: establishing classification accuracy, sensitivity/specificity, and cutoff scores. Assessment, 26, 1320–1328.
Bain, K. M., Soble, J. R., Webber, T. A., Messerly, J. M., Bailey, K. C., Kirton, J. W., & McCoy, K. J. M. (2019). Cross-validation of three advanced clinical solutions performance validity tests: examining combinations of measures to maximize classification of invalid performance. Applied Neuropsychology: Adult.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examinations. The Clinical Neuropsychologist, 23, 729–741.
Boone, K. B. (2012). Clinical practice of forensic neuropsychology. New York: Guilford Press.
Boone, K., Lu, P., & Herzberg, D. S. (2002). The dot counting test manual. Los Angeles: Western Psychological Services.
Critchfield, E. A., Soble, J. R., Marceaux, J. C., Bain, K. M., Bailey, K. C., Webber, T. A., et al. (2019). Cognitive impairment does not cause performance validity failure: analyzing performance patterns among unimpaired, impaired, and noncredible participants across six tests. The Clinical Neuropsychologist, 6, 1083–1101.
Denning, J. H. (2012). The efficiency and accuracy of the Test of Memory Malingering trial 1, errors on the first 10 items of the Test of Memory malingering, and five embedded measures in predicting invalid test performance. Archives of Clinical Neuropsychology, 27(4), 417–432.
Erdodi, L. A. (2019). Aggregating validity indicators: the salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Applied Neuropsychology: Adult, 26(2), 155–172.
Erdodi, L. A., Kirsch, N. L., Lajiness-O’Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in a mixed clinical sample: are they equivalent? Psychological Injury and Law, 7, 255–263.
Fazio, R. L., Faris, A. N., & Yamout, K. Z. (2019). Use of the Rey 15-Item Test as a performance validity test in an elderly population. Applied Neuropsychology: Adult, 26, 28–35.
Gasquoine, P. G., Weimer, A. A., & Amador, A. (2017). Specificity rates for non-clinical, bilingual, Mexican Americans on three popular performance validity measures. The Clinical Neuropsychologist, 31(3), 587–597. https://doi.org/10.1080/13854046.2016.1277786.
Green, P. (2003). Green’s word memory test for windows: user’s manual. Edmonton: Green’s Publishing.
Green, P., Montijo, J., & Brockhaus, R. (2011). High specificity of the Word Memory Test and Medical Symptom Validity Test in groups with severe verbal memory impairment. Applied Neuropsychology, 18(2), 86–94.
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6(3), 218–224.
Grills, C. E., & Armistead-Jehle, P. (2016). Performance validity test and neuropsychological assessment battery screening module performances in an active-duty sample with a history of concussion. Applied Neuropsychology: Adult, 23(4), 295–301.
Larrabee, G. J. (2008). Aggregation across multiple indicators improves the detection of malingering: relationship to likelihood ratios. The Clinical Neuropsychologist, 22(4), 666–679.
Larrabee, G. J. (2014). False-positive rates associated with the use of multiple performance and symptom validity tests. Archives of Clinical Neuropsychology, 29(4), 364–373.
Larrabee, G. J., Rohling, M. L., & Meyers, J. E. (2019). Use of multiple performance and symptom validity measures: determining the optimal per test cutoff for determination of invalidity, analysis of skew, and inter-test correlations in valid and invalid performance groups. The Clinical Neuropsychologist, 1–19.
Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological assessment (5th ed.). Oxford: Oxford University Press.
Lippa, S. M. (2018). Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature. The Clinical Neuropsychologist, 32(3), 391–421.
Loring, D. W., Goldstein, F. C., Chen, C., Drane, D. L., Lah, J. J., Zhao, L., & Larrabee, G. J. (2016). False-positive error rates for Reliable Digit Span and Auditory Verbal Learning Test performance validity measures in amnestic mild cognitive impairment and early Alzheimer disease. Archives of Clinical Neuropsychology, 31(4), 313–331.
Martin, P. K., Schroeder, R. W., Olsen, D. H., Maloy, H., Boettcher, A., Ernst, N., & Okut, H. (2019). A systematic review and meta-analysis of the Test of Memory Malingering in adults: two decades of deception detection. The Clinical Neuropsychologist. https://doi.org/10.1080/13854046.2019.1637027.
Martin, P. K., Schroeder, R. W., & Odland, A. P. (2015). Neuropsychologists’ validity testing beliefs and practices: a survey of North American professionals. The Clinical Neuropsychologist, 29(6), 741–776.
Meyers, J. E., Miller, R. M., Thompson, L. M., Scalese, A. M., Allred, B. C., Rupp, Z. W., Dupaix, Z. P., & Junghyun Lee, A. (2014). Using likelihood ratios to detect invalid performance with performance validity measures. Archives of Clinical Neuropsychology, 29(3), 224–235.
Meyers, J. E., & Volbrecht, M. (2003). A validation of multiple malingering detection methods in a large clinical sample. Archives of Clinical Neuropsychology, 18, 261–276.
Novitski, J., Steele, S., Karantzoulis, S., & Randolph, C. (2012). The repeatable battery for the assessment of neuropsychological status effort scale. Archives of Clinical Neuropsychology, 27, 190–195.
Pearson. (2009). Advanced clinical solutions for WAIS-IV and WMS-IV: clinical and interpretive manual. San Antonio: Pearson.
Poreh, A., Bezdicek, O., Korobkova, I., Levin, J. B., & Dines, P. (2016). The Rey Auditory Verbal Learning Test forced-choice recognition task: base-rate data and norms. Applied Neuropsychology: Adult, 23, 155–161.
Poynter, K., Boone, K. B., Ermshar, A., Miora, D., Cottingham, M., Victor, T. L., Ziegler, E., Zeller, M. A., & Wright, M. (2019). Wait, there’s a baby in this bath water! Update on quantitative and qualitative cut-offs for Rey 15-Item Recall and Recognition. Archives of Clinical Neuropsychology, 34, 1367–1380.
Rai, J. K., & Erdodi, L. A. (2019). Impact of criterion measures on the classification accuracy of TOMM-1. Applied Neuropsychology: Adult.
Rey, A. (1964). L’examen Clinique en psychologie. Paris: Presses Universitaires de France.
Schroeder, R. W., Martin, P. K., Heinrichs, R. J., & Baade, L. E. (2019). Research methods in performance validity testing studies: criterion grouping approach impacts study outcomes. The Clinical Neuropsychologist, 33, 466–477.
Schroeder, R. W., Twumasi-Ankrah, P., Baade, L. E., & Marshall, P. S. (2012). Reliable digit span: a systematic review and cross-validation study. Assessment, 19(1), 21–30.
Schwartz, E. S., Erdodi, L., Rodriguez, N., Ghosh, J. J., Curtain, J. R., Flashman, L. A., & Roth, R. M. (2016). CVLT-II forced choice recognition trial as an embedded validity indicator: a systematic review of the evidence. Journal of the International Neuropsychological Society, 22, 851–858.
Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. (2007). An effort index for the Repeatable Battery For The Assessment Of Neuropsychological Status (RBANS). The Clinical Neuropsychologist, 21(5), 841–854.
Slick, D. J., Sherman, E. M., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13(4), 545–561.
Slick, D. J., Tan, J. E., Strauss, E. H., & Hultsch, D. F. (2004). Detecting malingering: a survey of experts’ practices. Archives of Clinical Neuropsychology, 19(4), 465–473.
Tombaugh, T. N. (1996). Test of memory malingering (TOMM). North Tonawanda: Multi Health Systems.
Webber, T. A., Bailey, K. C., Alverson, W. A., Critchfield, E. A., Bain, K. M., Messerly, J. M., et al. (2018a). Further validation of the Test of Memory Malingering (TOMM) Trial 1: examination of false positives and convergence with other validity measures. Psychological Injury and Law, 11, 325–335.
Webber, T. A., Critchfield, E. A., & Soble, J. R. (2018b). Convergent, discriminant, and concurrent validity of non-memory-based performance validity tests. Assessment.
Webber, T. A., Marceaux, J. C., Critchfield, E. A., & Soble, J. R. (2018c). Relative impacts of mild and major neurocognitive disorder on rate of verbal learning acquisition. Archives of Clinical Neuropsychology.
Webber, T. A., & Soble, J. R. (2018). Utility of various WAIS-IV digit span indices for identifying noncredible performance validity among cognitively impaired and unimpaired examinees. The Clinical Neuropsychologist, 32(4), 657–670.
Wechsler, D. (2008). WAIS-IV: administration and scoring manual. San Antonio: Pearson.
Weinstein, S., Obuchowski, N. A., & Lieber, M. L. (2005). Clinical evaluation of diagnostic tests. American Journal of Roentgenology, 184(1), 141–149.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Disclaimer
The views expressed herein are those of the authors and do not necessarily reflect the views or the official policy of the Department of Veterans Affairs or US Government.
Cite this article
Soble, J.R., Alverson, W.A., Phillips, J.I. et al. Strength in Numbers or Quality over Quantity? Examining the Importance of Criterion Measure Selection to Define Validity Groups in Performance Validity Test (PVT) Research. Psychol. Inj. and Law 13, 44–56 (2020). https://doi.org/10.1007/s12207-019-09370-w