Abstract
Mirroring clinical guidelines, recent Performance Validity Test (PVT) research emphasizes using ≥ 2 criterion PVTs to optimally identify validity groups when validating/cross-validating PVTs; however, even with multiple measures, the effect of which specific PVTs are used as criterion measures remains incompletely explored. This study investigated the accuracy of varying two-PVT combinations for establishing validity status and how adding a third PVT or applying more liberal failure cut-scores affects overall false-positive (FP)/-negative (FN) rates. Clinically referred veterans (N = 114; 30% clinically identified as invalid) who completed a six-PVT protocol during their evaluation were included. Concordance rates were calculated across all possible two- and three-PVT combinations at conservative and liberal cutoffs. Two-PVT combinations classified 72–91% of valid (0–4% FPs) and 17–74% of invalid (0–40% FNs) cases, and three-PVT combinations classified 67–86% of valid (0–6% FPs) and 57–97% of invalid (0–24% FNs) cases at conservative cutoffs. Liberal cutoffs classified 53–86% of valid (0–15% FPs) and 39–82% of invalid (0–30% FNs) cases for two-PVT combinations and 46–75% of valid (3–27% FPs) and 60–97% of invalid (0–17% FNs) cases for three-PVT combinations. Irrespective of whether a two- or three-PVT combination or conservative/liberal cutoffs were used, many valid and invalid cases failed only one PVT (3–68%). Two-PVT combinations produced high FNs and were less accurate than three-PVT combinations for detecting invalid cases, though accuracy varied within both types of combinations based on the specific PVTs included. Thus, both PVT quantity and quality are important for accurate validity classification in research studies to ensure reliability and replicability of findings. Applying more liberal cutoffs yielded increased sensitivity, but with generally higher FP rates yielding problematic specificity, particularly for three-PVT combinations.
Notes
One reviewer remarked that a single PVT failure using liberal cutoffs is not equivocal, but rather should be considered valid. While this position is tenable in a context in which multiple PVTs are administered, we maintain that when only two PVTs are given, one failure is, by definition, equivocal in that other extra-test data would ultimately need to be considered to establish validity status. Our objective data clearly demonstrated that even when liberal cutoffs were applied, for two-PVT combinations, mean failure rates of 1/2 PVTs were 35% for invalid cases and 28% for valid cases (Tables 4 and 5). Thus, nearly 30% of invalid cases would be misclassified if cases with one failure were automatically classified as valid. Additionally, in response to the significant increase in false positives for the three-PVT combinations using liberal cutoffs (i.e., > 10% on 7/19 combinations), the reviewer suggested that the solution to improve specificity (i.e., ≥ 90%; Boone, 2012) was to raise the invalidity threshold to 3/3 PVT failures. However, this approach is not consistent with current practice standards, in which ≥ 2 failures is the generally accepted benchmark for identifying probable invalidity (Larrabee, 2014; Meyers & Volbrecht, 2003), and would result in an unacceptable decrease in overall mean sensitivity from 79 to 35% for identifying invalid cases, whereas using conservative cutoffs and retaining the well-established ≥ 2 failures benchmark yielded 72% mean sensitivity while maintaining 97% mean specificity and 0/19 combinations with a false-positive rate above 6% (see Tables 6 and 7).
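The threshold trade-off described above can be illustrated with a minimal sketch. The data, counts, and the `classification_rates` helper below are entirely hypothetical (they are not the study's data); the sketch only shows the general mechanism by which raising the failure threshold (e.g., from ≥ 2/3 to 3/3 PVT failures) increases specificity at the cost of sensitivity:

```python
# Hypothetical illustration (not the study's data): each case is a tuple of
# (truly_invalid, number_of_PVTs_failed_out_of_3).
cases = [
    (True, 3), (True, 2), (True, 2), (True, 1),    # clinically invalid cases
    (False, 0), (False, 0), (False, 1), (False, 2) # clinically valid cases
]

def classification_rates(cases, threshold):
    """Classify a case as invalid when failures >= threshold;
    return (sensitivity, specificity)."""
    tp = sum(1 for inv, f in cases if inv and f >= threshold)
    fn = sum(1 for inv, f in cases if inv and f < threshold)
    tn = sum(1 for inv, f in cases if not inv and f < threshold)
    fp = sum(1 for inv, f in cases if not inv and f >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

for t in (1, 2, 3):
    sens, spec = classification_rates(cases, t)
    print(f">= {t}/3 failures: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these made-up cases, the ≥ 1/3 rule is perfectly sensitive but misclassifies half of the valid cases, the 3/3 rule is perfectly specific but misses most invalid cases, and the conventional ≥ 2/3 benchmark balances the two — mirroring, in schematic form, the pattern the Notes describe.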
References
Alverson, W. A., O’Rourke, J. J. F., & Soble, J. R. (2019). The word memory test genuine memory impairment profile discriminates genuine memory impairment from invalid performance in a mixed clinical sample with cognitive impairment. The Clinical Neuropsychologist.
American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders - fifth edition (DSM-5). Washington, DC: American Psychiatric Publishing.
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: a comparison of failure rates across tests and cutoffs. The Clinical Neuropsychologist, 31, 193–206.
Armistead-Jehle, P., Cooper, D. B., & Vanderploeg, R. D. (2016). The role of performance validity tests in the assessment of cognitive functioning after military concussion: a replication and extension. Applied Neuropsychology: Adult, 23(4), 264–273.
Armistead-Jehle, P., Soble, J. R., Cooper, D. C., & Belanger, H. G. (2017). Unique aspects of TBI in military populations. [Special issue]. Physical Medicine & Rehabilitation Clinics of North America, 28, 323–337.
Bailey, K. C., Soble, J. R., Bain, K. M., & Fullen, C. (2018a). Embedded performance validity tests in the Hopkins verbal learning test – revised and the brief visuospatial memory test – revised: a replication study. Archives of Clinical Neuropsychology, 33, 895–900.
Bailey, K. C., Soble, J. R., & O’Rourke, J. J. F. (2018b). Clinical utility of the Rey 15-Item Test, recognition trial, and error scores for detecting noncredible neuropsychological performance validity in a mixed-clinical sample of veterans. The Clinical Neuropsychologist, 32, 119–131.
Bain, K. M., & Soble, J. R. (2019). Validation of the advanced clinical solutions word choice test (WCT) in a mixed clinical sample: establishing classification accuracy, sensitivity/specificity, and cutoff scores. Assessment, 26, 1320–1328.
Bain, K. M., Soble, J. R., Webber, T. A., Messerly, J. M., Bailey, K. C., Kirton, J. W., & McCoy, K. J. M. (2019). Cross-validation of three advanced clinical solutions performance validity tests: examining combinations of measures to maximize classification of invalid performance. Applied Neuropsychology: Adult.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examinations. The Clinical Neuropsychologist, 23, 729–741.
Boone, K. B. (2012). Clinical practice of forensic neuropsychology. New York: Guilford Press.
Boone, K., Lu, P., & Herzberg, D. S. (2002). The dot counting test manual. Los Angeles: Western Psychological Services.
Critchfield, E. A., Soble, J. R., Marceaux, J. C., Bain, K. M., Bailey, K. C., Webber, T. A., et al. (2019). Cognitive impairment does not cause performance validity failure: analyzing performance patterns among unimpaired, impaired, and noncredible participants across six tests. The Clinical Neuropsychologist, 6, 1083–1101.
Denning, J. H. (2012). The efficiency and accuracy of the Test of Memory Malingering trial 1, errors on the first 10 items of the Test of Memory malingering, and five embedded measures in predicting invalid test performance. Archives of Clinical Neuropsychology, 27(4), 417–432.
Erdodi, L. A. (2019). Aggregating validity indicators: the salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Applied Neuropsychology: Adult, 26(2), 155–172.
Erdodi, L. A., Kirsch, N. L., Lajiness-O’Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in a mixed clinical sample: are they equivalent? Psychological Injury and Law, 7, 255–263.
Fazio, R. L., Faris, A. N., & Yamout, K. Z. (2019). Use of the Rey 15-Item Test as a performance validity test in an elderly population. Applied Neuropsychology: Adult, 26, 28–35.
Gasquoine, P. G., Weimer, A. A., & Amador, A. (2017). Specificity rates for non-clinical, bilingual, Mexican Americans on three popular performance validity measures. The Clinical Neuropsychologist, 31(3), 587–597. https://doi.org/10.1080/13854046.2016.1277786.
Green, P. (2003). Green’s word memory test for windows: user’s manual. Edmonton: Green’s Publishing.
Green, P., Montijo, J., & Brockhaus, R. (2011). High specificity of the Word Memory Test and Medical Symptom Validity Test in groups with severe verbal memory impairment. Applied Neuropsychology, 18(2), 86–94.
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6(3), 218–224.
Grills, C. E., & Armistead-Jehle, P. (2016). Performance validity test and neuropsychological assessment battery screening module performances in an active-duty sample with a history of concussion. Applied Neuropsychology: Adult, 23(4), 295–301.
Larrabee, G. J. (2008). Aggregation across multiple indicators improves the detection of malingering: relationship to likelihood ratios. The Clinical Neuropsychologist, 22(4), 666–679.
Larrabee, G. J. (2014). False-positive rates associated with the use of multiple performance and symptom validity tests. Archives of Clinical Neuropsychology, 29(4), 364–373.
Larrabee, G. J., Rohling, M. L., & Meyers, J. E. (2019). Use of multiple performance and symptom validity measures: determining the optimal per test cutoff for determination of invalidity, analysis of skew, and inter-test correlations in valid and invalid performance groups. The Clinical Neuropsychologist, 1–19.
Lezak, M. D., Howieson, D. B., Bigler, E. D., & Tranel, D. (2012). Neuropsychological assessment (5th ed.). Oxford: Oxford University Press.
Lippa, S. M. (2018). Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature. The Clinical Neuropsychologist, 32(3), 391–421.
Loring, D. W., Goldstein, F. C., Chen, C., Drane, D. L., Lah, J. J., Zhao, L., & Larrabee, G. J. (2016). False-positive error rates for Reliable Digit Span and Auditory Verbal Learning Test performance validity measures in amnestic mild cognitive impairment and early Alzheimer disease. Archives of Clinical Neuropsychology, 31(4), 313–331.
Martin, P. K., Schroeder, R. W., Olsen, D. H., Maloy, H., Boettcher, A., Ernst, N., & Okut, H. (2019). A systematic review and meta-analysis of the Test of Memory Malingering in adults: two decades of deception detection. The Clinical Neuropsychologist. https://doi.org/10.1080/13854046.2019.1637027.
Martin, P. K., Schroeder, R. W., & Odland, A. P. (2015). Neuropsychologists’ validity testing beliefs and practices: a survey of North American professionals. The Clinical Neuropsychologist, 29(6), 741–776.
Meyers, J. E., Miller, R. M., Thompson, L. M., Scalese, A. M., Allred, B. C., Rupp, Z. W., Dupaix, Z. P., & Junghyun Lee, A. (2014). Using likelihood ratios to detect invalid performance with performance validity measures. Archives of Clinical Neuropsychology, 29(3), 224–235.
Meyers, J. E., & Volbrecht, M. (2003). A validation of multiple malingering detection methods in a large clinical sample. Archives of Clinical Neuropsychology, 18, 261–276.
Novitski, J., Steele, S., Karantzoulis, S., & Randolph, C. (2012). The repeatable battery for the assessment of neuropsychological status effort scale. Archives of Clinical Neuropsychology, 27, 190–195.
Pearson. (2009). Advanced clinical solutions for WAIS-IV and WMS-IV: clinical and interpretive manual. San Antonio: Pearson.
Poreh, A., Bezdicek, O., Korobkova, I., Levin, J. B., & Dines, P. (2016). The Rey Auditory Verbal Learning Test forced-choice recognition task: base-rate data and norms. Applied Neuropsychology: Adult, 23, 155–161.
Poynter, K., Boone, K. B., Ermshar, A., Miora, D., Cottingham, M., Victor, T. L., Ziegler, E., Zeller, M. A., & Wright, M. (2019). Wait, there’s a baby in this bath water! Update on quantitative and qualitative cut-offs for Rey 15-Item Recall and Recognition. Archives of Clinical Neuropsychology, 34, 1367–1380.
Rai, J. K., & Erdodi, L. A. (2019). Impact of criterion measures on the classification accuracy of TOMM-1. Applied Neuropsychology: Adult.
Rey, A. (1964). L’examen Clinique en psychologie. Paris: Presses Universitaires de France.
Schroeder, R. W., Martin, P. K., Heinrichs, R. J., & Baade, L. E. (2019). Research methods in performance validity testing studies: criterion grouping approach impacts study outcomes. The Clinical Neuropsychologist, 33, 466–477.
Schroeder, R. W., Twumasi-Ankrah, P., Baade, L. E., & Marshall, P. S. (2012). Reliable digit span: a systematic review and cross-validation study. Assessment, 19(1), 21–30.
Schwartz, E. S., Erdodi, L., Rodriguez, N., Ghosh, J. J., Curtain, J. R., Flashman, L. A., & Roth, R. M. (2016). CVLT-II forced choice recognition trial as an embedded validity indicator: a systematic review of the evidence. Journal of the International Neuropsychological Society, 22, 851–858.
Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. (2007). An effort index for the Repeatable Battery For The Assessment Of Neuropsychological Status (RBANS). The Clinical Neuropsychologist, 21(5), 841–854.
Slick, D. J., Sherman, E. M., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13(4), 545–561.
Slick, D. J., Tan, J. E., Strauss, E. H., & Hultsch, D. F. (2004). Detecting malingering: a survey of experts’ practices. Archives of Clinical Neuropsychology, 19(4), 465–473.
Tombaugh, T. N. (1996). Test of memory malingering (TOMM). North Tonawanda: Multi Health Systems.
Webber, T. A., Bailey, K. C., Alverson, W. A., Critchfield, E. A., Bain, K. M., Messerly, J. M., et al. (2018a). Further validation of the Test of Memory Malingering (TOMM) Trial 1: examination of false positives and convergence with other validity measures. Psychological Injury and Law, 11, 325–335.
Webber, T. A., Critchfield, E. A., & Soble, J. R. (2018b). Convergent, discriminant, and concurrent validity of non-memory-based performance validity tests. Assessment.
Webber, T. A., Marceaux, J. C., Critchfield, E. A., & Soble, J. R. (2018c). Relative impacts of mild and major neurocognitive disorder on rate of verbal learning acquisition. Archives of Clinical Neuropsychology.
Webber, T. A., & Soble, J. R. (2018). Utility of various WAIS-IV digit span indices for identifying noncredible performance validity among cognitively impaired and unimpaired examinees. The Clinical Neuropsychologist, 32(4), 657–670.
Wechsler, D. (2008). WAIS-IV: administration and scoring manual. San Antonio: Pearson.
Weinstein, S., Obuchowski, N. A., & Lieber, M. L. (2005). Clinical evaluation of diagnostic tests. American Journal of Roentgenology, 184(1), 141–149.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Disclaimer
The views expressed herein are those of the authors and do not necessarily reflect the views or the official policy of the Department of Veterans Affairs or US Government.
Cite this article
Soble, J.R., Alverson, W.A., Phillips, J.I. et al. Strength in Numbers or Quality over Quantity? Examining the Importance of Criterion Measure Selection to Define Validity Groups in Performance Validity Test (PVT) Research. Psychol. Inj. and Law 13, 44–56 (2020). https://doi.org/10.1007/s12207-019-09370-w