Broadening Perspectives on Clinical Performance Assessment: Rethinking the Nature of In-training Assessment

Govaerts, Marjan J. B.; van der Vleuten, Cees P. M.; Schuwirth, Lambert W. T.; Muijtjens, Arno M. M.

doi:10.1007/s10459-006-9043-1

Published: 10 November 2006

Volume 12, pages 239–260 (2007)

Advances in Health Sciences Education

Marjan J. B. Govaerts¹,
Cees P. M. van der Vleuten¹,
Lambert W. T. Schuwirth¹ &
Arno M. M. Muijtjens¹

Abstract

Context

In-training assessment (ITA), defined as multiple assessments of performance in the setting of day-to-day practice, is an invaluable tool in assessment programmes which aim to assess professional competence in a comprehensive and valid way. Research on clinical performance ratings, however, consistently shows weaknesses concerning accuracy, reliability and validity. Attempts to improve the psychometric characteristics of ITA by focusing on standardisation and objectivity of measurement have thus far resulted in limited improvement of ITA practices.

Purpose

The aim of the paper is to demonstrate that the psychometric framework may limit more meaningful educational approaches to performance assessment, because it does not take into account key issues in the mechanics of the assessment process. Based on insights from other disciplines, we propose an approach to ITA that takes a constructivist, social-psychological perspective and integrates elements of theories of cognition, motivation and decision making. A central assumption in the proposed framework is that performance assessment is a judgment and decision-making process, in which rating outcomes are influenced by interactions between individuals and the social context in which assessment occurs.

Discussion

The issues raised in the article and the proposed assessment framework bring forward a number of implications for current performance assessment practice. It is argued that focusing on the context of performance assessment may be more effective in improving ITA practices than focusing strictly on raters and rating instruments. Furthermore, the constructivist approach towards assessment has important implications for assessment procedures as well as the evaluation of assessment quality. Finally, it is argued that further research into performance assessment should contribute towards a better understanding of the factors that influence rating outcomes, such as rater motivation, assessment procedures and other contextual variables.

Acknowledgements

The authors would like to thank Mereke Gorsira for critically reading and correcting the English manuscript.

Author information

Authors and Affiliations

Department of Educational Development and Research, Faculty of Medicine, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, The Netherlands
Marjan J. B. Govaerts, Cees P. M. van der Vleuten, Lambert W. T. Schuwirth & Arno M. M. Muijtjens

Corresponding author

Correspondence to Marjan J. B. Govaerts.

About this article

Cite this article

Govaerts, M.J.B., van der Vleuten, C.P.M., Schuwirth, L.W.T. et al. Broadening Perspectives on Clinical Performance Assessment: Rethinking the Nature of In-training Assessment. Adv Health Sci Educ Theory Pract 12, 239–260 (2007). https://doi.org/10.1007/s10459-006-9043-1

Received: 28 September 2005
Accepted: 02 October 2006
Published: 10 November 2006
Issue Date: May 2007
DOI: https://doi.org/10.1007/s10459-006-9043-1
