Broadening Perspectives on Clinical Performance Assessment: Rethinking the Nature of In-training Assessment

Advances in Health Sciences Education

Abstract

Context

In-training assessment (ITA), defined as multiple assessments of performance in the setting of day-to-day practice, is an invaluable tool in assessment programmes that aim to assess professional competence in a comprehensive and valid way. Research on clinical performance ratings, however, consistently shows weaknesses in accuracy, reliability and validity. Attempts to improve the psychometric characteristics of ITA by focusing on standardisation and objectivity of measurement have thus far resulted in only limited improvement of ITA practices.

Purpose

The aim of the paper is to demonstrate that the psychometric framework may limit more meaningful educational approaches to performance assessment, because it does not take into account key issues in the mechanics of the assessment process. Based on insights from other disciplines, we propose an approach to ITA that takes a constructivist, social-psychological perspective and integrates elements of theories of cognition, motivation and decision making. A central assumption in the proposed framework is that performance assessment is a judgment and decision-making process, in which rating outcomes are influenced by interactions between individuals and the social context in which assessment occurs.

Discussion

The issues raised in the article and the proposed assessment framework have a number of implications for current performance assessment practice. It is argued that focusing on the context of performance assessment may be more effective in improving ITA practices than focusing strictly on raters and rating instruments. Furthermore, the constructivist approach to assessment has important implications for assessment procedures as well as for the evaluation of assessment quality. Finally, it is argued that further research into performance assessment should contribute to a better understanding of the factors that influence rating outcomes, such as rater motivation, assessment procedures and other contextual variables.

Acknowledgements

The authors would like to thank Mereke Gorsira for critically reading and correcting the English manuscript.

Author information

Corresponding author

Correspondence to Marjan J. B. Govaerts.

About this article

Cite this article

Govaerts, M.J.B., van der Vleuten, C.P.M., Schuwirth, L.W.T. et al. Broadening Perspectives on Clinical Performance Assessment: Rethinking the Nature of In-training Assessment. Adv Health Sci Educ Theory Pract 12, 239–260 (2007). https://doi.org/10.1007/s10459-006-9043-1

  • DOI: https://doi.org/10.1007/s10459-006-9043-1
