Background
Methods
Identification of studies
Eligibility criteria
Search strategy
Search terms
Study selection
Data extraction and analysis
Results
Classification of tools according to the assessment of EBM practice
| Tool | Ask | Acquire | Appraise | Apply | Assess |
|---|---|---|---|---|---|
| Taylor’s questionnaire [14] | | Yes | Yes | | |
| Berlin [7] | | | Yes | | |
| Fresno [6] | Yes | Yes | Yes | | |
| ACE [15] | Yes | Yes | Yes | Yes | |
| Utrecht questionnaire U-CEP [16] | Yes | | Yes | Yes | |
| MacRae examination [17] | | | Yes | | |
| EBM test [18] | Yes | Yes | Yes | | |
| Educational prescription [19] | Yes | Yes | Yes | Yes | |
| Mendiola MCQ [20] | | | Yes | | |
| Tudiver OSCE [21] | Yes | Yes | Yes | | |
| Frohna’s OSCE [22] | Yes | Yes | Yes | Yes | |
| BACES [23] | | | Yes | | |
Classification of tools according to the educational outcome domains measured
Outcome domains assessed by the twelve EBM instruments:

| Tool | Reaction to EBM teaching | Attitude | Self-efficacy | Knowledge | Skills | Behaviours | Patient benefit |
|---|---|---|---|---|---|---|---|
| Taylor’s questionnaire | | Yes | | Yes | | | |
| Berlin | | | | Yes | Yes | | |
| Fresno | | | | Yes | Yes | | |
| ACE | | | | Yes | Yes | | |
| Utrecht questionnaire U-CEP | | | | Yes | | | |
| MacRae examination | | | | Yes | Yes | | |
| EBM test | | | | Yes | Yes | | |
| Educational prescription | | | | Yes | Yes | | |
| Mendiola | | | | Yes | | | |
| Tudiver OSCE | | | | | Yes | | |
| Frohna’s OSCE | | | | | Yes | | |
| BACES | | | | Yes | Yes | | |
Quality of EBM tools and taxonomy
| Tool | Content validity | Interrater reliability | Internal validity | Responsive validity | Discriminative validity | Construct validity | Internal reliability (ITC) | External validity |
|---|---|---|---|---|---|---|---|---|
| Taylor’s questionnaire [14] | Yes | | Yes | Yes | Yes | | | |
| Berlin [7] | Yes | | Yes | Yes | Yes | | | |
| Fresno [6] | Yes | Yes | Yes | | Yes | | | |
| ACE [15] | Yes | Yes | Yes | Yes | Yes | | | |
| Utrecht questionnaire [16] | Yes | | Yes | Yes | Yes | Yes | Yes | Yes |
| MacRae [17] | Yes | Yes | Yes | | Yes | Yes | | |
| Source instrument name and date | Instrument development: number of participants, level of expertise | EBM learning domains | Instrument description | EBM steps | Psychometric properties with results of validity and reliability assessment |
|---|---|---|---|---|---|
| Berlin questionnaire, Fritsche [7] | 266 participants: 43 experts in evidence-based medicine, 20 controls (medical students) and 203 participants in an evidence-based medicine course (USA) | Knowledge and skills | The Berlin questionnaire was developed to measure basic knowledge about interpreting evidence from healthcare research, skills in relating a clinical problem to a clinical question and identifying the best design to answer it, and the ability to use quantitative information from published research to solve specific patient problems. The questions are built around clinical scenarios, in two separate sets of 15 multiple-choice questions focusing mainly on epidemiological knowledge and skills (scores range from 0 to 15) | Appraise | Content validity; internal validity; responsive validity; discriminative validity. The two sets of questions were psychometrically equivalent: intraclass correlation coefficient for students and experts 0.96 (95% confidence interval 0.92 to 0.98, p < 0.001); Cronbach’s alpha 0.75 for set 1 and 0.82 for set 2. Ability to discriminate between groups with different levels of knowledge was shown by comparing the three groups of varying expertise: the mean scores of controls (4.2 (2.2)), course participants (6.3 (2.9)) and experts (11.9 (1.6)) differed significantly (analysis of variance, p < 0.001) |
| Fresno test, Ramos et al. [6] | Family practice residents and faculty members (n = 43); volunteers self-identified as experts in EBM (n = 53); family practice teachers (n = 19) (USA) | Knowledge and skills | The Fresno test was developed and validated to assess medical professionals’ knowledge and skills. It consists of two clinical scenarios with 12 open-ended questions, scored with standardised grading rubrics; calculation skills are assessed by fill-in-the-blank questions | Ask, acquire and appraise | Content validity (expert opinion); interrater reliability; internal validity; discriminative validity. Interrater correlations ranged from 0.76 to 0.98 for individual items; Cronbach’s alpha was 0.88; ITC ranged from 0.47 to 0.75. Item difficulties ranged from moderate (73%) to difficult (24%); item discrimination ranged from 0.41 to 0.86. Construct validity: on the 212-point test, the novice mean was 95.6 and the expert mean 147.5 (p < 0.001) |
| MacRae examination [17] | Residents in the University of Toronto General Surgery Program (n = 44) (Canada) | Knowledge and skills | The examination consists of three articles, each followed by a series of short-answer questions and 7-point rating scales to assess study quality | Appraise | Content validity; interrater reliability; internal validity; discriminative validity; construct validity. Cronbach’s alpha 0.77. Interrater reliability (Pearson product-moment correlation coefficient): 0.91 between clinical epidemiologist and non-epidemiologist, 0.78 between clinical epidemiologist and nurse. Construct validity was assessed by comparing scores of those who attended a journal club versus those who did not, and by postgraduate year of training (p = 0.02) |
| Taylor’s questionnaire [14]; Bradley et al. [24] | 4 groups of healthcare professionals (n = 152) with varying degrees of EBP expertise (UK): group 1, no or little prior EBP education; group 2, undertook a CASP workshop within the last 4 weeks; group 3, undertook a CASP workshop within the last 12 months; group 4, academics currently teaching EBP who attended the 1997 Oxford CEBM workshop. Later, Bradley et al. used it with 175 medical students in an RCT of self-directed versus workshop-based EBP curricula (Norway) | Knowledge and attitudes | Questionnaire of 11 MCQs (true / false / do not know); correct responses given 1, incorrect responses scored 1, “do not know” scored 0 | Acquire and appraise | Content validity; internal validity; responsive validity; discriminative validity. Cronbach’s alpha 0.72 for knowledge and 0.64 for attitude questions. Spearman’s correlations (internal consistency) for total knowledge and attitude scores ranged from 0.12 to 0.66. Discriminative validity shown between novices and experts; responsiveness (instrument able to detect change) demonstrated |
| ACE tool, Ilic et al. [15] | 342 medical students: 98 EBM-novice, 108 EBM-intermediate and 136 EBM-advanced participants (Australia) | Knowledge and skills | The Assessing Competency in EBM (ACE) tool was developed and validated to evaluate medical trainees’ EBM competency across knowledge, skills and attitudes: 15 items with dichotomous outcome measures; items 1 and 2, asking an answerable question; items 3 and 4, searching the literature; items 5–11, critical appraisal; items 12–15, applying evidence to the patient scenario (step 4) | Ask, acquire, appraise and apply | Content validity; interrater reliability; internal validity; responsive validity; discriminative validity. Construct validity: statistically significant linear trend of sequentially improved mean score corresponding to level of training (p < 0.0001). Item difficulty ranged from 36 to 84%; internal reliability ranged from 0.14 to 0.20; item discrimination ranged from 0.37 to 0.84; Cronbach’s alpha for internal consistency was 0.69 |
| Utrecht questionnaire (U-CEP), Kortekaas et al. [16] (original questionnaire in Dutch; an English version is now available) | Postgraduate GP trainees (n = 219), hospital trainees (n = 20), GP supervisors (n = 20), academic GPs or clinical epidemiologists (n = 8) (Netherlands) | Knowledge | Utrecht questionnaire on knowledge of clinical epidemiology (U-CEP): two sets of 25 questions and a combined set of 50 | Ask, appraise and apply | Content validity; internal validity; responsive validity; discriminative validity. Content validity: expert opinion and survey. Construct validity: significant difference in mean score between experts, trainees and supervisors. Internal consistency: Cronbach’s alpha 0.79 for set A, 0.80 for set B and 0.89 for the combined set. Responsive validity: significantly higher mean scores after EBM training than before. Internal reliability: ITC (Pearson product-moment), median 0.22 for set A, 0.26 for set B and 0.24 for the combined set. Item discrimination: median 0.35 for set A, 0.43 for set B and 0.37 for the combined set |
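Several rows above report item difficulty and item discrimination (e.g. Fresno, ACE, U-CEP). In classical test theory, difficulty is simply the proportion answering an item correctly, and one common variant of the discrimination index contrasts upper- and lower-scoring groups. A hedged sketch (the reviewed papers do not specify which variant they used; the 27% split below is a conventional choice, not taken from the source):

```python
def item_difficulty(responses):
    """Proportion of examinees answering the item correctly.

    responses: list of 0/1 scores for one item.
    """
    return sum(responses) / len(responses)

def discrimination_index(item, totals, frac=0.27):
    """Upper-minus-lower discrimination index for one item.

    item:   list of 0/1 scores for the item
    totals: matching list of each examinee's total test score
    frac:   fraction of examinees in each extreme group (27% is conventional)
    """
    n = max(1, round(frac * len(item)))
    order = sorted(range(len(item)), key=lambda i: totals[i])  # ascending by total
    p_low = sum(item[i] for i in order[:n]) / n    # proportion correct, bottom group
    p_high = sum(item[i] for i in order[-n:]) / n  # proportion correct, top group
    return p_high - p_low
```

An item answered correctly only by high scorers approaches an index of 1; a negative index flags a problematic item.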
| | Assessment category | Type of assessment | Ask | Search | Appraise | Integrate | Evaluate |
|---|---|---|---|---|---|---|---|
| 7 | Benefits to patients | Patient-oriented outcomes | | | | | |
| 6 | Behaviours | Activity monitoring | | | | | |
| 5 | Skills | Performance assessment | Fresno, ACE | Fresno, ACE | Berlin, Fresno, ACE, MacRae | ACE | |
| 4 | Knowledge | Cognitive testing | Fresno, ACE, U-CEP | Fresno, ACE, Taylor’s | Taylor’s, Berlin, Fresno, ACE, U-CEP, MacRae | ACE, U-CEP | |
| 3 | Self-efficacy | Self-report/opinion | | | | | |
| 2 | Attitudes | | | Taylor’s | Taylor’s | | |
| 1 | Reaction to the educational experience | | | | | | |
| Source instrument name and date | Instrument development: number of participants, level of expertise | EBM learning domains | Instrument description | EBM steps | Psychometric properties with results of validity and reliability assessment |
|---|---|---|---|---|---|
| Educational prescription, Feldstein et al. [19] | 20 residents | Knowledge and skills | Educational prescription (EP): a web-based tool that guides learners through the four As of EBM. Learners use the EP to define a clinical question, document a search strategy, appraise the evidence, report the results and apply the evidence to the particular patient | Asking, acquiring, appraising and applying | Predictive validity; interrater reliability. Interrater reliability on the 20 EPs showed fair agreement for question formation (k = 0.22), moderate agreement for overall competence (k = 0.57) and evaluation of evidence (k = 0.44), and substantial agreement for searching (k = 0.70) and application of evidence (k = 0.72) |
| BACES, Barlow et al. [23] | Postgraduate medical trainees/residents (n = 150) | Knowledge and skills | Biostatistics and Clinical Epidemiology Skills (BACES) assessment for medical residents: 30 multiple-choice questions written to focus on interpreting clinical epidemiological and statistical methods | Appraise (interpreting clinical epidemiology and statistical methods) | Content validity was assessed through a four-person expert review. Item response theory (IRT) makes it flexible to use subsets of questions for other cohorts of residents (novice, intermediate and advanced); 26 items fit a two-parameter logistic IRT model and correlated well with their comparable classical test theory (CTT) values |
| EBM test, Feldstein et al. [18] | 48 internal medicine residents | Knowledge and skills | EBM test: 25 MCQs covering seven EBM focus areas: (a) asking clinical questions, (b) searching, (c) EBM resources, (d) critical appraisal of therapeutic and diagnostic evidence, (e) calculating ARR, NNT and RRR, (f) interpreting diagnostic test results and (g) interpreting confidence intervals | Asking, acquiring and appraising | Construct validity; responsive validity. EBM experts scored significantly higher on the EBM test than PGY-1 residents (p < 0.001), who in turn scored higher than first-year students (p < 0.004). Responsiveness of the test was also demonstrated with 16 practising clinicians: the mean difference in fellows’ pre-test to post-test EBM scores was 5.8 points (95% CI 4.2 to 7.4) |
| Frohna’s OSCE [22] | Medical students (n = 26) who piloted a paper-based version; a web-based station was then developed for full implementation (n = 140) | Skills | A web-based 20-minute OSCE with a specific case scenario in which students ask a structured clinical question, generate effective MEDLINE search terms and select the most appropriate of 3 abstracts | Ask, acquire, appraise and apply | Face validity (literature review and expert consensus); interrater reliability. Between three scorers there was good interrater reliability, with 84, 94 and 96% agreement (k = 0.64, 0.82 and 0.91) |
| Tudiver OSCE [21] | First- and second-year residents | Skills | OSCE stations | Ask, acquire, appraise and apply | Content validity; construct validity (p = 0.43); criterion validity (p < 0.001); interrater reliability (ICC 0.96); internal reliability (Cronbach’s alpha 0.58) |
| Mendiola MCQ [20] | Fifth-year medical students | Knowledge | MCQ test (100 questions) | Appraise | Reliability of the MCQ: Cronbach’s alpha 0.72 in the M5 group and 0.83 in the M6 group. Effect size (Cohen’s d) for the main knowledge-score comparison, M5 EBM versus M5 non-EBM, was 3.54 |
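The agreement statistics (k) reported for the educational prescription and Frohna’s OSCE, and the Cohen’s d reported for the Mendiola MCQ, are likewise standard computations. A minimal sketch, assuming two raters giving categorical scores and two independent groups of test scores (the function names are ours, not from the reviewed studies):

```python
import statistics

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    p_exp = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    return (p_obs - p_exp) / (1 - p_exp)

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using a pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd
```

On the usual verbal scale, kappa values of 0.21–0.40 are "fair", 0.41–0.60 "moderate" and 0.61–0.80 "substantial", which is how the agreement figures in the table above are labelled.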