Main

We assign levels of evidence to each main Summary published in Evidence-based Dentistry, with the exception of guidelines, which contain a mix of levels and therefore present more of a challenge. The system we use in the journal is based on that employed by the Oxford Centre for Evidence-based Medicine (OCEBM), as shown in Table 1.

Table 1 Simplified version of the Oxford Centre for Evidence-based Medicine levels of evidence table*

The level of evidence we assign is highlighted using our evidence graphic (Figure 1). We will continue to use this system for the present, but it is worth mentioning some of the work that has taken place in the area over the past few years that may change the way we assign levels of evidence in the journal.

Figure 1

The Evidence-based Dentistry evidence graphic.

One of the first attempts to explicitly characterise a hierarchy of evidence was made by the Canadian Task Force on the Periodic Health Examination in 1979,1 to link their healthcare recommendations with the strength of underlying evidence. Since then, Holger et al.2 have identified more than 100 other groups that have used various systems of codes to communicate grades of evidence and recommendations. Glasziou and colleagues3 subsequently identified five issues that they believed should be addressed when looking at alternative approaches to identifying reliable evidence:

  • Different types of question require different types of evidence. For example, randomised controlled trials can give good estimates of treatment effects but poor estimates of prognosis.

  • Systematic reviews are preferable. Studies, with rare exceptions, should not be interpreted in isolation, so pooling of study findings using standardised methods is desirable.

  • Level alone should not be used to grade evidence. Although this approach helps to justify study selection, a number of disadvantages were identified by these authors, eg, levels may mean different things to different readers, and novel or hybrid approaches are not easily accommodated. This can lead to anomalous rankings, where a systematic review (usually the highest level) that is based on a few small poor quality trials might be placed above a large, well-conducted, multicentre trial.

  • What if there are no systematic reviews? Systematic reviews are available for only a small number of topics, so whatever evidence is found should be clearly described.

  • Balanced assessment should draw on a variety of research. Even if the effectiveness of a particular treatment is supported by good systematic evidence, data about potential harms are likely to come from cohort or case–control studies: risk–benefit assessments thus need to draw on a variety of research types.

These authors suggested that there were two broad options to address these concerns: to extend and improve existing hierarchies, or to abolish evidence hierarchies and levels of evidence and concentrate instead on teaching practitioners general principles of research, so that they can use these principles to appraise the quality and relevance of particular studies. I would suggest that both are necessary.

In 2004, the GRADE (Grading of Recommendations Assessment, Development and Evaluation) working group published a critical appraisal of the six most prominent systems for grading levels of evidence and strength of recommendations,4 as follows:

  • The American College of Chest Physicians

  • Australian National Health and Medical Research Council

  • OCEBM

  • Scottish Intercollegiate Guidelines Network

  • US Preventive Services Task Force

  • US Task Force on Community Preventive Services

The working group found that there was poor agreement about the sensibility of the systems; all of the systems were considered to have important shortcomings when attempting to grade levels of evidence and the strength of clinical recommendations. There was agreement that the OCEBM system worked well for all four types of question (effectiveness, harm, diagnosis and prognosis) considered in the appraisal, although it was not without its faults.

This critical appraisal examined both the way these six systems rank the evidence and how they then grade the strength of clinical recommendations. A number of key conclusions were drawn, and a new scheme proposed. The GRADE group has since developed this into a new rating of the quality of evidence and strength of recommendations (Table 2).5, 6

Table 2 GRADE: quality of evidence and definitions

The GRADE approach to linking evidence and clinical recommendations has much to recommend it, and it is likely to become an important system in the future, particularly in guideline development. There are, of course, differences between the role of this journal and guideline development: Evidence-based Dentistry identifies good quality articles and provides a commentary from a practitioner working in the area, whereas guidelines (particularly the better ones) are developed by a group that includes a number of topic-specific and methodology experts. Guideline groups are likely to have access to a very wide knowledge base and are thus well placed to apply the GRADE definitions effectively; more so than the smaller number of people employed in developing and preparing summaries for this journal. Consequently, we will continue to rate studies individually using the OCEBM approach (Table 1) for the foreseeable future. Readers who would like more information on GRADE can find it on the group's website (www.gradeworkinggroup.org).