Main

Acute pain has been studied in single dose designs first proposed by Beecher and colleagues1,2 and formalised by Houde and Wallenstein.3 The problem with single trials is that while they can demonstrate statistical superiority of analgesic over placebo, variation because of random chance means that, if small, they provide a poor estimate of the size of the analgesic effect.4 Combining results from clinically homogeneous trials in a meta-analysis gives an accurate estimate of the extent of the analgesic effect when sufficient numbers of patients have been randomised.4,5

Clinical trials in acute pain normally last 4 to 6 hours, because that is the duration of effect of most analgesics, whether injected or as tablets, and whether simple analgesics, NSAIDs or opioids. Meta-analysis in acute pain has concentrated on the use of the area under the total pain relief versus time curve (TOTPAR), dichotomized into those patients who do or do not achieve at least half pain relief (at least 50% maximum TOTPAR).6 This measure is the one most frequently reported, and it avoids the problem of reporting continuous pain data as the mean of a highly skewed distribution.7 It has the benefit of being intuitively meaningful to patients and professionals, as well as being measurable.

Meta-analyses in acute pain usually combine studies from a variety of pain models, and relative efficacy of analgesics in these studies has been examined.6 A majority of studies were in third molar extraction, but any postoperative pain condition is likely to be included. In the largest dataset, that of aspirin,8 pain model (dental or other surgery) made no difference to the NNT.

Dentists ask, rightly, about relative efficacy in dental pain. The number of prescription items issued by dentists in England was remarkably consistent between 1998 and 2001 (Table 1), with ibuprofen, dihydrocodeine and paracetamol being most frequently prescribed. This review set out to examine single-dose oral analgesics after third molar extraction from a number of updated systematic reviews, both for the analgesics commonly prescribed in England, and for those for which comparable evidence exists, including the newer cyclo-oxygenase-2 selective inhibitors like rofecoxib, celecoxib and valdecoxib.

Table 1 Numbers of prescription items issued by dentists and dispensed in England

Methods

QUOROM guidelines were followed in all the systematic reviews.9 Studies for inclusion were sought by searching the Cochrane Library, Biological Abstracts, MEDLINE, PubMed and the Oxford Pain Relief database.10 Reference lists and review articles were examined for possible additional references. Most reviews had search dates in 2002.

References for the reviews are as follows:

Aspirin: Edwards et al., 1999;8 additional searching in 2002 found no new studies.

Celecoxib: An unpublished review being submitted as a Cochrane Review.

Diclofenac: An updated version of a Cochrane review.11

Dihydrocodeine: A Cochrane review.12

Ibuprofen: An updated version of a Cochrane review.11

Paracetamol: An updated version of a Cochrane review.13

Paracetamol plus codeine: An updated version of a Cochrane review.13

Rofecoxib: An updated version of a systematic review.14

Valdecoxib: A systematic review in preparation.

Criteria for inclusion for postoperative dental pain were: study in third molar extraction, full journal publication (except valdecoxib which included information from a poster), randomised controlled trials which included single dose treatment groups of oral analgesic and placebo, double blind design, baseline postoperative pain of moderate to severe intensity, patients over 15 years of age, at least 10 patients per group, and the pain outcome measures of total pain relief (TOTPAR) or summed pain intensity difference (SPID) over 4-6 hours or sufficient data to allow their calculation. Pain measures allowed for the calculation of TOTPAR or SPID were a standard five point pain relief scale (none, slight, moderate, good, complete), a standard four point pain intensity scale (none, mild, moderate, severe) or a standard visual analogue scale (VAS) for pain relief or pain intensity. For adverse events, the primary outcome sought was the proportion of patients experiencing any adverse event, with secondary outcomes of patients experiencing particular adverse events. Although adverse events are often reported inconsistently in acute pain trials,15 the outcome of any patient experiencing any adverse event was the most consistently reported.

Each report which could possibly be described as a randomised controlled trial was read independently by several authors and scored using a commonly-used three item, 1-5 score, quality scale.16 Consensus was then achieved. The maximum score of an included study was 5 and the minimum score was 2. Authors were not blinded because they already knew the literature. This scoring system takes account of randomisation, blinding, and withdrawals and drop outs. Trials that score 3 or more (less biased) have been shown repeatedly to have lower treatment effects than those scoring 2 or less.17,18

For each trial, mean TOTPAR, SPID, VAS-TOTPAR or VAS-SPID values for each treatment group were converted to %maxTOTPAR by dividing them by the calculated maximum value.19 The proportion of patients in each treatment group who achieved at least 50% maxTOTPAR was calculated using valid equations.20,21,22 The number of patients randomised was taken as the basis for calculations, to produce an intention to treat analysis. The number of patients with at least 50% maxTOTPAR was then used to calculate relative benefit and NNT for analgesic versus placebo. The same methods were used for adverse events, where the number needed to harm (NNH) was calculated.
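The conversion to %maxTOTPAR can be illustrated with a minimal sketch. It assumes the five point pain relief scale is scored 0 (none) to 4 (complete), which the text does not state, and the function names and the mean TOTPAR value are hypothetical. The later step from mean %maxTOTPAR to the proportion of patients with at least 50% maxTOTPAR relies on the published equations cited above and is not reproduced here.

```python
def max_totpar(scale_max: float, hours: float) -> float:
    """Largest possible TOTPAR: the top relief score held for the whole period."""
    return scale_max * hours


def percent_max_totpar(mean_totpar: float, scale_max: float, hours: float) -> float:
    """Express a group's mean TOTPAR as a percentage of the maximum possible."""
    return 100.0 * mean_totpar / max_totpar(scale_max, hours)


# Assumed scoring: relief scale 0-4, observed over 6 hours (maximum TOTPAR = 24).
# A hypothetical group mean TOTPAR of 12 is then 50% of the maximum.
example = percent_max_totpar(mean_totpar=12.0, scale_max=4, hours=6)
print(f"{example:.1f}% maxTOTPAR")
```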

Relative benefit and relative risk estimates were calculated with 95% confidence intervals using a fixed effects model.23 Heterogeneity tests were not used as they have previously been shown to be unhelpful,24,25 though homogeneity was examined visually.26 Publication bias was not assessed using funnel plots as these tests have been shown to be unhelpful.27,28 The number needed to treat or harm (NNT and NNH) with confidence intervals was calculated by the method of Cook and Sackett29 from the sum of all events and patients for treatment and placebo.
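As an illustration of the NNT calculation from pooled events and patients, the sketch below takes the reciprocal of the absolute risk difference and inverts the limits of its 95% confidence interval. The normal approximation for the interval, the function name and the example counts are assumptions made for illustration, not figures from the reviews.

```python
import math


def nnt_with_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (NNT, lower limit, upper limit) from pooled counts.

    The interval is the reciprocal of a normal-approximation interval
    for the absolute difference in event rates.
    """
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    diff = p_trt - p_ctl
    se = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)
    diff_low, diff_high = diff - z * se, diff + z * se
    if diff_low <= 0:
        # The interval for the difference includes zero, so no finite NNT interval exists.
        raise ValueError("difference not distinguishable from zero")
    # A larger difference means a smaller NNT, so the limits swap on inversion.
    return 1 / diff, 1 / diff_high, 1 / diff_low


# Hypothetical pooled data: 130 of 200 patients with at least half pain relief
# on analgesic, and 20 of 200 on placebo.
nnt, low, high = nnt_with_ci(130, 200, 20, 200)
print(f"NNT {nnt:.1f} ({low:.1f} to {high:.1f})")
```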

Relative benefit or risk was considered to be statistically significant when the 95% confidence interval did not include 1. NNT values were only calculated when the relative risk or benefit was statistically significant, and are reported with the 95% confidence interval. Calculations were performed using Microsoft Excel 2001 on a Power Macintosh G4.
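The significance rule can be sketched as follows. The interval here uses the common log-scale approximation for a ratio of two proportions from a single pooled table; that choice, the function names and the counts are assumptions for illustration and may differ from the exact fixed effects calculation used in the reviews.

```python
import math


def relative_benefit(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (relative benefit, lower limit, upper limit) on pooled counts."""
    rb = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(rb) for a ratio of two independent proportions.
    se_log = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    low = math.exp(math.log(rb) - z * se_log)
    high = math.exp(math.log(rb) + z * se_log)
    return rb, low, high


def is_significant(low, high):
    """Significant when the 95% confidence interval does not include 1."""
    return not (low <= 1.0 <= high)


# Hypothetical comparisons: a clear effect and a null result.
for counts in [(130, 200, 20, 200), (22, 100, 20, 100)]:
    rb, low, high = relative_benefit(*counts)
    verdict = "calculate NNT" if is_significant(low, high) else "no NNT reported"
    print(f"RB {rb:.1f} ({low:.1f} to {high:.1f}): {verdict}")
```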

Results

Table 2 shows the main results from 14,150 patients in 155 trials of 15 drug and dose combinations against placebo in third molar extractions. Of those 15 drug and dose combinations, only dihydrocodeine 30 mg could not be statistically distinguished from placebo because there were no trials with any useful information in third molar extraction. Figure 1 shows the proportion of patients achieving at least 50% pain relief with treatment. The smallest sample for any comparison was 136 patients for celecoxib 200 mg. Only five of the 14 comparisons had more than 1,000 patients, and seven had fewer than 500 patients. In all systematic reviews, the majority of trials had quality scores of 3 or more.

Table 2 Efficacy of analgesics after third molar extraction, from systematic reviews of randomised, double blind trials
Figure 1 The 95% confidence interval of the proportion of patients with at least half pain relief over 4–6 hours compared with placebo in third molar extraction trials

The lowest (best) NNTs were for NSAIDs and COX-2 inhibitors at standard or high doses. For these, NNTs could be as low as about 2 (meaning that two patients had to be treated with NSAID or COXIB for one of them to have an outcome of at least half pain relief that would not have occurred with placebo). Valdecoxib 20 mg and 40 mg, rofecoxib 50 mg, ibuprofen 400 mg and diclofenac 50 mg and 100 mg all had NNTs below 2.4. For all of them, about 60-70% of patients had at least half pain relief with active treatment compared with about 10% with placebo.

Paracetamol 975/1,000 mg, aspirin 600/650 mg and paracetamol 600/650 mg had NNTs of between 4 and 5. Fewer than 40% of patients with paracetamol at these doses had at least half pain relief with active treatment. With dihydrocodeine 30 mg only 16% of patients had at least half pain relief with active treatment in one small trial in dental pain (Table 2).

The adverse event outcome of a patient experiencing any adverse event is shown in Table 3, from 10,113 patients in 107 trials. Of the 15 drug and dose combinations, only paracetamol 600/650 mg plus codeine 60 mg could be statistically distinguished from placebo in 10 trials and 824 patients. The NNH was 5.3 (4.1 to 7.4), indicating that five patients had to be treated with paracetamol 600/650 mg plus codeine 60 mg for one of them to have an adverse event that would not have occurred with placebo. For all other drugs and doses there was no difference between analgesic and placebo.

Table 3 Patients experiencing any adverse event with analgesics after third molar extraction, from systematic reviews of randomised, double blind trials

Discussion

Systematic review and meta-analysis both depend on two interdependent criteria for them to make sense: the quality of the component individual studies, and the total size of the sample available for analysis.

We know that if trials are of poor reporting quality,17,18 or not randomised,30 or not blind, or both,31 then the tendency is to over-estimate the benefits of treatment. The reviews included here all demanded that trials should be both randomised and double-blind as a minimum requirement for inclusion.

We also know that even if trials are done well, small sample size can lead to an incorrect answer just because of the random play of chance.4 For these studies we also know just how much information is needed to be 95% confident of an NNT to within ±0.5 units.4 With an NNT of 2.3 it is 400 patients in the comparison, with an NNT of 2.9 it is 1,000 patients, and with an NNT of 4.2 it is many more than 1,000. At NNTs of 4 or more, even with 1,000 patients fewer than 75% of trials will be within ±0.5 of the overall NNT.
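The effect of sample size on the precision of an NNT can be illustrated with a small simulation. This is a sketch under stated assumptions, not the method of the cited work: it assumes a placebo response of 10%, a true NNT of 2.3, equal group sizes and simple binomial sampling, and the function name and seed are hypothetical.

```python
import random


def share_within_half_unit(p_trt, p_ctl, n_per_arm, n_sims=2000, seed=1):
    """Fraction of simulated trials whose NNT falls within 0.5 of the true NNT."""
    rng = random.Random(seed)
    true_nnt = 1.0 / (p_trt - p_ctl)
    hits = 0
    for _ in range(n_sims):
        # Draw the number of responders in each arm of one simulated trial.
        ev_trt = sum(rng.random() < p_trt for _ in range(n_per_arm))
        ev_ctl = sum(rng.random() < p_ctl for _ in range(n_per_arm))
        diff = (ev_trt - ev_ctl) / n_per_arm
        # Trials with no benefit over placebo have no finite NNT and count as misses.
        if diff > 0 and abs(1.0 / diff - true_nnt) <= 0.5:
            hits += 1
    return hits / n_sims


# Assumed rates: 10% response with placebo, 53.5% with analgesic (true NNT about 2.3).
for n_per_arm in (20, 200):
    share = share_within_half_unit(0.535, 0.10, n_per_arm)
    print(f"{2 * n_per_arm} patients: {share:.0%} of trials within 0.5 of the true NNT")
```

Under these assumptions, comparisons with 40 patients miss the true NNT by more than 0.5 units far more often than comparisons with 400 patients.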

The analgesics for which these two criteria were met unequivocally were valdecoxib (combining 20 mg and 40 mg), rofecoxib 50 mg, ibuprofen 400 mg, diclofenac 50 mg and probably ibuprofen 200 mg. For paracetamol the numbers were borderline, and for diclofenac 100 mg too small to make any safe judgement. There is good evidence of good efficacy for the analgesics most commonly prescribed by dentists, with the exception of dihydrocodeine where there was little evidence in total, and no convincing evidence of efficacy.

With the information available, standard doses of NSAIDs and COX-2 inhibitors provided the best analgesia (Fig. 2). NNTs of 2 and below are indicative of very effective medicines.32 The indirect comparisons that allow us to arrive at this conclusion are only sustainable because the trials have the same design, use patients with the same entry criterion (moderate or severe pain intensity), with standard measurements made in the same way over the same period of time, and with the same output from each trial, and one known to be legitimate. The validity of the indirect comparisons is buttressed by the dose response of two doses of ibuprofen (400 mg was better than 200 mg) and two doses of paracetamol (975/1,000 mg was better than 600/650 mg) where there were credible amounts of information. A systematic review of ibuprofen versus paracetamol in dental studies also concluded that ibuprofen was superior, concordant with the indirect comparison.33

Figure 2 The 95% confidence interval of the number needed to treat (NNT) for at least half pain relief over 4–6 hours compared with placebo in third molar extraction trials

The adverse event information we have tells us only about patients experiencing any adverse event. With the amount of information available, it appears that only higher doses of codeine with paracetamol resulted in a significantly higher rate for this outcome than placebo. In Table 3 the rate at which this adverse event occurred with placebo varied greatly, between 2% and 52%. This variation will be due partly to small sample sizes,4 but also to the fact that methods of collecting adverse event data impact significantly on the reported incidence, and that the methods used varied.15

Information about specific adverse events is even more difficult to obtain, and very large data sets are required to produce information about, for instance, gastric irritation with aspirin use.8 Of interest to dentists might be the rate of alveolitis or dry socket. This is reported in some of the newer COXIB studies, but not in older studies. There is just too little information to make a judgement.

What these comparisons do not do is to tell dentists what to prescribe. They, and the products of other systematic reviews, should not be used as rules, but rather as evidence-based tools to help make better policy decisions, and decisions about individual patients. Present prescribing practice (Table 1) shows that, for the most part, effective and safe analgesics are being used in 80% of prescriptions. The exception is prescribing dihydrocodeine 30 mg (20% of prescriptions), for which we lack single dose evidence of efficacy in dental surgery, and which could not be distinguished from placebo in other conditions, again with little data.6

Not all the analgesics in this review are presently available for prescribing by dentists, at least in the UK. The information on efficacy, on harm, and on the amount of information available should be useful in any initiative to develop a prescribing formulary in dentistry, especially as we have growing confidence in the value of indirect comparisons.34

Competing Interests

RAM has been a consultant for MSD. RAM & HJM have consulted for various pharmaceutical companies. RAM, HJM & JE have received lecture fees from pharmaceutical companies related to analgesics and other healthcare interventions. All authors have received research support from charities, government and industry sources at various times, but no such support was received for this work. No author has any direct stock holding in any pharmaceutical company.

Authors' Contributions

JB was involved with searching, data extraction, quality scoring, analysis and writing. JE was involved with searching, data extraction, quality scoring and writing. HJM was involved in analysis and writing. RAM was involved in data extraction, quality scoring, analysis and writing. PW was involved with the original concept, with researching prescribing data, and with writing.