Background
In recent years, there has been increasing evidence from large randomized trials and systematic reviews showing that patients receiving acupuncture report better outcomes than patients receiving no treatment or usual care only (for example, [1, 2]). A large trial on low back pain [3] and a meta-analysis of migraine trials [4] even found superiority over guideline-oriented conventional care. At the same time, many recent high-quality trials comparing true acupuncture with a sham acupuncture intervention found only minor or even no differences (see [4-7] for systematic reviews). The interpretation of this evidence is controversial. Some authors argue that the better effects over no treatment and usual care are only due to the usual placebo effects and bias [8]. Others argue that most sham acupuncture interventions are physiologically active [9, 10], and still others contend that sham acupuncture interventions might be associated with particularly potent nonspecific or placebo effects [11, 12].
Treatment effects are considered specific if they are attributable solely, according to the theory of the mechanism of action, to the characteristic component of an intervention [13, 14]. Effects which are associated with the incidental elements of an intervention are considered nonspecific effects (synonymous with placebo effects). Nonspecific effects are mostly thought to be due to psychobiological processes triggered by the overall therapeutic context [15]. They have to be distinguished from the natural course of disease, regression to the mean, effects of being in a study, cointerventions and, as far as possible, from reporting and other biases [16, 17]. The total effect of an intervention consists of both specific and nonspecific effects [18].
Separating characteristic and incidental elements of an intervention is straightforward in pharmacology, but is difficult in other interventions such as psychotherapy [19]. Acupuncture involves the insertion of needles into defined points of the body and their manipulation. While a variety of mechanistic models exist, the exact mechanism of action is unclear [20]. This makes it difficult to devise a placebo intervention which is both inert and indistinguishable and which reliably separates specific and nonspecific effects. The frequent use of the term "sham intervention" instead of "placebo" partly reflects this problem. Sham interventions in clinical trials of acupuncture typically vary from "true" acupuncture in one or both of the following aspects [21]: location of points (for example, stimulation of nonindicated points or outside known points) and skin penetration (for example, use of fixed telescope "placebo" needles with a blunt tip). If some or most of these sham interventions are indeed physiologically active, such trials would not compare acupuncture to a placebo but to an active intervention, making it more difficult to detect significant differences.
This problem would also apply if (sham) acupuncture were associated with more potent placebo effects than other interventions. Both invasive and noninvasive sham acupuncture interventions exert (like true acupuncture) mild painful stimuli. It has been hypothesized that such interventions might trigger enhanced placebo effects by simultaneously acting on sensory, cognitive and emotional levels [12]. There is also evidence that the same sham acupuncture intervention can have quite different effects when provided in different contexts [22]. Placebo research indicates that in many situations, the therapeutic context associated with an intervention matters more than the placebo intervention itself [15]. The therapeutic context depends not only on the specific therapeutic ritual applied but also on experiences, attitudes and preferences of patients and providers, the patient-provider interaction, the setting and the cultural background [11]. Given the positive attitudes and expectations toward complementary therapies, it seems possible that complex rituals such as acupuncture could provoke significant psychobiological responses.
The most straightforward way to investigate whether sham acupuncture is associated with larger effects than a pharmacological placebo would be in randomized trials including both these interventions. The only trial using such an approach indeed found a significant superiority of sham acupuncture [23]. Another, albeit methodologically weaker, possibility is to compare differences between sham acupuncture interventions and no-treatment control groups in acupuncture trials with those between (other) placebos and no-treatment control groups in other trials. Hróbjartsson and Gøtzsche [24-26] have repeatedly reviewed all available trials that included both a placebo or sham group and a no-treatment group for any condition. The latest update of their Cochrane review includes a total of 234 trials. In a preplanned subgroup analysis, they found that studies using "physical placebos" (including sham acupuncture) reported larger placebo effects (standardized mean difference (SMD) -0.31; 95% confidence interval (CI) -0.41, -0.22) than studies using "pharmacological placebos" (SMD -0.10; 95% CI -0.20, -0.01) [26]. In a reanalysis of their data, we separated the trials in which the physical placebo was sham acupuncture from those which used other physical placebos. Effect sizes were significantly larger in trials using sham acupuncture than in trials using other physical placebos (SMDs -0.41 (-0.56, -0.24) vs -0.26 (-0.37, -0.15); P = 0.007) [27].
The Cochrane review [26] and our reanalysis of these data did not include a number of recent rigorous, large acupuncture trials which included both a sham group and a no-treatment group. Furthermore, these reviews did not investigate whether large nonspecific effects might make it difficult to detect specific effects. Therefore, we have performed a systematic review of acupuncture trials in any condition including both sham and no-treatment groups published through April 2010. Our primary aim was to investigate the size of nonspecific effects of acupuncture (difference between sham acupuncture and no acupuncture). Our secondary aims were to investigate factors (such as type of sham intervention, condition, study quality or intensity of cointerventions) possibly influencing the size of such nonspecific effects and to quantify specific effects (difference between acupuncture and sham acupuncture) and total effects of acupuncture (difference between acupuncture and no acupuncture) in the included trials.
Methods
Selection criteria
To be included, studies had to meet the following criteria: (1) allocation to groups was explicitly randomized; (2) participants were persons treated for any illness or for preventative purposes (trials in healthy volunteers measuring physiological outcomes were excluded); (3) the intervention involved the insertion of needles, described as acupuncture, at acupuncture points, pain points or trigger points, with or without stimulation (trials on interventions without skin penetration, for example, laser acupuncture, were excluded); (4) the sham intervention was described as a sham, placebo, dummy or fake treatment and differed from true acupuncture in at least one of two key aspects (skin penetration or point location); (5) the no-acupuncture control group was a second control group in which participants received neither true nor sham acupuncture; participants could be either completely untreated or receive treatments which were also administered in the true and sham acupuncture groups (for example, rescue medication, basic treatment or routine care); and (6) a clinical outcome was reported for which the calculation of an effect size estimate was possible.
Data sources and searches
To identify potentially relevant studies, we searched MEDLINE (from 1966 to April 2010) and Embase (from 1988 to April 2010) for all sham-controlled trials of acupuncture (see Additional file 1, Search strategies). Furthermore, we searched the Cochrane Central Register of Controlled Trials using a search strategy based on a Cochrane review of randomized trials with placebo and no-treatment controls in all medicine [25]. While Chinese trials identified by our search were eligible, we did not search specific Chinese databases. One reviewer screened titles and abstracts of all references identified and excluded those which were clearly irrelevant. Full texts of all remaining articles were obtained and assessed independently for eligibility by two reviewers. Disagreements or uncertainties were resolved by discussion.
Data extraction and quality assessment
One reviewer extracted information on the following aspects from included studies using a standard form: diagnosis; recruitment; number and type of study centers; number and types of intervention and control groups; details on acupuncture and sham interventions; how patients were informed about these interventions; qualification of acupuncturists; cointerventions; study duration; number of patients randomized, analyzed and dropping out (per group); age; gender; results on the main outcome measures; important secondary outcomes and responder data. A second reviewer checked all extraction of study results against the original publications. Trials were considered to have lower risk of bias if they reported an adequate method of randomization concealment and had a dropout rate below 15% [28]. For our main analyses, we used the following strategy to choose the outcome: (1) it should be a continuous outcome (mean and standard deviation available, or the standard deviation could be calculated from standard errors or confidence intervals, for example; we did not impute standard deviations for studies without available data on variability or precision); (2) the timing should be as close as possible to the completion of treatment; (3) when there was a clearly predefined main outcome measure, we chose this measure (but always preferred the measurement at the end of treatment over other time points or change from baseline); (4) when there was no predefined single main outcome measure, two reviewers independently chose the outcome considered most important (two disagreements were resolved by discussion); (5) if available, we used intention-to-treat data; otherwise, we used the data as presented in the publication. If a trial had more than one intervention group (for example, an individualized and a standardized intervention) or more than one sham group, the data were pooled. For more recent studies, we tried to contact authors to request further information if data for meta-analysis were missing.
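The conversions mentioned above (deriving a standard deviation from a standard error or a confidence interval, and pooling two arms of the same trial into one group) can be sketched as follows. This is a minimal illustration using standard textbook formulas; the function names are hypothetical, the exact formulas used by the reviewers are not stated in the text, and the confidence interval conversion assumes a 95% interval based on the normal distribution (for small samples a t-distribution quantile would be more appropriate).

```python
import math


def sd_from_se(se, n):
    """Standard deviation of a group mean from its standard error and sample size."""
    return se * math.sqrt(n)


def sd_from_ci(lower, upper, n, z=1.96):
    """Standard deviation from a confidence interval for a group mean.

    Assumes a symmetric interval of half-width z * SE (z = 1.96 for 95%).
    """
    se = (upper - lower) / (2 * z)
    return se * math.sqrt(n)


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Pool two arms (for example, two sham groups) into a single group.

    Returns (n, mean, sd) of the combined sample. The variance term accounts
    for both the within-arm variability and the difference between arm means.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = (
        (n1 - 1) * sd1 ** 2
        + (n2 - 1) * sd2 ** 2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(var)
```

Studies that report neither a standard deviation nor a measure of precision cannot be handled this way, which is consistent with the decision above not to impute missing standard deviations.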
Data synthesis and analysis
The Cochrane Collaboration's Review Manager RevMan 5 software was used for meta-analyses. Three comparisons were investigated: sham acupuncture versus no acupuncture (primary comparison), acupuncture versus sham acupuncture, and acupuncture versus no acupuncture. Studies were categorized into the clinical categories of chronic pain studies, short-term studies (that is, studies with an observation period of less than 3 days), and other studies.
The main analysis was based on trials reporting a continuous outcome measure and used the standardized mean difference (SMD; difference between the means divided by the pooled standard deviation) as the effect size estimate. As we assumed that studies would be clinically heterogeneous, a random effects model with the inverse variance method was used for meta-analysis. Negative SMDs indicated a beneficial effect of sham acupuncture over no acupuncture, acupuncture over sham acupuncture and acupuncture over no acupuncture, respectively. SMDs between 0 and -0.4 were considered small effects, those between -0.41 and -0.7 were considered moderate effects and those beyond -0.7 (that is, more negative) were considered large effects [29]. To investigate statistical heterogeneity, RevMan 5 reports Tau², Chi² and I². We considered I² values between 30% and 60% as indicating moderate heterogeneity and higher values as indicating substantial heterogeneity. Subgroup comparisons were performed using the method described by Deeks et al. [30] and implemented in RevMan 5. Egger's test was used to assess funnel plot asymmetry [31].
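The effect size and pooling steps described above can be illustrated with a short sketch. It is not the RevMan 5 implementation: it assumes the small-sample corrected SMD (Hedges' g) with a common large-sample approximation for its standard error, and the DerSimonian-Laird estimator for the between-study variance Tau². All function names are hypothetical.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """SMD between two groups with small-sample correction, plus its standard error."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (m1 - m2) / sd_pooled
    g = (1 - 3 / (4 * df - 1)) * d  # small-sample correction factor
    # Large-sample approximation; software packages may use slightly different forms.
    se = math.sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2)))
    return g, se


def random_effects(effects, ses):
    """Inverse variance random effects pooling (DerSimonian-Laird assumed).

    Returns (pooled effect, its standard error, Tau^2, I^2 in percent).
    """
    w = [1 / se ** 2 for se in ses]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Chi^2 (Cochran's Q)
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2, i2
```

With this sign convention, passing the sham group as the first group and the no-acupuncture group as the second yields a negative SMD when sham acupuncture has the lower (better) symptom score.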
To check the robustness of results, we performed sensitivity analyses (1) including three-armed studies which had been excluded because they did not meet all inclusion criteria, but still could be considered because they addressed the questions investigated in this review ("borderline" studies; see Results); (2) using different outcomes for studies with more than one relevant outcome at the completion of treatment; and (3) using dichotomous outcome measures (with a relative risk <1 indicating a beneficial effect).
For exploratory analyses, we defined further subgroups: larger (at least 100 patients) and smaller (< 100 patients) comparisons; studies with lower and higher risk of bias (see Data extraction and quality assessment); studies with intense or less intense cointerventions in all study arms; studies with and without skin penetration in sham groups (and depending on where needles were placed); studies with and without a clearly defined main outcome measure; and studies describing sham in the consent procedure as another treatment or as placebo. In multivariate random effects meta-regression analyses, we simultaneously investigated the influence of risk of bias, cointerventions, skin penetration in the sham group and condition (chronic pain vs others). Analyses were carried out using the restricted maximum likelihood (REML) method. For meta-regression analyses, we used PASW versions 17.0 and 18.0 software (SPSS, Chicago, IL, USA) with additional macros described by Wilson [32]. To investigate the hypothesis that there is an inverse correlation between specific and nonspecific effects (that is, trials with large nonspecific effects are less likely to find large specific effects than are trials with small nonspecific effects), we performed a linear regression analysis using the inverse of the squared pooled standard error as a weighting factor.
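The weighted regression of specific on nonspecific effects can be sketched as below. This is an illustrative closed-form weighted least squares fit, not the authors' PASW procedure; the function names are hypothetical, and how the "pooled" standard error is formed for each trial is not specified in the text, so the sketch simply takes one standard error per trial as input.

```python
def inverse_variance_weights(ses):
    """Weights equal to the inverse of the squared standard error of each trial."""
    return [1 / se ** 2 for se in ses]


def weighted_linear_regression(x, y, weights):
    """Weighted least squares fit of y = intercept + slope * x.

    x: nonspecific effects (sham vs no acupuncture SMDs), one per trial
    y: specific effects (acupuncture vs sham SMDs), one per trial
    Returns (intercept, slope).
    """
    sum_w = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / sum_w
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / sum_w
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope
```

Because negative SMDs indicate benefit in both comparisons, a negative slope would mean that trials with larger nonspecific effects tend to show smaller specific effects, which is the inverse relationship the hypothesis describes.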
Conclusions
Sham acupuncture interventions are often associated with moderately large nonspecific effects, which could make it difficult to detect small additional specific effects. Compared to inert placebo interventions, effects associated with sham acupuncture might be larger, which would have considerable implications for the design and interpretation of clinical trials. Total effects of acupuncture interventions including both specific and nonspecific effects often seem to be at least moderate in size. We believe that there has to be a discussion involving scientists, decision makers, health care providers and patients whether and when the evidence for clinically relevant total effects from nonblinded comparisons is sufficient to consider a treatment effective, even if specific effects due to the postulated mechanism of action might be minor or even nonexistent.
Competing interests
KL received travel reimbursement and fees for speaking at conferences organized by acupuncture societies in the USA, UK, Germany, Japan and Spain. AS received fees for lecturing for a German acupuncture society (DÄGfA) until 2006. KM and KN do not have any conflicts of interest.
Authors' contributions
KL, KN and KM were involved in the literature search, data extraction and analysis. AS provided advice on acupuncture and participated in the interpretation of the data. KL conceived and coordinated the study and wrote the first draft of the manuscript. All authors commented on drafts and approved the final manuscript.