Introduction
Cervical radiculopathy is caused by encroachment of a cervical nerve root, usually by bone or disc herniation. Typical symptoms are neck pain radiating into the arm, possible loss of motor function and/or sensory loss [1]. The most common surgical treatment is discectomy and fusion [2], either as stand-alone implant surgery or with the addition of anterior plating [3]. Anterior cervical discectomy is one of the most frequently performed spinal procedures. In the US, almost 550,000 patients were operated on between 2005 and 2008 [4]. Concern that fusion may cause adjacent segment disease [5] has given rise to motion-preserving implants (arthroplasty). In the US, cervical arthroplasty surgery increased by 708% between 2005 and 2008 [4].
Multiple trials [6–14] and three recent meta-analyses [15–17] have compared the results of arthroplasty versus fusion. Most authors concluded that clinical outcome favored arthroplasty [6–12, 15–17]. However, few trials included blinding of patients [15], and blinding was only performed until just after the surgical procedure was completed [7, 11]. Only one study implemented blinding of the surgical team [14]. So far, no studies have demonstrated clinical outcome in favor of fusion.
The aim of the Norwegian Cervical Arthroplasty Trial (NORCAT) was to assess 2-year clinical outcome in patients operated for single-level cervical radiculopathy with either arthroplasty or fusion.
Methods
Study design
Patients with single-level radiculopathy were included from November 2008 to January 2013 at five neurosurgical departments in Norway. The surgical procedure was either arthroplasty or fusion. Randomization was stratified by center and blocked, using the Unit of Applied Clinical Research website (http://www.ntnu.edu/dmf/akf/randomisering), to keep the groups balanced. The study was designed to include 146 patients. Follow-up visits were scheduled at 3 months, 1 year, and 2 years. At 6 months, the patients answered the questionnaires by mail. Participating patients were blinded to the treatment until the last follow-up was completed.
The NORCAT received a grant from DePuy Synthes Spine (325 Paramount Drive Raynham, MA 02767). However, the sponsor was not involved in study design, conducting the trial, writing or reviewing the manuscript. The grant was unrestricted and the sponsor had no right of refusal for publication of the data. The sponsor read the manuscript before submission.
Participants
Inclusion criteria were: age between 25 and 60 years, clinical C6 or C7 radiculopathy with corresponding radiological findings, Neck Disability Index (NDI) [18] ≥30%, no response to non-operative treatment, and no clinical improvement during the six weeks prior to surgery. Exclusion criteria were: significant spondylosis involving more than one level, adjacent level ankylosis, intramedullary changes on magnetic resonance imaging (MRI), and myelopathy. The complete list of inclusion and exclusion criteria is available in Table S1 in the supplementary appendix.
Study interventions
Discectomy was performed via an anterolateral approach. The surgical team was blinded to the result of randomization until nerve root decompression was completed. Both arthroplasty [DISCOVER® prosthesis (DePuy Spine Inc., 325 Paramount Dr, Raynham, MA 02767, USA)] and fusion [CERVIOS® cage (Synthes GmbH, Eimattstrasse 3, 4436 Oberdorf, Switzerland)] implant systems were available in the operating theater.
Arthroplasty
The DISCOVER® prosthesis allows for unconstrained motion. Two titanium plates are fixed to the endplates with a polyethylene inlay. Fluoroscopy was used to ensure that the prosthesis was placed in the midline and sufficiently towards the posterior edge of the vertebra. The appropriate implant size was determined with templates.
Fusion
The CERVIOS® cage was used to achieve anterior cervical interbody fusion. The cage was preloaded with chronOS and the procedure was performed as stand-alone surgery.
Outcome measures
Primary outcome
The NDI is a self-rated questionnaire developed for patients with neck disability. The questionnaire is composed of ten items: seven related to activities of daily living (personal care, lifting, reading, work/daily activities, driving, sleep, and recreation), two to pain (pain, headache), and one to concentration. Each item is rated from 0 to 5. The NDI summary score ranges from 0 to 50. We expressed the score as a percentage with lower scores indicating less severe symptoms. We used the validated Norwegian version [19].
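The scoring rule described above can be sketched in a few lines. This is a minimal illustration, not the scoring procedure used in the trial: the function names are hypothetical, and the handling of incomplete questionnaires (rejected here) is an assumption, since the text does not say how missing items were treated.

```python
from typing import Sequence

NDI_ITEMS = 10       # ten questionnaire items
MAX_ITEM_SCORE = 5   # each item is rated from 0 to 5


def ndi_percent(item_scores: Sequence[int]) -> float:
    """Return the NDI summary score expressed as a percentage (0 to 100)."""
    # Assumption: incomplete questionnaires are rejected rather than prorated.
    if len(item_scores) != NDI_ITEMS:
        raise ValueError("expected exactly ten item scores")
    if any(not 0 <= s <= MAX_ITEM_SCORE for s in item_scores):
        raise ValueError("each item score must lie between 0 and 5")
    raw = sum(item_scores)  # summary score on the 0 to 50 scale
    return 100.0 * raw / (NDI_ITEMS * MAX_ITEM_SCORE)


def meets_inclusion_threshold(item_scores: Sequence[int],
                              threshold: float = 30.0) -> bool:
    """Check the NDI >= 30% inclusion criterion stated under Participants."""
    return ndi_percent(item_scores) >= threshold


if __name__ == "__main__":
    answers = [3, 2, 3, 4, 2, 3, 3, 2, 4, 4]  # made-up example responses
    print(ndi_percent(answers), meets_inclusion_threshold(answers))
```

A raw sum of 15 out of 50 corresponds to exactly 30%, the lower bound for inclusion.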
Secondary outcomes
Secondary outcome measures were the Numeric Rating Scale (NRS 11) [20], the Short Form 36 (SF-36) [21], and the EuroQol-5 Dimension-3 Level (EQ-5D-3L) [22]. In addition, data regarding the surgical procedure (duration of surgery, duration of anesthesia, and total blood loss), perioperative major complications (dural tear, damage of n. laryngeus recurrens, index level nerve, esophagus, trachea or large vessel), Dysphagia Short Questionnaire [23], reoperations at index level within 2 years, and work status were recorded.
NRS 11 is a one-dimensional pain scale from 0 (“no pain at all”) to 10 (“worst imaginable pain”), used to evaluate arm and neck pain.
SF-36 is a generic questionnaire measuring health-related quality of life along eight dimensions (physical function, role limitations due to physical problems, bodily pain, general health, vitality, social function, role limitations due to emotional problems, and mental health) with two summary scores (physical component summary [PCS], mental component summary [MCS]). The score ranges from 0 to 100, with higher scores relating to better health. We used the validated Norwegian (chronic) version 2.0 [24].
The EQ-5D-3L is a generic quality of life questionnaire with five dimensions (mobility, self-care, activities of daily life, pain, and anxiety/depression), ranging from −0.59 to 1. Higher scores indicate better health status. We used the validated Norwegian version [25], and syntax files obtained from the EQ-5D society using the UK time trade off tariff to calculate the utility index [26].
The Dysphagia Short Questionnaire consists of five items (ability to swallow, incorrect swallowing, globus sensation, involuntary weight loss, and pneumonia), with scores ranging from 0 to 18. Lower scores represent milder symptoms.
Statistical analyses
The trial was planned to have 80% power to detect a difference of 10/100 in NDI score, considered to be the minimal level required for a clinically important change [27, 28]. On the basis of a significance level of 0.05 and a standard deviation of 18, 104 participants were required for the trial. Allowing for 40% loss to follow-up gave a total of 146 participants. A P value of <0.05 was used as the level of significance. PASW (Predictive Analytics SoftWare) Statistics 18 (IBM Corporation, Armonk, New York, USA) was used for all analyses.
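The arithmetic behind these numbers can be approximated as follows. This is a sketch under stated assumptions, not the calculation the investigators ran: it uses the standard normal-approximation formula for comparing two means with a two-sided test, and the function name is hypothetical. The approximation gives 51 patients per group (102 in total), slightly fewer than the reported 104; an exact t-based calculation typically adds about one patient per group, which may account for the gap.

```python
import math
from statistics import NormalDist


def n_per_group(delta: float, sd: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Normal-approximation sample size per group for a two-sided
    comparison of two means with equal group sizes and equal SD."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


per_group = n_per_group(delta=10, sd=18)  # values taken from the text
total_normal_approx = 2 * per_group

# The reported total of 104, increased by 40% for expected loss to follow-up.
reported_total = 104
inflated_total = math.ceil(reported_total * 1.40)

print(per_group, total_normal_approx, inflated_total)  # 51 102 146
```

The reported target of 146 corresponds to multiplying the required sample by 1.40.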
Outcomes were analyzed according to the intention-to-treat principle. Continuous data are described as means and standard deviations (SD), or medians and interquartile ranges (IQR), as appropriate, and were compared between the groups with the independent t test or the Mann–Whitney U test, depending on distributional assumptions. Ninety-five percent confidence intervals (CI) are given in the figures and tables for the outcome measures. Categorical data are described as number of patients and percentages, tested with the χ² test or Fisher’s exact test, as appropriate.
To assess change in outcome from baseline to each follow-up time-point, paired samples t tests were used for parametric data, and Wilcoxon signed rank tests for non-parametric data.
The repeated measurements after intervention were analyzed using linear mixed models with a random intercept adjusted for baseline score. Follow-up time-points, treatment modality and baseline score were included as fixed main effects together with interaction terms between follow-up time-points and treatment modality. The mean differences between treatment modalities with 95% CI at each follow-up time-point were estimated using linear combinations of estimators. The linear mixed model analysis was not described in the original study protocol, but applied due to a difference in NDI scores between the treatment modalities at baseline. A sensitivity analysis including seven patients who were randomized and excluded from the trial was performed based on intention-to-treat principle with extreme values (best possible score) for all outcome measures.
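One way to write down a model of the kind described above is given here. It is a sketch only: the notation is introduced for illustration, and the choice of the first follow-up as the reference time-point is an assumption, not something stated in the text.

```latex
% Y_{ik}: outcome (e.g. NDI) for patient i at follow-up time-point k
% Y_{i0}: baseline score;  G_i: treatment indicator (coding is arbitrary)
% b_i: patient-specific random intercept
\begin{align}
  Y_{ik} &= \beta_0 + \beta_1 Y_{i0} + \beta_2 G_i + \gamma_k + \delta_k G_i
            + b_i + \varepsilon_{ik}, \\
  b_i &\sim \mathcal{N}(0, \sigma_b^2), \qquad
  \varepsilon_{ik} \sim \mathcal{N}(0, \sigma^2), \qquad
  \gamma_1 = \delta_1 = 0, \\
  \Delta_k &= \beta_2 + \delta_k .
\end{align}
```

Here $\gamma_k$ are the time-point main effects, $\delta_k$ the time-by-treatment interaction terms, and $\Delta_k$ the baseline-adjusted mean difference between treatment modalities at time-point $k$, which is the linear combination of estimators referred to in the text.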
Possible effect or difference between the five neurosurgical departments was also evaluated, but neither the statistical assessment nor the trial design indicated that any multicenter effect should be taken into account in our final statistical analysis.
Ethical considerations
The trial was approved by the Regional Committee for Medical and Health Research Ethics in Central Norway, and the Data Protection Official for Research. All enrolled patients gave their written informed consent. Participating senior surgeons at each hospital performed all operations. All authors vouch for the fidelity of the study to the protocol and agreed unanimously to submit the final manuscript for publication.
Discussion
We found excellent clinical results for both treatment modalities at 3 months, which were sustained at 2 years. There was no significant difference between arthroplasty and fusion at any of the follow-up times. However, statistical analyses using linear mixed models that adjust for baseline values, dropout and missing data showed a difference in self-rated neck disability and the numeric rating score for arm pain in favor of fusion after 2 years.
This is not consistent with most randomized controlled trials [6–12], the recent study on available registry data by Staub and colleagues [29], and three recent meta-analyses [15–17] reporting clinical outcome in favor of arthroplasty.
The between-group difference in NDI score of 5.9% shown in the present study is small and the statistical significance is weak; the results must, therefore, be interpreted with caution. One might argue that the difference should not be considered clinically important, but there is no clear consensus on how large the between-group difference should be [30, 31]. An NDI change of 10 or more from baseline to 2-year follow-up was reported by 78.3% of patients in the fusion group and 70.0% in the arthroplasty group. Even though the difference was not statistically significant, the direction did not favor arthroplasty. There may be several reasons for the discrepancy compared with previous studies, such as different implant design, different study methods, different fusion technique, different lengths of follow-up, and the impact of funding by arthroplasty manufacturers.
Different arthroplasty designs have revealed different biomechanical performances for the treatment of single-level cervical disc disease [32]. Arthroplasty devices are considered constrained in certain planes if they restrict motion to less than that seen physiologically. The usual designs are, however, “semiconstrained”, which allows for physiological movement, or “nonconstrained”, where there is no mechanical stop and extremes of motion are prevented by the perispinal soft tissue and inherent compression across the disc space [33]. The nonconstrained device used in the present trial is comparable in this respect with the Bryan device (Medtronic Spine and Biologics) [7, 8, 10] and the Porous Coated Motion (PCM) device (NuVasive Inc., San Diego, CA, USA) [11]. The Prestige ST (Medtronic Sofamor Danek) [6, 9] differs from the present study implant by its semiconstrained design, and by the implantation technique, where the device is fixed with screws to the vertebrae cranial and caudal to the disc space. In addition to different degrees of constraint, implants may also differ in the design of their articulating surfaces. The ball and socket design of the device used in the present trial has a different impact on range of motion (ROM) compared with the Bryan and PCM devices, and the adjacent level intradiscal pressure has been shown to differ according to implant design [32].
The study methods of the present trial also differ from the previously mentioned studies, of which only two describe blinding of the participating patients [7, 11]. However, Heller and colleagues [7] could not continue blinding of patients after completion of the surgical procedure due to treatment with non-steroidal anti-inflammatory drugs (NSAIDs) in the arthroplasty group for two weeks after surgery. Phillips and colleagues [11] blinded patients only until after the surgical procedure was completed. Blinding of the surgical team until after decompression of the compressed nerve root has rarely been included in previous study designs, but was conducted in the study by Skeppholm and colleagues [14], consistent with the previous study methods. Strict study methods are probably important to avoid expectation bias in both patients and surgeons, and may have been a contributing factor to the discrepancy with previous trials.
Another aspect that may influence the outcome is the applied fusion technique. The stand-alone polyetheretherketone (PEEK) cage implant used in the NORCAT differs from the technique in most other comparable trials, where allograft and anterior plating are most commonly used [6–11]. The reported fusion rates between the two techniques after 2 years are, however, similar at 97.5% [6], 94.3% [7], and 92.1% [11] for allograft with plating and 92% [34] for stand-alone PEEK cage. Nemoto and colleagues [34] recently assessed clinical outcome and complications regarding postoperative dysphagia between stand-alone cage implant versus cage and anterior plating in single-level cervical disc disease, and found no difference between the two surgical methods.
The length of follow-up may also have an impact on the clinical outcome, and a longer observation period after surgery is often requested. Time is naturally highly relevant in relation to the impact of adjacent level disease [35]. However, the present study results demonstrate that there is little change in clinical outcome from 3 months up to 2 years after surgery. A longer follow-up probably has little effect on clinical outcome related to the completed surgery, as recently demonstrated by Gornet et al. [36].
Arthroplasty manufacturers are often represented as sponsors of large randomized, controlled trials, as was the case in the present study. Their role in relation to outcome is probably important to include in the overall discussion regarding outcome discrepancy between authors, and was recently discussed by Alvin and colleagues [37]. They assessed whether trials funded by arthroplasty manufacturers had a greater likelihood of reporting results in favor of arthroplasty, and found lower complication rates when a conflict of interest was reported, but no impact on health-related quality of life outcomes.
Critical issues which may explain the discrepancy in clinical outcome between the present study and most previous comparable trials are difficult to point out. The truth, however, may be a combination of physiological and actual differences between the implants, as well as different study designs as discussed above.
The expected clinical outcome is important in the surgical decision-making for individual patients. Differences between surgical techniques are also key factors to consider. In the present trial, patients operated with arthroplasty had significantly longer duration of surgery, which corresponds to the results from a newly published meta-analysis [15]. Even though experienced spinal surgeons operated on the patients, all surgeons were more familiar with the fusion procedure as it was the standard treatment in the departments involved. Thus, level of experience is one possible explanation for the difference in surgery duration. Another possible explanation is that implantation of the specific arthroplasty device is technically more demanding and time consuming. There were no severe complications in the present study, but the reoperation rate differed from previous trials reporting more secondary surgeries with fusion [6, 8, 9]. The difference in index level reoperations could be explained by suboptimal implantation technique or incorrect size of the arthroplasty device. However, all patients who were reoperated had their primary surgery at a time-point when all surgeons had good experience with the particular arthroplasty device. In a recent study using the same implant [38], instability and accompanying neck pain after arthroplasty were found in 8% of patients, all of whom underwent revision surgery.
Corresponding with previous reports [6, 7], patients in the arthroplasty group returned to work two weeks earlier than patients in the fusion group, but there was no difference in employment status at 2-year follow-up. A previous study concluded that the duration of preoperative sick leave influenced return to work postoperatively [39]. In the present trial, preoperative sick leave was 3 weeks shorter in the arthroplasty group, but the difference was not significant.
Ament and colleagues recently assessed the cost-effectiveness of 2-level arthroplasty or fusion at 2- and 5-year follow-up. Arthroplasty was more expensive than fusion, but yielded more total quality-adjusted life years, suggesting it to be a highly cost-effective treatment option [40, 41]. Consistent with these results, Zou and colleagues recently presented a meta-analysis on clinical outcome after two-contiguous level cervical disc surgery and concluded that arthroplasty was equivalent, and in some aspects significantly superior, to fusion regarding clinical outcome [42]. Considering the results of the present trial, the growing interest among physicians for arthroplasty as an alternative to fusion, and the high number of surgical procedures performed each year [43], future studies should focus on both clinical outcome and cost-effectiveness analyses.
The role of adjacent level disease was not addressed in the present study since clinical outcome was the only focus of this report. The impact of adjacent level disease will be presented in a forthcoming paper including the NORCAT 5-year follow-up data. Regarding maintenance of mobility, which is the main goal of choosing arthroplasty over fusion, the authors of the present study have recently shown that high-grade heterotopic ossification around the Discover arthroplasty device was found in 62% after 2 years [44].
Limitations
Our study may be criticized for too short a follow-up period. However, the present study shows that there is little change in clinical outcome from 3 months up to 2 years. Similar results at even longer follow-up were recently presented by Staub and colleagues, who reported a quite stable postoperative course of patient-reported outcomes between 2 and 5 years after both arthroplasty and fusion, based on registry data [29]. Their results also strengthen the external validity of randomized controlled trials comparing cervical arthroplasty and fusion, where a large number of patients often do not meet the inclusion criteria, as was the case in the present trial.
Even though no patients with severe spondylosis should have been included in the NORCAT, the degree of spondylosis using radiographic parameters for evaluation could have been emphasized specifically in the inclusion/exclusion criteria. Therefore, one cannot exclude the possibility that some patients not meeting the criteria for arthroplasty may have been included, which again could have biased the study in favor of the fusion group.