Introduction

Neuropathic pain-like joint symptoms (NP) have been reported in people with osteoarthritis (OA) of the knee or hip and in some people who have undergone total-joint replacement (TJR) for OA.1, 2 Estimates for NP post-TJR range from 1% to 63% in the literature depending on the methodology.2, 3, 4

Neuropathic pain is defined as ‘pain arising as a direct consequence of a lesion or disease affecting the somatosensory system’, adapted from the International Association for the Study of Pain (IASP) definition.5 Symptoms can include burning, hypersensitivity, prickling and numbness in both the affected areas and areas of the body distant from the site of damage.6 Treatments for NP have been reported to be of limited effectiveness for many individuals and the condition can have a large impact on quality of life.7, 8 There are numerous risk factors for NP identified in the literature such as nerve damage from surgery, chronic nociceptive input (as seen in chronic pain), complications from herpes zoster infection and diabetes.9, 10 There are some common risk factors for OA pain and NP such as age, past joint surgery and psychological factors.1, 7, 11

Heritability of NP has been estimated at 37% in the single published twin study on NP in humans.12 This is within the range of heritability estimates for other painful conditions such as back pain, migraine and sciatica that range from 21 to 58%.13, 14, 15, 16, 17

There have been numerous candidate gene studies on pain, including chronic pain post surgery.18 Genes reported in the literature on NP from candidate gene studies include the COMT gene, TRPV1 gene, P2X receptor genes and the CACNG2 gene.19, 20, 21, 22 The genetics of NP are still not fully understood.23 NP is thought to have distinct genetic mechanisms. Different types of hypersensitivity (eg, to heat or mechanical stimuli) and, according to mouse studies, different molecular mechanisms may be involved depending on the method used to induce NP.23

A genome-wide association scan (GWAS) can be used to study the genetic basis of complex traits and is therefore an appropriate design for studying NP, which can have a complex aetiology. A GWAS identifies the genetic variants (single-nucleotide polymorphisms (SNPs)) whose allele frequencies differ significantly between cases and controls for a specific phenotype. The genes in which these loci are located offer clues about the mechanisms behind the phenotype.

To date only one GWAS has been published on NP, in individuals with diabetic neuropathy. Results from this GWAS identified SNPs in the GFRA2 and ZSCAN20 genes.24, 25 Zinc finger proteins are potentially relevant in the treatment of NP.26 Previous GWAS for migraine and chronic widespread pain (CWP) have identified susceptibility loci relating to genes involved in synaptic plasticity and some types of neuropathy, respectively.27, 28 A GWAS has also been published on acute post-surgical pain.29

The aim of this study was to identify genes associated with the risk of NP in individuals post TJR using a genome-wide approach. The replication analysis aimed to reproduce these findings in other groups containing individuals with knee and hip OA and knee pain.

Methods

The study design is outlined in Figure 1.

Figure 1

Study design.

Participants

Nottingham discovery cohort

Participants were recruited post-total hip or knee replacement for OA (n=613) from secondary care in the Nottinghamshire area.

Nottingham replication cohort

Participants from an independent Nottingham-based study (n=908) including individuals with knee OA, hip OA, or both and individuals post-total hip or knee replacement were used as a replication cohort.

The North Nottinghamshire Research Ethics Committee gave approval for the ethics of both studies. All participants gave written, informed consent.

To improve statistical power, in each of the above two Nottingham groups, total hip replacement participants and total knee replacement participants were combined into one post-TJR group, as seen in previous GWAS analyses.

The Rotterdam study

The selected individuals were part of Rotterdam Study III (RS-III), which started in 2006 and comprised 3932 participants in total. A total of 212 women who reported knee pain had both painDETECT data and genetic data. This population-based cohort study, which has been described previously, investigates chronic disabling diseases in older adults.30 The Erasmus University Medical School medical ethics committee gave approval for this study. All participants gave written, informed consent.

Stage 1: GWAS

Blood samples from the participants in this study were processed to obtain genotype data. Genotype data were analysed using the Illumina 610 k array (https://www.ebi.ac.uk/ega/studies/EGAS00001001017). Only directly typed SNPs were used. Genotyping and QC were carried out as previously described.31 gPLINK software (version 1.07) was used to analyse GWAS data from this array.32 The results of this association are a list of genetic variants (SNPs) and information about their location in the genome, as well as an odds ratio (OR), chi-square value and P-value to indicate the level of association of the variants with the specified phenotype.
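The per-SNP output described above (OR, chi-square value and P-value) can be illustrated with a minimal sketch of a basic allelic case-control test on a 2x2 table of allele counts. This assumes a 1-degree-of-freedom Pearson chi-square test without continuity correction; the text does not state which association test was run, and the function name and the allele counts below are hypothetical.

```python
import math


def allelic_association(case_minor, case_major, ctrl_minor, ctrl_major):
    """Return (odds ratio, chi-square, P-value) for a 2x2 allele-count table.

    Assumption: a basic allelic test (1 df Pearson chi-square, no continuity
    correction). This is an illustration, not the study's exact procedure.
    """
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Closed-form Pearson chi-square statistic for a 2x2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical allele counts for one SNP (two alleles per participant)
or_, chi2, p = allelic_association(case_minor=100, case_major=118,
                                   ctrl_minor=353, ctrl_major=655)
print(f"OR={or_:.2f}, chi-square={chi2:.2f}, P={p:.3g}")
```

In a genome-wide scan the same calculation is repeated for every directly typed SNP, which is why stringent P-value thresholds are applied to the results.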

The results of this GWAS have been submitted to GWAS Central and can be accessed at: http://www.gwascentral.org/study/HGVST1846.

The statistics program R (version 3.0.2) was used to create Manhattan and QQ plots using the ‘ggplot2’ library and ‘qqplot’ script.

Post-genomic analysis was undertaken using the Database for Annotation, Visualisation and Integrated Discovery (DAVID).33 This is an online tool to which a list of genes can be submitted; it then reports the genes’ involvement in biological processes.33 The gene list comprised genes corresponding to all SNPs with P<0.0001 in the GWAS analysis. The BioCarta and KEGG pathway maps were used for functional annotation.

Stage 2: replication cohorts

The top four SNPs with a nominal P-value of P≤5 × 10−6 after the stage 1 GWAS analysis and two additional lower ranking but potentially biologically relevant SNPs were selected for replication (see Results). Genotype information for these SNPs from in silico and de novo genotype data were used for further analysis. In total, six SNPs were selected for replication analysis.

Stage 3: meta-analysis

The ‘meta’ library in the statistics program R (version 3.0.2) was used to run the meta-analysis using the three cohorts described above. Meta-analysis takes the effect size, standard error and sample size into account to give an overall effect from the different groups studied. If heterogeneity was significant between the cohorts in the meta-analysis, a Han Eskin random effects model was used as an alternative meta-analysis method as, compared with traditional models, it allows for more heterogeneity in the data.34
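As an illustration of how effect sizes and standard errors are combined across cohorts, the following minimal sketch implements a standard inverse-variance fixed-effect meta-analysis on the log odds ratio scale, together with Cochran's Q and I² as heterogeneity measures. It is not the Han Eskin random effects model used in this study and it does not reproduce the R 'meta' library; the function names and the input values are hypothetical.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Approximate the standard error of a log odds ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def fixed_effect_meta(odds_ratios, std_errors):
    """Inverse-variance fixed-effect pooling of log odds ratios.

    std_errors are standard errors of the log odds ratios.
    Returns the pooled OR, its standard error (log scale), Cochran's Q,
    the degrees of freedom for Q, and I^2.
    """
    log_ors = [math.log(o) for o in odds_ratios]
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate
    q = sum(w * (b - pooled_log_or) ** 2 for w, b in zip(weights, log_ors))
    df = len(log_ors) - 1
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {
        "pooled_or": math.exp(pooled_log_or),
        "pooled_se": pooled_se,
        "q": q,
        "df": df,
        "i_squared": i_squared,
    }


# Hypothetical per-cohort odds ratios and 95% confidence intervals
ors = [2.0, 1.3, 1.6]
ses = [se_from_ci(1.4, 2.9), se_from_ci(0.9, 1.9), se_from_ci(0.8, 3.2)]
print(fixed_effect_meta(ors, ses))
```

A large Q relative to its degrees of freedom signals heterogeneity between cohorts, which is the situation in which a random effects model becomes the more appropriate choice.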

Phenotype

Individuals were assigned a phenotype by classifying them according to their scores on the painDETECT questionnaire. This is a seven-item questionnaire scored from 0 to 39 that uses a Likert scale for participants to describe the nature of their pain, in order to distinguish it from nociceptive pain. Questions are included on qualities such as burning pain, tingling, sudden pain and sensitivity to heat and cold. In all cohorts, scores of >12 were classified as ‘possible neuropathic pain’ according to the validated cut-offs for diagnosis by Freynhagen et al.35

This cut-off and not the stricter one of ≥19 for ‘likely NP’ was used to increase statistical power given the larger number of cases available and the similar clinical characteristics, particularly the use of anti-neuropathic medication in our data (possible NP 38.5% opioids, 11% antineuropathics, likely NP 37% opioids, 11% antineuropathics). Using this definition, we therefore had n=109 possible NP cases and n=504 controls. This gave a statistical power of 42% to detect an association with an OR of 2.1, and 97% power to detect an association with an OR of 2.9 with P<5 × 10−7 for a minor allele frequency (MAF) of 35%.
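The case definition described above can be written as a short sketch. The cut-offs (>12 for possible NP, ≥19 for likely NP) are taken from the text; the function names and the label for lower scores are hypothetical.

```python
def classify_paindetect(score):
    """Label a painDETECT score using the cut-offs quoted in the text."""
    if score >= 19:
        return "likely NP"
    if score > 12:
        return "possible NP"
    return "below cut-off"  # hypothetical label for scores of 12 or less


def is_case(score):
    """Case definition used for the GWAS: any score above 12.

    Participants meeting the stricter 'likely NP' cut-off are also cases.
    """
    return score > 12


# Hypothetical scores for five participants
scores = [5, 12, 13, 19, 27]
print([(s, classify_paindetect(s), is_case(s)) for s in scores])
```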

Results

Stage 1: GWAS

The results of the unadjusted GWAS on NP can be seen in Table 1, and Figures 2 and 3. A total of 548 381 SNPs were tested for association with NP. The genomic control inflation factor for the P-values was low (λ=0.99) and the quantile–quantile (QQ) plot indicated no substantial population stratification due to cryptic relatedness, population substructure or other biases (Figure 2).
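For readers unfamiliar with the genomic control inflation factor, a minimal sketch of one common way to compute it follows: λ is the median of the observed 1-degree-of-freedom chi-square statistics divided by the median expected under the null hypothesis. Whether the analysis software used exactly this estimator is an assumption, and the function names and the P-values in the example are hypothetical.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def p_to_chi2(p_value):
    """Convert a two-sided P-value (0 < p <= 1) to a 1 df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p_value / 2)
    return z * z


def genomic_inflation_factor(p_values):
    """Lambda = median observed chi-square / median expected under the null."""
    observed_median = median(p_to_chi2(p) for p in p_values)
    expected_median = p_to_chi2(0.5)  # median of a 1 df chi-square, about 0.455
    return observed_median / expected_median


# Hypothetical, evenly spaced P-values, as expected when there is no inflation
null_like = [i / 1000 for i in range(1, 1000)]
print(round(genomic_inflation_factor(null_like), 3))
```

Values of λ close to 1, such as the 0.99 reported here, indicate that the test statistics are not systematically inflated.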

Table 1 The results of interest from the unadjusted Illumina array NP GWAS, followed by the results of replication analysis and meta-analysis
Figure 2

QQ plot for the results of the GWAS (λ=0.99).

Figure 3

Manhattan plot showing the P-value of association tests for SNPs with possible NP in the Illumina array GWAS. P-values represent the association of the SNPs with possible NP.

The results of the GWAS are summarised in a Manhattan plot of the P-values (Figure 3). Table 1 shows the OR and significance of the results from the Illumina array NP GWAS for four of the top-scoring SNPs and two SNPs of biological relevance. The top four SNPs with P≤5 × 10−6 were selected for further replication, as were two SNPs with higher P-values but in potential candidate genes, rs4866176 mapping to the brain-specific cadherin CDH18 gene and rs1133076 mapping to the thyroglobulin TG gene.

Pathway analysis

Pathway analysis was carried out on the GWAS results using a list of genes corresponding to SNPs with P<0.0001 in the GWAS (n=62; see Supplementary Table S1, Supplements). If the SNP mapped to an area within a gene, this gene was used. For intergenic SNPs, the two closest flanking genes on each side were used. The results of this analysis report no significant findings after adjusting for multiple testing with a Bonferroni correction (see Supplementary Table S2, Supplements).

Stage 2: replication cohorts

We sought to replicate the association with NP of the six selected SNPs in two independent replication cohorts. As shown in Table 1, two of the SNPs selected from the stage 1 GWAS for replication analysis showed nominally significant P-values and effects in the same direction in one of the replication cohorts.

Stage 3: meta-analysis

We then combined discovery and replication results in a joint meta-analysis. The results can be seen in Table 1. Heterogeneity of the loci was tested using the Cochran Q test.

Due to the significant heterogeneity introduced to the model by the replication data in the rs887797, rs4866176, rs7734804, rs298235 and rs12596162 meta-analyses, a Han Eskin random effects model was used to account for this (Table 1). The additive model for the rs887797 SNP after this analysis gave a result of: OR=1.48 (95% CI 1.23–1.75), P=1.65 × 10−5. A recessive model for the rs887797 SNP was also used in a meta-analysis. A recessive model was used to test the nature of the effect of the risk allele, that is, to test if two copies of the risk allele were needed to increase the risk of possible NP. After Han Eskin analysis, the recessive model for rs887797 gave a result of OR=2.41 (95% CI 1.74–3.34, P=1.29 × 10−7; Figure 4).
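The difference between the two genetic models can be sketched from genotype counts. Under the recessive model only individuals carrying two copies of the risk allele count as exposed, whereas a per-allele comparison counts every risk allele. The allelic odds ratio below is a simple stand-in for the additive model, not the analysis actually run, and the function names and genotype counts are hypothetical.

```python
def odds_ratio(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed):
    """Odds ratio from a 2x2 table of exposed and unexposed counts."""
    return (case_exposed * ctrl_unexposed) / (case_unexposed * ctrl_exposed)


def recessive_odds_ratio(case_counts, ctrl_counts):
    """Counts are (n0, n1, n2): individuals with 0, 1 or 2 risk alleles.

    Only homozygotes for the risk allele (n2) are treated as exposed.
    """
    return odds_ratio(case_counts[2], case_counts[0] + case_counts[1],
                      ctrl_counts[2], ctrl_counts[0] + ctrl_counts[1])


def allelic_odds_ratio(case_counts, ctrl_counts):
    """Per-allele comparison: each individual contributes two alleles."""
    case_risk = case_counts[1] + 2 * case_counts[2]
    case_other = 2 * case_counts[0] + case_counts[1]
    ctrl_risk = ctrl_counts[1] + 2 * ctrl_counts[2]
    ctrl_other = 2 * ctrl_counts[0] + ctrl_counts[1]
    return odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other)


# Hypothetical genotype counts (0, 1, 2 risk alleles) for cases and controls
cases = (40, 45, 24)
controls = (230, 220, 54)
print(f"recessive OR={recessive_odds_ratio(cases, controls):.2f}, "
      f"allelic OR={allelic_odds_ratio(cases, controls):.2f}")
```

When the risk is concentrated in homozygotes, the recessive odds ratio exceeds the per-allele odds ratio, which is the pattern the recessive analysis is designed to detect.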

Figure 4

Forest plot showing the results of an unadjusted Han Eskin analysis of the rs887797 SNP using a recessive model.

After adjusting for age, sex and BMI, Han Eskin analysis of the rs887797 SNP gave values of: ORpossNP=1.44 (95% CI 1.21–1.73, P=7.13 × 10−5) and ORpossNP=2.33 (95% CI 1.67–3.27, P=8.67 × 10−7) for the additive and recessive models, respectively. Upon combining the data from the two replication cohorts used, it was found that overall this SNP was nominally significant. The additive model for the rs887797 SNP in the combined Nottingham replication cohort and Rotterdam Study cohort gave OR=1.25 (95% CI 1.01–1.55), P=0.040 (not significant after adjusting for seven tests) and the recessive model gave OR=1.75 (95% CI 1.15–2.64), P=0.0076 (Bonferroni P-value P<0.053).
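The Bonferroni adjustment mentioned above amounts to multiplying each nominal P-value by the number of tests and capping the result at 1. A minimal sketch, using the seven tests and the two nominal P-values quoted in the text (the function name is hypothetical):

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Nominal P-values from the combined replication cohorts, adjusted for seven tests
for label, p in [("additive", 0.040), ("recessive", 0.0076)]:
    print(label, round(bonferroni(p, 7), 4))
```

With seven tests the recessive model P-value of 0.0076 becomes approximately 0.053, just above the conventional 0.05 threshold, while the additive model P-value of 0.040 becomes 0.28.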

Finally, we attempted to replicate two of the top hits from the only published GWAS on NP. These SNPs were reported to be suggestively associated with diabetic neuropathy: rs17428041 (GFRA2, OR=0.67, P=1.77 × 10−7)24 and rs71647933 (ZSCAN20, OR=2.31, P=4.88 × 10−7).25 The effect of rs17428041 was not replicated in the results of our GWAS: OR=1.47, P=0.016. Similarly, after using a proxy for rs71647933 (rs12565140, r2=0.947) we found no association with NP in the results of our GWAS: OR=0.71 (95% CI 0.46–1.09, P=0.12).

Discussion

We report a suggestive association between a variant in the PRKCA gene and NP in people with knee pain, knee or hip OA and post TJR. The findings are biologically plausible and supported by previously published work in the literature. We were unable to confirm the recently published association between SNPs in the GFRA2 and ZSCAN20 genes and diabetic neuropathy.24, 25 However, it should be noted that diabetic neuropathy is not necessarily the same phenotype as neuropathic pain-like joint symptoms. The definition of NP used in these studies was partly based on use of prescription analgesic medication and partly on the results of sensory testing. However, this type of medication is commonly used even by people with no NP, including people post TJR with no NP. In our study, a validated screening questionnaire (painDETECT) was used, the location of pain is exclusively that of the OA-affected joint and further clinical history and demographics have been collected for all participants.

The top hit from our analysis maps to the PRKCA gene. This gene codes for protein kinase C alpha, a protein that has been linked with the nervous system and may contribute to central sensitisation in dorsal horn neurons.36 The PRKCA gene has also been found in the literature to be involved in long-term potentiation (LTP), a process involved in both memory and chronic pain.37 In addition, the PRKCA gene has been implicated in related processes such as memory capacity and post-traumatic stress disorder (PTSD)38 and genetic variation in this gene has been linked to the neural basis of episodic memory.39 Although we do not reach the P<5 × 10−8 threshold for genome-wide significance (GWS), we show a plausible effect on NP post TJR.40 A role for the PRKCA gene in pain has been previously reported.41 The rs887797 variant identified in this paper is a variant already associated with multiple sclerosis.42 Therefore, although this association does not reach GWS, it remains biologically plausible; further work is needed to confirm its role in NP.

In the present GWAS, the intergenic rs12596162 SNP near the FOXL1 gene was associated with NP: OR=2.05 (95% CI 1.51–2.79), P=3.53 × 10−6. This gene codes for a forkhead/winged helix-box transcription factor.43 This gene and the rest of the FOX gene family are involved in many cellular processes.43 FOXL1 in particular was found in one study to be involved in the Wnt/β-catenin pathway44 that is important in the nervous system and has been implicated in NP and hip OA.45, 46

Thyroglobulin, encoded by the TG gene, is a protein necessary for normal thyroid function that has previously been related to NP and central sensitisation in the literature.47 The rs1133076 SNP mapping to this gene was suggested in this analysis to be associated with possible NP at the discovery stage with P=3.41 × 10−4. However this variant did not replicate in the additional cohorts and the evidence for association with NP for this gene is very weak.

The effect sizes we report here are larger than those reported in previous GWAS on pain traits such as migraine and CWP (OR=1.18 and OR=1.23, respectively).27, 28 The effect size for the GWAS on NP in diabetes was 2.31 for the SNP with the lowest P-value, which is consistent with our finding for the rs887797 SNP in the GWAS analysis (OR=2.00, see Table 1).

There are a number of limitations to this study. None of the variants identified by this study reaches GWS. This is not surprising given the small discovery and replication sample sizes available for this kind of study. A major issue with the use of GWAS is the potential for inflated associations.48 The statistical power for the rs887797 recessive model with the observed OR=2.41 was 56% for GWS. For the observed P-value, the statistical power was 66% given the observed minor allele frequency and the rare homozygote frequency (which is in HWE). Although the study was underpowered for GWS, the effect size is relatively large. To achieve 80% power with this effect size and the same proportion of cases to controls we would have needed 417 cases and 1 767 controls, a 25% larger sample size, assuming that in the additional sample the effect was the same.48 Due to the ‘winner’s curse’49 the effect size reported here is likely to be an overestimate given the small sample size used for the discovery phase, and sample sizes of at least twice those that were used are likely to be needed. Furthermore, heterogeneity between the groups used in the meta-analysis can limit the effects seen in the results though we attempted to address this by the use of a Han Eskin Random Effects analysis.34

The absence of a clinical NP diagnosis in these participants is another limitation of this study. However, the results of this questionnaire have been shown to correlate with brain activity in areas associated with NP in people with NP and OA.50

In summary, we report a biologically plausible genetic effect on possible NP in individuals with knee pain, OA and post TJR. Replication in further cohorts could improve sample size and P-values, and it is hoped that this GWAS of neuropathic pain-like symptoms of the joint may encourage the collection of DNA and of painDETECT and similar instruments in other cohorts.