Background
Huntington’s disease (HD) is a devastating disease that is inherited in an autosomal dominant manner. The genetic cause of the disease is a CAG repeat expansion in the coding region of the huntingtin gene (
HTT). This is translated to an expanded stretch of glutamine amino acids in the huntingtin protein (HTT) and this mutant protein is the main cause of neuropathology in HD. While extensive research has been done since 1993, when the genetic cause of the disease was discovered [
1], there is still no cure for this disease nor an effective treatment.
Clinical and imaging biomarkers have been developed that measure the disease state and progression [
2]. Nevertheless, these biomarkers can be expensive and often cannot monitor changes before onset of clinical symptoms. To develop an intervention that starts before disease onset it is important to have biomarkers that can accurately measure changes between controls and HD patients before symptoms first arise. To date, promising clinical trials targeting the mutant protein are under development and robust as well as reliable biomarkers are essential to advance these novel therapeutic strategies into the clinic.
Although the main pathology of HD is found in the brain, human brain tissue cannot be used to measure molecular biomarkers to monitor disease state and progression in living patients. However, due to the ubiquitous expression of the mutant protein, the HD phenotype is not limited to the brain. Symptoms such as weight loss, skeletal muscle wasting and cardiac failure point to a complex pathology that involves many tissues [
3,
4]. This opens the opportunity to investigate HD related pathology in more accessible tissues that can be obtained in a non-invasive manner. Transcriptional dysregulation is a prominent feature of the disease. Expression profiling studies in brain have shown that in the caudate nucleus 21 % (9763) of the probe sets demonstrated significant differential expression [
5]. Investigating gene expression changes in peripheral tissue can provide new insights that can lead to the development of new therapies and biomarkers to monitor disease progression.
Using post mortem brain tissue can however introduce biases when studying disease mechanisms due to non-disease specific effects of post mortem interval and specific agonal conditions such as coma, hypoxia and seizures [
6]. Several studies have used microarray technology to analyze blood and study HD pathology. However, HD-specific gene expression changes are less pronounced in blood and it has proven difficult to validate them across studies [
7,
8]. For example, Borovecki et al. analyzed global changes in mRNA expression in the blood samples of HD patients, compared with controls and identified a set of 12 genes that were able to clearly distinguish controls and patients with HD [
9]. Although this work was highly promising, the results have so far proven difficult to replicate.
More promising biomarkers emerged with the advances in next-generation sequencing. Mastrokolias et al. identified an HD signature that included 40 genes that were previously reported in at least one HD gene expression study with the same direction in expression change [
10].
However, Cai et al. [
11] showed that little preservation occurs in mean expression levels between the brain and blood. It is however possible that signals are preserved at levels beyond gene expression. For instance, Chuang et al. pointed out that subnetwork markers in a protein-network-based approach were significantly more reproducible than individual gene markers in two different cancer cohorts [
12].
The most robust HD disease signature based on transcriptomics data to be used for drug development and disease progression biomarkers should be present in both blood and brain [
13,
14]. Because the blood signature is derived from non-neuronal tissue and the brain signature is masked by non-HD related processes such as hypoxia, the shared signature is likely the most informative from a mechanistic and therapeutic point of view. We used a systems biology approach that combines Weighted Gene Co-expression Network Analysis (WGCNA) [
15,
16] and literature mining technology [
17,
18] to assess the similarity between brain and blood tissue using previously published gene expression studies in brain and blood [
5,
10]. We prioritized signatures that were shared between blood and brain tissue at a systems level, based on mechanisms that involve multiple genes and proteins. In general, genes that are part of the same mechanism exhibit similar expression changes. At the mechanistic level we can compare signature signals from post-mortem HD brain tissue and blood to provide novel biomarkers that can be measured in blood to monitor the brain pathology in living patients. Such an approach offers many advantages and can also be useful for other neurodegenerative disorders. Apart from being non-invasive, blood sampling is also cost-effective and widely available. This can lead to the development of more standardized tests and offer more robust measurements.
Discussion
In this paper we used WGCNA and literature information to identify modules in blood that are associated with the HD phenotype and to identify disease signatures that are shared between blood and brain. To our knowledge, this is the first time that the similarity between blood and brain tissue was successfully assessed based on a combination of WGCNA and literature information. WGCNA was used in order to group genes of the same tissue that are co-expressed (modules), while literature information (CPA) was used to annotate and evaluate the similarity between modules from different tissues at a functional level.
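The idea of comparing modules from different tissues by their annotations rather than by their member genes can be illustrated with a minimal sketch. It assumes each module has already been reduced to a set of annotation terms and uses the Jaccard index as a stand-in similarity measure; this is not the CPA-based scoring used in the study, and all module names and terms below are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two annotation sets (0.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_module_pairs(blood_modules, brain_modules):
    """Score every blood/brain module pair by annotation overlap, best first."""
    pairs = [
        (blood_name, brain_name, jaccard(blood_terms, brain_terms))
        for blood_name, blood_terms in blood_modules.items()
        for brain_name, brain_terms in brain_modules.items()
    ]
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical modules, each described only by its annotation terms.
blood = {
    "blood_M1": {"immune response", "inflammatory response", "lipid transport"},
    "blood_M2": {"RNA splicing", "spliceosome"},
}
brain = {
    "brain_M1": {"immune response", "inflammatory response", "cell death"},
    "brain_M2": {"synaptic transmission"},
}

ranked = rank_module_pairs(blood, brain)
for blood_name, brain_name, score in ranked:
    print(f"{blood_name} vs {brain_name}: {score:.2f}")
```

Because the score depends only on annotations, two modules can rank highly even when they share no genes, which is the property the functional comparison relies on.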
In summary, we identified 8 HD-specific modules in blood and two distinct signatures that are shared between blood and brain. The HD-specific modules in blood were associated with immune response, sphingolipid biosynthetic process, lipid transport, cell cycle, protein modification, spliceosome, RNA splicing, vesicle transport, cell signaling and synaptic transmission. This analysis points to mechanisms that are affected in HD. Some were already known to be implicated in the brain pathology, but their role in blood has not been elucidated yet [
31,
32,
42,
43].
The scarcity of HD brain tissue has driven research to use blood to identify biomarkers that can be used to study disease state and disease progression that are most clearly observed in the brain. In previous studies the similarity between blood and brain was assessed based on the conservation of gene expression patterns only [
8‐
10]. Such assessments are usually very difficult, because blood and brain are two inherently different tissues composed of very different cell types. Nevertheless, in our study, we discovered signatures that are based on a functional similarity between blood and brain. We argue that the same function may partially be executed by different gene products in different cell types, also considering that our current knowledge may have different gaps across cell types. At the functional level the active units are not merely the genes, but cells and organs. Cells of different types that play a role in the execution of a biological function express different genes that are thus associated with that function. It is therefore a fair assumption that when comparing between cell types, we should look beyond the level of individual genes. For instance, microglia and macrophages both participate actively in the immune response but different sets of genes are expressed in microglia or macrophages [
44]. Another example where two different gene products carry out the same function is hemoglobin and myoglobin. Both genes are associated with transport of oxygen, but one is expressed in red blood cells and the other in muscle [
45‐
47]. Therefore, signatures based on co-expression and functional annotation are more likely to represent disease-specific mechanisms. We speculate that a common signature at the functional level is more robust, which makes it attractive to monitor disease progression or the efficacy of a particular treatment. The blood-brain signature allows us to focus on a specific part of the blood signal to monitor the HD-affected brain.
Our findings suggest that mechanisms associated with the inflammatory response and spinocerebellar ataxia (SCA) are shared between blood and brain in HD (Fig.
4). The inflammatory response may be an important component of HD pathology that contributes to the neuropathological damage. This finding supports previous detection of abnormal activation of immune response in HD patients [
31]. In addition, this signature links the well-established neuroinflammation signature in brain to a parallel inflammatory response in blood, triggered upon expression of the mutant huntingtin. The same signature was also identified by Horvath and colleagues in their study of the preservation of brain modules in two large blood cohorts of healthy individuals [
11]. Although they were unable to identify full module preservation, they found that a subset of the genes that was preserved was functionally enriched in, among others, infectious disease and infection mechanisms. Both analyses point to an immune response mechanism as a shared channel between blood and brain. Although the mechanisms preserved in those datasets came only from healthy individuals, we conclude that this preserved signal is also specific for HD.
In addition, we showed that blood exhibits similarities with brain based on different criteria (Fig.
3). These criteria reflect similarities on a functional level i.e. biological processes, cellular component, molecular function, and functions associated with the same disease or syndrome. The disease or syndrome annotation led us to the identification of the SCA signature (Fig.
4
b). The association of blood modules with brain disorders is by itself an interesting topic for further research. A signature based on commonality in disease or syndrome annotations would have been difficult to identify by approaches that only focus on gene expression or traditional annotation schemes (e.g. GO-based annotation). Genes that were part of this signature on the brain side were
TCF4, ATN1, PPP2R2B, ATXN10 and
ATXN3, which are associated with neurodegenerative and developmental disorders such as Pitt-Hopkins Syndrome [
48], Dentatorubral pallidoluysian atrophy [
49], SCA12 [
50], SCA10 [
51] and SCA3 [
52]. On the blood side of this signature were among others the
PPP2R2B gene associated with SCA12,
CAPNS2, which was recently found to play a beneficial role against polyglutamine toxicity [
53], and
SETBP1 which is associated with the Schinzel-Giedion syndrome [
54,
55].
The comparison of the results of our methodology with those obtained by the preservation statistics from WGCNA, and those from assessing the gene overlap between blood and brain, shows that the three methodologies are complementary. The overlap between the three methods was very small (Fig.
5), indicating that they identify similarities based on different criteria. In fact, the signature that was identified by all methods was not associated with the HD phenotype, but with sexual differentiation. The modules in this signature are strongly co-expressed and composed mainly of genes expressed on the X and Y chromosomes. The identification of this signature also served as a control for testing the validity of each method. Depending on the hypothesis that drives one’s analysis, one or a combination of these methods could be used. Our method has the advantage that it identifies similarity solely at a functional level, i.e. without bias in terms of overlapping genes or similarity in expression patterns between the two tissues.
Additional disease relevant signatures
In addition to the aforementioned signatures that were selected by the strictest criteria, we identified two additional module pairs based on less stringent criteria that we considered interesting for HD. The first criterion that we used in this analysis was that a module pair needed to achieve a significance level of 10 % of FWER. As explained in the
Methods section, we define a “gray” significance level of up to 50 %, based on the significance levels that we observed for the sexual differentiation modules that served as internal control. The second criterion was that each module in particular needed to be significantly correlated (P value < 0.05) with at least one of the disease phenotypes. Detailed information about these common signatures can be found in Additional files
9 and
10.
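The two selection criteria described above can be expressed as a simple filter. The sketch below is only an illustration under stated assumptions: the record layout, field names, and example values are hypothetical, and the cutoffs are passed in explicitly rather than fixed, since the text mentions both a 10 % level and a “gray” level of up to 50 %.

```python
from dataclasses import dataclass


@dataclass
class ModulePair:
    name: str
    fwer: float           # FWER level at which the pair is significant (hypothetical field)
    blood_trait_p: float  # smallest module-phenotype P value of the blood module
    brain_trait_p: float  # smallest module-phenotype P value of the brain module


def passes_criteria(pair, fwer_cutoff, p_cutoff=0.05):
    """Criterion 1: pair significant at the chosen FWER level.
    Criterion 2: each module correlated with at least one phenotype."""
    return (
        pair.fwer <= fwer_cutoff
        and pair.blood_trait_p < p_cutoff
        and pair.brain_trait_p < p_cutoff
    )


# Made-up example values, for illustration only.
pairs = [
    ModulePair("pair_A", fwer=0.08, blood_trait_p=0.01, brain_trait_p=0.03),
    ModulePair("pair_B", fwer=0.40, blood_trait_p=0.01, brain_trait_p=0.03),
    ModulePair("pair_C", fwer=0.05, blood_trait_p=0.20, brain_trait_p=0.01),
]

strict = [p.name for p in pairs if passes_criteria(p, fwer_cutoff=0.10)]
gray = [p.name for p in pairs if passes_criteria(p, fwer_cutoff=0.50)]
print(strict, gray)
```

Keeping the FWER cutoff as an explicit argument makes it easy to see which pairs are admitted only under the relaxed threshold.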
The synapse signature was shared between blood and the caudate nucleus based on the cellular component annotation category (Additional file
2A). The caudate nucleus module (indicated with a dashed circle) was marginally associated with the disease staging phenotype (P value = 0.074; Additional file
1) while the blood module was associated with motor and TFC score. Common annotations in this signature were associated with synapse activity and dendrites. Despite the absence of cells with synapses in the blood, the annotation of the blood module with concepts like “synapse” hints at the presence of gene products with a function in the neuronal synapse and an alternative function in the blood, which may be used as a surrogate marker for the effect of disease on neuronal transmission.
The genes that contribute the most to the synaptic activity annotation are
NOS1, HAP1, GRIK2 and HTR6. The literature indicates that at least three of them are directly implicated in HD [
36,
56,
57]. In fact, HAP1 was one of the first proteins that were described to interact with huntingtin [
36]. The genes encode ubiquitously expressed proteins with a known function in brain, but their role in blood remains elusive.
GRIK2 was proposed as a candidate biomarker in Chronic Fatigue Syndrome (CFS) after it was detected in the peripheral blood of CFS patients as differentially expressed [
58].
NOS1 has also been reported to play a role in regulating blood pressure [
59].
In addition, 7 out of 18 members of this module (MTRNR2L1, LAPTM5, QRICH1, MOAP1, AKR1B1, NOS1, ZNF260) were found to be indirectly associated with huntingtin through the interaction with the UBC protein that is known to interact with huntingtin [
60‐
65].
Finally, cell-cell signaling (NOS1, HAP1, GRIK2, HTR6), ion transport (HAP1, UNC80, KCNAB1), cell death, and apoptosis (LAPTM5, MTRNR2L1, MOAP1, QRICH1) were secondary annotations associated with this blood module. The majority of the genes in this module encode ubiquitously expressed proteins that are likely to have a catalytic role in the HD blood, similar to their effect in brain. The synaptic signature can potentially be of great value for monitoring synaptic activity in brain by monitoring these genes in HD blood.
The vesicle trafficking and protein transport signature was shared between blood and BA4 based on the cellular component annotation category (Additional file
2B). In this signature, the blood module was marginally associated with the CAG repeat phenotype and also this module pair was identified with a significance level of 50 % of FWER. The annotations that were shared in this signature were related to endosomes, trans-golgi network and clathrins. This signature was also identified by Horvath et al. as a preserved mechanism between blood and brain in healthy individuals [
11].
Both the synaptic activity and vesicle transport signatures have long been implicated in HD. Huntingtin is expressed in the cytoplasm where it directly interacts with a number of proteins involved in synaptic activity and vesicle transport [
66]. In addition, huntingtin has been previously described as a protein that acts as a mediator in information trafficking between different cell compartments by interacting with other proteins [
67]. Recent evidence suggests that synapse loss and other features of the disease that involve the CNS can be treated by targeting organs outside the CNS [
68]. The blood modules involved in these signatures that link the blood with the brain pathology could become the subject of further research to confirm whether symptoms of the disease can be treated by targeting factors that associate with these modules [
68].
Although the blood-brain signatures that we identified are promising, there are certain limitations in our methodology that can be improved in follow-up studies. Considering that similarity was assessed by overlap in annotations, future studies can extend the power of the method by using the hierarchy of an ontology to assign a score to annotations that are subclasses of the same function. Furthermore, the results from the computational analysis can be corroborated by further validation on additional data and by new experiments in the laboratory. The genetic predictability of HD allows for testing those signatures in carriers of the gene mutation in both mice and humans, even before the first symptoms arise. We are currently analyzing data from human blood samples that were collected from the same subjects 4 years later, to determine whether these signatures have changed over time and whether they correlate with the progression of the disease. Testing these signatures in mouse models of HD to follow the efficacy of novel disease treatment strategies would also be beneficial for using and optimizing blood as a diagnostic and monitoring tool.
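The proposed extension, in which annotations that are subclasses of the same function contribute to the similarity score, can be sketched as follows. This is one possible scheme, not a method from the study: each annotation set is expanded with its ancestor terms before the overlap is computed. The small hierarchy below is a simplified toy structure invented for illustration and does not reproduce the actual ontology graph.

```python
def with_ancestors(terms, parents):
    """Return the terms together with all of their ancestors in the hierarchy."""
    expanded = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in expanded:
            expanded.add(term)
            stack.extend(parents.get(term, ()))
    return expanded


def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hierarchical_similarity(terms_a, terms_b, parents):
    """Overlap of annotation sets after adding ancestor terms to both."""
    return jaccard(with_ancestors(terms_a, parents), with_ancestors(terms_b, parents))


# Toy child -> parents mapping (hypothetical, for illustration only).
toy_parents = {
    "postsynaptic membrane": ["synapse"],
    "presynaptic membrane": ["synapse"],
    "synapse": ["cell junction"],
}

blood_terms = {"postsynaptic membrane"}
brain_terms = {"presynaptic membrane"}

exact = jaccard(blood_terms, brain_terms)
hierarchical = hierarchical_similarity(blood_terms, brain_terms, toy_parents)
print(exact, hierarchical)
```

In this toy case the two modules share no exact annotation, yet the expanded sets overlap through their common parent terms, so related but non-identical annotations are no longer scored as unrelated.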
Acknowledgments
We gratefully acknowledge funding from the European Commission (FP-7 project RD-Connect, grant agreement No. 305444), the European Community’s Seventh Framework Programme (FP7/2007-2013) [grant no. 2012-305121] ‘Integrated European -omics research project for diagnosis and therapy in rare neuromuscular and neurodegenerative diseases (NEUROMICS)’ and the IMI-JU project Open PHACTS (grant agreement No. 11519). We would also like to thank Frederic Parmentier and François-Xavier Lejeune from the INSERM institute in Paris for the fruitful discussions and the Biosemantics group at Leiden University Medical Center in the Netherlands for their technical expertise. This work was carried out on the Dutch national e-infrastructure with the support of SURF Cooperative.