Background
TMEM106B is a lysosomal protein and belongs to the TMEM106 family of proteins with relatively unknown function [
1,
2]. In 2010, a common
TMEM106B genetic variant, rs1990622 (T>C), was first found to be associated with frontotemporal dementia (FTD) risk (OR = 1.64 for the T allele, allele frequency = 0.679, and
P = 1.08E−11) [
3]. Since the identification of rs1990622 as an FTD risk variant, several kinds of studies have been conducted to understand the role of this non-coding variant, which is located 6.9 kb downstream (3′) of
TMEM106B [
4‐
6].
Li and colleagues conducted a cell type quantitative trait loci (cQTL) analysis using 2008 brain samples derived from 1536 unique individuals including 640 AD, 488 cognitively healthy controls, 11 FTD, 75 progressive supranuclear palsy, 28 pathological aging, 189 schizophrenia, 30 bipolar disorders, and 75 individuals with other unknown dementia or no diagnosis information [
4]. Interestingly, Li and colleagues identified
TMEM106B variant rs1990621 C allele, which is in high linkage disequilibrium with the rs1990622 T allele, to be significantly associated with a reduced neuronal proportion [
4].
More recently, Yang and colleagues conducted a module quantitative trait loci (modQTL) analysis to identify genetic variants regulating the average expression of the genes found in the gene co-expression modules [
5]. Interestingly, they found that the rs1990622 variant showed a significant modQTL effect, and highlighted
TMEM106B as a key regulator of the aging human brain transcriptome [
5]. Meanwhile, Yang and colleagues identified that myelination and lysosomal genes regulated by
TMEM106B could connect amyloid-β (Aβ) and TAR DNA-binding protein 43 kDa (TDP-43) [
5]. It is known that increased Aβ is a key Alzheimer’s disease (AD) neuropathology. Hence, Yang and colleagues provided important findings about the key pathogenic link between AD and TDP-43 proteinopathy [
5]. However, Yang and colleagues did not directly evaluate the association between the rs1990622 variant and AD risk. To date, it remains unclear whether the rs1990622 variant is associated with AD risk, although one previous study reported no significant association between the rs1990622 variant and AD risk [
7]. We think that this null result may be due to an inadequate sample size (300 AD cases and 137 controls) [
7], and large-scale samples are needed.
Meanwhile, Yang and colleagues conducted an expression quantitative trait loci (eQTLs) analysis of rs1990622 variant using 494 human prefrontal cortex samples from the Religious Orders Study and Memory and Aging Project (ROSMAP) [
8]. They found that rs1990622 variant T allele could significantly increase
TMEM106B expression (β = 0.067, and
P = 5.90E−05) [
5]. However, gene expression analysis did not support the increased
TMEM106B expression in human brain tissues. Satoh et al. evaluated the expression levels of TMEM106B in frontal cortex and hippocampus tissues from AD patients and controls [
9]. They selected 6 sporadic AD patients and 13 controls including 4 normal subjects without neurological disease, 3 patients with sporadic Parkinson’s disease, 4 patients with sporadic amyotrophic lateral sclerosis, and 2 patients with sporadic multiple system atrophy [
9]. They demonstrated that both the mRNA and protein levels of TMEM106B were significantly reduced in AD brains compared with control brains [
9].
In their discussion, Yang and colleagues concluded that pre-existing neurodegenerative proteinopathies were not necessary to drive the association between the rs1990622 variant and
TMEM106B transcriptome dysregulation [
5]. However, recent findings from other
TMEM106B variants did not support this conclusion. Ren and colleagues conducted a stratification analysis and highlighted more pronounced effects of
TMEM106B rs3173615 variant on the transcriptome in neurodegenerative diseases than in healthy controls [
6]. Li and colleagues conducted a stratification analysis and found that
TMEM106B rs1990621 variant could regulate the neuronal proportion in AD cases, other neurodegenerative diseases, and elderly cognitively healthy controls, but not in young controls [
4]. All these findings indicated that the link between
TMEM106B haplotype and transcriptome dysregulation is context dependent [
4,
6]. Importantly, rs1990622 variant is in high linkage disequilibrium with rs3173615 (
r2 = 0.98 and
D′ = 1) and rs1990621 (
r2 = 0.99 and
D′ = 1). Hence, we consider that the association between rs1990622 variant and
TMEM106B transcriptome dysregulation may also be context dependent.
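To make the two linkage disequilibrium measures quoted above concrete, the following minimal sketch computes D′ and r2 for two biallelic loci from haplotype and allele frequencies. The function name `ld_stats` and the input frequencies are hypothetical and chosen only for illustration; they are not taken from the datasets analyzed here.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D_prime, r_squared) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    # Raw disequilibrium coefficient D
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies and the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d_prime, r_squared


# Hypothetical frequencies: allele B is only ever observed together with allele A,
# but the two alleles differ slightly in frequency.
d_prime, r_squared = ld_stats(p_ab=0.595, p_a=0.60, p_b=0.595)
print(round(d_prime, 3), round(r_squared, 3))
```

With these illustrative inputs D′ equals 1 while r2 is about 0.98, which shows how D′ = 1 and an r2 slightly below 1 can coexist when two variants have nearly but not exactly equal allele frequencies.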
To date, large-scale AD genome-wide association study (GWAS) datasets and large-scale eQTLs datasets from both neuropathologically normal individuals and individuals with neurological disease have become available and make it possible to address these questions [
10,
11]. Here, we conducted comprehensive analyses using publicly available datasets. In stage 1, we conducted a genetic association analysis to investigate the effect of rs1990622 variant on AD risk using multiple large-scale GWAS datasets. In stage 2, we performed a gene expression analysis of
TMEM106B in 49 different human tissues. In stage 3, we performed an eQTLs analysis to evaluate the effect of rs1990622 variant on
TMEM106B expression in multiple human brain tissues with different disease statuses. In stage 4, we performed a colocalization analysis to provide evidence of the AD GWAS and eQTLs pair influencing both AD and the
TMEM106B expression at a particular region.
Discussion
It has been well established that the
TMEM106B rs1990622 variant is an FTD risk factor [
30,
31]. More recently, growing evidence has highlighted the role of
TMEM106B in other neurological processes including hippocampal sclerosis of aging [
32], neuronal loss [
31], cognitive deficits [
31], better residual cognition [
30], AD [
9,
33], Parkinson’s disease [
34], and amyotrophic lateral sclerosis [
34]. Importantly, Yang and colleagues found that the
TMEM106B rs1990622 variant showed a significant modQTL effect, and highlighted the converging effects of
APOE-Aβ and
TMEM106B [
5]. However, the role of the rs1990622 variant in AD remains largely unclear. Here, we conducted comprehensive analyses including a genetic association study, gene expression analysis, eQTLs analysis, and colocalization analysis.
Using the genetic association analysis, we evaluated the association of rs1990622 variant with AD using two independent large-scale GWAS datasets from IGAP and UK Biobank, and then conducted a meta-analysis [
10,
11]. Interestingly, the results were consistent across IGAP and UK Biobank: rs1990622 was significantly associated with AD risk in both datasets (Table
2). A sex-specific analysis in UK Biobank further indicated that the rs1990622 T allele contributed to increased AD risk only in females, not in males (Table
2). Tropea and colleagues have also evaluated the association of rs1990622 variant with AD using 300 AD cases and 137 neurologically normal control subjects [
7]. However, Tropea and colleagues did not identify any significant association of rs1990622 with AD [
7]. We think that this may be due to an inadequate sample size.
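As an illustration of how per-dataset association estimates for a single variant can be combined, the sketch below implements a fixed-effect, inverse-variance-weighted meta-analysis. This is an assumption made for illustration: the text above does not state which meta-analysis model was used, and the function name `fixed_effect_meta` and the input values are hypothetical rather than the IGAP or UK Biobank estimates.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis.

    betas: per-study effect sizes (e.g., log odds ratios for the same allele)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    # Each study is weighted by the inverse of its variance
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided P value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical log odds ratios and standard errors from two studies
beta, se, z, p = fixed_effect_meta([0.05, 0.03], [0.015, 0.020])
print(beta, se, z, p)
```

Because weights scale with the inverse variance, a large dataset dominates the pooled estimate, which is also why a study of a few hundred samples may lack power to detect a modest effect.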
It is known that rs1990622 is a non-coding variant. Hence, it remains unclear how it affects AD risk. eQTLs analysis is an important method for evaluating the roles of non-coding genetic variants, especially with resources such as the GTEx project, which established a data resource and tissue bank to study the relationship between genetic variation and gene expression in multiple human tissues [
35]. To explore whether the rs1990622 variant affects AD risk by regulating
TMEM106B expression, eQTLs analysis should be conducted in neuropathologically normal individuals or in a general population, based on three considerations. First, eQTLs analysis in AD patients alone cannot be interpreted in terms of AD risk or susceptibility, because healthy controls or individuals from the general population are lacking [
36]. Second, it is well known that disease status can change the expression of a specific gene. Hence, gene expression analysis often identifies dysregulated genes in cases compared with controls [
37]. Take
TREM2 for example: its expression is upregulated in multiple pathological conditions such as Parkinson’s disease, amyotrophic lateral sclerosis, stroke, traumatic brain injury, and AD, compared with normal controls [
37]. To date, most eQTLs studies focusing on genetic variants associated with neurological diseases have been conducted in neuropathologically normal individuals, for example for AD (UKBEC [
38], GSE15745 [
38], and 128 normal subjects [
39]), progressive supranuclear palsy (387 normal subjects) [
40], schizophrenia (UKBEC [
41], GTEx [
41], 128 normal subjects [
39], and 120 normal subjects [
42]), Parkinson’s disease (128 normal subjects [
39], GTEx [
43]), and bipolar disorder (120 normal subjects [
42], GTEx [
44], and UKBEC [
44]). Meanwhile, other eQTLs studies that included both AD cases and controls have also been reported, with adjustment for disease status and other critical covariates [
45,
46]. Third, Nicholson and colleagues have reviewed recent findings and found that the significant association between rs1990622 and
TMEM106B mRNA expression identified in lymphoblast cells could not be successfully replicated in postmortem brain tissues [
2]. It is possible that the variable levels of neuronal loss and cell type composition may have masked the association between rs1990622 and
TMEM106B mRNA expression [
2]. Hence, Nicholson and Rademakers suggested that eQTLs studies might be best conducted in non-diseased tissues [
2].
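The basic additive eQTLs model discussed above, including adjustment for disease status, can be sketched as a linear regression of expression on allele dosage plus covariates. This is a minimal illustration under stated assumptions, not the pipeline used by GTEx, UKBEC, Mayo, or ROSMAP: the helper names `ols` and `eqtl_effect` and the toy data are hypothetical, and the sketch returns only the point estimate without standard errors or P values.

```python
def ols(X, y):
    """Least-squares coefficients for y ~ X via the normal equations.

    X is a list of rows and must already contain an intercept column.
    """
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def eqtl_effect(dosage, expression, covariates):
    """Effect of one extra copy of the tested allele on expression,
    adjusted for the supplied covariates (e.g., disease status, age, sex)."""
    X = [
        [1.0, float(g)] + [float(c) for c in cov]
        for g, cov in zip(dosage, covariates)
    ]
    return ols(X, expression)[1]


# Toy data: allele dosage (0, 1, 2) and one covariate (disease status 0/1)
dosage = [0, 1, 2, 0, 1, 2]
status = [[0], [0], [0], [1], [1], [1]]
expression = [2.0, 2.5, 3.0, 1.0, 1.5, 2.0]
print(eqtl_effect(dosage, expression, status))
```

In the toy data, expression is lower in cases at every genotype; including disease status as a covariate keeps that shift from being attributed to the variant when case proportions differ across genotype groups.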
Considering the above findings, we then performed an eQTLs analysis to evaluate the effect of the rs1990622 variant on
TMEM106B expression in multiple human brain tissues from neuropathologically normal individuals (UKBEC and GTEx), and compared the results with findings from individuals with neurological disease (Mayo and ROSMAP). In UKBEC, we found no significant association between the rs1990622 variant and
TMEM106B expression in 10 brain regions, as provided in Table
3. In GTEx, we found that the rs1990622 T allele was significantly associated with reduced
TMEM106B expression in cerebellum (
P = 1.90E−06), cortex (
P = 2.20E−05), and cerebellar hemisphere (
P = 1.50E−03). In Mayo, we found no significant association of rs1990622 with
TMEM106B expression in cerebellum and temporal cortex. In ROSMAP, Yang and colleagues found that the rs1990622 T allele could increase
TMEM106B expression in human prefrontal cortex [
5]. In summary, the rs1990622 variant showed different associations with
TMEM106B expression in neuropathologically normal individuals (UKBEC and GTEx) and in individuals with neurological disease (Mayo and ROSMAP). Differences were observed even among the neuropathologically normal datasets (UKBEC versus GTEx) and among the neurological disease datasets (Mayo versus ROSMAP).
We consider that four factors may help explain these differences. First, disease status may have caused these differences. Satoh and colleagues found that both the mRNA and protein levels of TMEM106B were significantly reduced in AD brains compared with control brains [
9]. Hence, the different expression of
TMEM106B may further cause different eQTLs findings in AD and controls. Importantly, eQTLs or cQTL analysis using other
TMEM106B variants including rs3173615 and rs1990621 further supported our findings [
4,
6]. Meanwhile, our studies and those of others have indicated that eQTLs can vary considerably across disease statuses [
16‐
19,
47‐
52]. Second, the mean ages at death in different eQTLs datasets may have driven these differences. The mean ages at death were 55 (UKBEC), 59 (GTEx), 72 or 74 (Mayo), and 88 (ROSMAP), respectively, as provided in Table
1. Nicholson and colleagues explained that the variable levels of neuronal loss and cell type composition may have masked the association between rs1990622 and
TMEM106B mRNA expression in the older population [
2]. Third, the percentages of females in the different eQTLs datasets may have driven these differences. The percentages of females were 26% (UKBEC), 33% (GTEx), 36–53% (Mayo), and 62% (ROSMAP), respectively, as provided in Table
1. This explanation is supported by our genetic association finding that the rs1990622 T allele contributed to increased AD risk only in females, not in males (Table
2). Importantly, recent findings from GTEx also highlighted the impact of sex on gene expression across human tissues [
53]. Fourth, the different ancestries of the selected donors may also have contributed to these differences. The donors in UKBEC, Mayo, and ROSMAP were of European descent. However, about 85.3% of GTEx donors were of European descent, and the other 14.7% were of African, Asian, and Hispanic or Latino descent, as provided in Table
1.
Using the gene expression analysis, we showed that TMEM106B had high expression in cerebellar hemisphere, tibial nerve, cerebellum, and spinal cord, but low expression in the other 10 human brain tissues, including frontal cortex, hypothalamus, nucleus accumbens, caudate, substantia nigra, anterior cingulate cortex, cortex, hippocampus, amygdala, and putamen. Hence, these findings may explain the significant eQTLs results in cerebellum. Importantly, the colocalization analysis further provided suggestive evidence that AD risk and TMEM106B expression in cerebellum share the same variant.
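For readers unfamiliar with colocalization, the sketch below shows one widely used formulation: per-variant approximate Bayes factors (Wakefield) combined into posterior probabilities for five hypotheses, from H0 (no association with either trait) to H4 (both traits share one causal variant). It is an assumption that this formulation matches the analysis reported here, since the specific method is not named in this section. The function names, prior probabilities (1e-4, 1e-4, 1e-5), prior standard deviations, and input values are illustrative, and the sketch assumes at most one causal variant per trait in the region.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_diff_exp(a, b):
    """log(exp(a) - exp(b)); returns -inf when the difference is not positive."""
    x = math.exp(b - a)
    return a + math.log1p(-x) if x < 1.0 else float("-inf")


def log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for association versus no association."""
    v, w = se * se, prior_sd * prior_sd
    r = w / (v + w)
    z = beta / se
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities [H0, H1, H2, H3, H4] for one region.

    labf1, labf2: per-variant log ABFs for trait 1 and trait 2, same variant order.
    """
    both = [a + b for a, b in zip(labf1, labf2)]
    s1, s2, s12 = log_sum_exp(labf1), log_sum_exp(labf2), log_sum_exp(both)
    log_h = [
        0.0,                                                      # H0: neither trait
        math.log(p1) + s1,                                        # H1: trait 1 only
        math.log(p2) + s2,                                        # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff_exp(s1 + s2, s12), # H3: two variants
        math.log(p12) + s12,                                      # H4: shared variant
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]


# Hypothetical region of three variants; both traits peak at the second variant
gwas = [log_abf(b, 0.02, 0.20) for b in (0.01, 0.12, 0.00)]
eqtl = [log_abf(b, 0.02, 0.15) for b in (0.00, 0.12, 0.01)]
print(coloc_posteriors(gwas, eqtl))
```

A high posterior for H4 supports a shared variant, whereas a high posterior for H3 indicates that the GWAS and eQTLs signals in the region are driven by different variants.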
Despite these findings, our study has some limitations. First, we only conducted a sex-stratified genetic association analysis using the UK Biobank GWAS summary datasets. The sex-stratified datasets in IGAP are not publicly available. Meanwhile, the original GWAS genotype datasets from IGAP and UK Biobank are either not publicly available or require a lengthy application process. Hence, we could not determine the interaction between sex and rs1990622 genotypes using the raw data. Second, our genetic association analysis identified a female-specific role of rs1990622 in AD risk, but female- or male-specific eQTLs datasets are not publicly available. Third, we performed the eQTLs analysis to investigate the role of the rs1990622 variant. The modQTL analysis may also be important, as done by Yang and colleagues. However, limited access to the original gene expression datasets prevented us from conducting a modQTL analysis. Hence, we will conduct additional sex-stratified analysis, female-specific eQTLs analysis, and modQTL analysis when all these datasets are publicly available.