Introduction
Alzheimer’s disease (AD) is a common neurodegenerative disorder that occurs in older adults. Clinically, AD is characterized by a decline in cognitive function including memory, language, and/or visuospatial abilities. The pathological hallmarks of AD are aggregates of amyloid-β and hyperphosphorylated tau, which form plaques and neurofibrillary tangles, respectively. In addition to factors contributing to accumulation of amyloid and tau, changes in immune function resulting in increased inflammation are thought to contribute to disease pathogenesis and progression.
Common variants like APOE ε4 are the best characterized genetic risk factors associated with AD. However, rare genetic variation, which occurs at <1 % minor allele frequency (MAF) in a given population, is becoming increasingly appreciated for its contribution to neurodegenerative disease. These infrequent variants often have more potent biological effects and can occur in genes encoding proteins intimately linked to underlying protein pathology. Rare variants that confer both risk for [1–5] and protection from [6, 7] different forms of neurodegeneration have been identified, but, due to their low MAF, most of these studies required very large cohorts to confirm the effect of these single variants on disease.
TREM2 is a widely studied gene known to harbor rare variation that can either cause or contribute to risk for distinct neurodegenerative diseases. Homozygous or compound heterozygous mutations in TREM2 are known to cause Nasu-Hakola disease (NHD) or an early-onset frontotemporal dementia (FTD)-like syndrome, while rare variation in TREM2 increases risk for AD, and may also increase risk for FTD, Parkinson’s disease, and amyotrophic lateral sclerosis [8–10]. In the brain, TREM2 is an innate immune system receptor expressed primarily on microglia [11]. It has been implicated in sensing damage signals, promoting microglial survival, and regulating central nervous system inflammation [12–14]. In particular, the R47H variant in TREM2 has been associated with AD risk in populations of European descent [4, 5], and is thought to alter microglial function [13, 15]. Recent evidence suggests that the R47H variant acts by altering TREM2’s ability to bind lipoproteins and apolipoproteins, which may ultimately prevent microglia from efficiently absorbing amyloid-β-lipoprotein complexes [16].
Assessment of mutation burden can alleviate requirements for large cohorts by accounting for the overall risk contribution of rare and even unique variation observed in the same gene but in different individuals. Gene-based analysis offers the unique advantage of weighing the combined effects of multiple variants (common and/or rare) into a single statistical measure of disease risk [17, 18]. Combining rare variants into a single analysis increases power to detect disease-associated risk in a gene using a relatively small cohort [17]. Furthermore, characterizing distinct rare variants occurring within the same functional domain of a particular protein may offer additional insight into shared pathogenic mechanisms. Of the available gene-based tests, the sequence kernel association test (SKAT) and its variants have proven reliable under multiple cohort sizes and have high mean power when compared to other tests [18–20].
In this study, we assessed deep sequencing data from over 150 genes previously linked to neurodegenerative, neuropsychiatric, and neurodevelopmental phenotypes for rare variant burden contributing to AD. We confirmed that mutation burden in TREM2 is robustly associated with AD risk in two independent cohorts. We then characterized biochemically a subset of rare TREM2 variants to test whether they alter cell surface expression as a means of assessing their functional significance. Our analysis showed that several of the rare variants identified in AD indeed significantly reduced overall expression as well as cell surface expression of TREM2, suggesting that these variants may reduce protein function and contribute to disease risk.
Materials and methods
Participants and clinical assessment
For the discovery genetic analysis, 115 males and 161 females were evaluated at the University of California, San Francisco Memory and Aging Center (UCSF MAC), and had genetic data available for analysis. All participants underwent clinical assessment with an in-person visit at the UCSF MAC that included a neurologic exam, cognitive assessment [21, 22] and medical history. Each participant’s study partner was also interviewed regarding functional abilities. A multidisciplinary team composed of a neurologist, neuropsychologist, and nurse then established clinical diagnoses for cases according to consensus criteria for AD and its subtypes [23, 24]. All healthy controls underwent a similar assessment, including study partner interview, and a consensus team of clinicians then established a clinical diagnosis of cognitively normal. Controls in this study had Mini-Mental State Exam (MMSE) [25] scores ≥26 or a Clinical Dementia Rating Scale (CDR) [26] of 0, no participant or informant report of cognitive concerns or decline in the prior year, and no evidence from clinical visit suggesting a neurodegenerative disorder (per team neurologist). Detailed demographic information is included in Table 1. Individuals harboring a known disease mutation or with a family history of neurodegeneration were excluded from the study.
Table 1
Study participant characteristics
Cohort | Characteristic | AD cases | Controls | p-value
Discovery (UCSF) | N | 31 | 245 |
Discovery (UCSF) | Age at Onset / First Visit | 77.8 ± 4.5 | 68.5 ± 8.5 | < 0.001
Discovery (UCSF) | Sex (M / F) | 16 / 15 | 99 / 146 | 0.18
Discovery (UCSF) | Edu (Years, Mean ± SD) | 17.0 ± 3.7 | 17.3 ± 2.1 | 0.45
Discovery (UCSF) | CDR (Mean ± SD) | 0.8 ± 0.3 | 0.0 ± 0.1 | < 0.001
Discovery (UCSF) | MMSE (Mean ± SD) | 22.2 ± 5.3 | 29.4 ± 0.8 | < 0.001
Discovery (UCSF) | APOE ε4 dose (0 / 1 / 2) | 9 / 17 / 5 | 190 / 48 / 4 | < 0.001
Discovery (UCSF) | # Pathologically Confirmed AD | 12 | |
Replication (ADSP) | N | 2927 | 2633 |
Replication (ADSP) | Age at Onset / First Visit | 75.3 ± 8.4 | 85.5 ± 5.1 | < 0.001
Replication (ADSP) | Sex (M / F) | 1299 / 1628 | 1185 / 1448 | 0.64
Replication (ADSP) | Edu (Years, Mean ± SD) | NA | NA | NA
Replication (ADSP) | CDR (Mean ± SD) | NA | NA | NA
Replication (ADSP) | MMSE (Mean ± SD) | NA | NA | NA
Replication (ADSP) | APOE ε4 dose (0 / 1 / 2) | 1660 / 1184 / 83 | 2239 / 386 / 8 | < 0.001
Replication (ADSP) | # Pathologically Confirmed AD | 1057 | |
Replication analysis was performed on samples from the case–control component of the Alzheimer’s Disease Sequencing Project (ADSP), a Presidential Initiative established to identify new genes and alleles contributing to AD risk, AD protection, and targets for new AD therapies, particularly for late-onset AD. The discovery phase of this project generated whole exome sequencing (WES) data for 10,061 unrelated individuals (N = 5,096 cases, N = 4,965 controls) from the Alzheimer’s Disease Genetics Consortium and the Cohorts for Heart and Aging Research in Genomic Epidemiology consortia, of which 5,560 are included in the replication analysis (see Table 1 for cohort demographics). All cases met criteria for probable or definite AD based on clinical assessment, or had neuropathological features of AD upon brain autopsy. Pathological staging was made according to criteria set forth in Braak and Braak (1995) [27]. Cases received a Braak staging score greater than or equal to 3. All controls were clinically assessed for dementia or had an absence of neuropathological AD features upon autopsy (Braak score of 2 or less). Individuals carrying a known disease mutation were excluded from the analyses. All sample phenotype and demographic data were obtained from dbGaP (study accession phs000572.v6.p4; table accession pht004306.v4.p4.c1).
All participants in both analyses were unrelated white individuals (confirmed by identity-by-descent testing in the replication analysis or self-described for those without GWAS data available). Non-Caucasian individuals were excluded due to the insufficient number of participants and potential for confounding background genetics. All aspects of the study were approved by the UCSF Institutional Review Board and written informed consent was obtained from all participants and surrogates (as per UCSF Institutional Review Board protocol).
Sequencing
The UCSF cohort was screened using targeted sequencing of more than 150 RefSeq genes previously implicated in neurodegenerative dementia, including the most common causative genes for Mendelian forms of AD and FTD. Exonic regions for these genes were captured using a custom-designed Nimblegen SeqCap EZ Choice (Roche) library and sequenced on an Illumina HiSeq2500 at the UCLA Neuroscience Genomics Core (Los Angeles, CA). Sequence reads were mapped to the GRCh37/hg19 reference genome and variants were interactively joint-called with GATK according to GATK Best Practices recommendations (https://www.broadinstitute.org/gatk/ [28]).
ADSP samples underwent WES at one of three NHGRI-funded large-scale sequencing centers at Baylor, the Broad Institute, or Washington University. Whole exome capture was performed using either the Illumina Rapid Capture Exome kit or VCRome v2.1 kit (Nimblegen), and paired-end reads were generated using an Illumina HiSeq 2000. Sequence reads were aligned to the GRCh37 reference genome using the Burrows-Wheeler aligner [29], and variants were jointly called across the entire cohort using Atlas V2 software (Baylor) or GATK (Broad). Variants underwent pipeline-specific quality control prior to merging the variants that were concordant between the two sets of variants. The ADSP also performed initial quality control checks on sample information, phenotypes, and genotype data to ensure that these data were of high quality and suitable for downstream analysis.
Quality control and post-processing
After joint-calling, variants were filtered according to previously established criteria [30]. Briefly, we kept joint-called variants with genotype quality (GQ) scores greater than 30 and read depth (DP) scores greater than 20. The resulting file was annotated with gene names using the Variant Effect Predictor in Ensembl. The predicted effect of each variant was determined using PolyPhen and SIFT. Prior to analysis, we used PLINK [31] to remove individuals with genotyping rates below 95 %, single nucleotide polymorphisms (SNPs) with genotyping rates below 95 %, and SNPs with a MAF greater than 1 %. Gene SNP sets were created from exonic SNPs classified as missense and nonsense variants. For our replication analysis, we created SNP sets using the same genes that were available for study in our discovery cohort.
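The thresholds described above can be illustrated with a short Python sketch. It is not the pipeline used in this study, which relied on GATK, the Variant Effect Predictor and PLINK. The record layout and function names are hypothetical, and two choices are assumptions made only for this illustration: the GQ and DP thresholds are applied to each genotype call, and low-call-rate individuals are removed before variants are filtered.

```python
from typing import Dict, Iterable, Optional

# Hypothetical layout: genotypes[variant_id][sample_id] is either None (no call)
# or a dict with "alt" (alternate-allele count: 0, 1 or 2), "gq" and "dp".
Call = Optional[Dict[str, int]]
Genotypes = Dict[str, Dict[str, Call]]

MIN_GQ = 30            # keep calls with GQ > 30
MIN_DP = 20            # keep calls with DP > 20
MIN_CALL_RATE = 0.95   # drop samples or variants genotyped below 95 %
MAX_MAF = 0.01         # keep variants with a MAF of at most 1 %


def mask_low_quality(genotypes: Genotypes) -> Genotypes:
    """Treat calls that fail the GQ/DP thresholds as missing (assumed per call)."""
    def passes(call: Call) -> bool:
        return call is not None and call["gq"] > MIN_GQ and call["dp"] > MIN_DP

    return {
        variant: {sample: (call if passes(call) else None)
                  for sample, call in calls.items()}
        for variant, calls in genotypes.items()
    }


def call_rate(calls: Iterable[Call]) -> float:
    """Fraction of calls that are not missing."""
    calls = list(calls)
    return sum(call is not None for call in calls) / len(calls) if calls else 0.0


def minor_allele_frequency(calls: Iterable[Call]) -> float:
    """Frequency of the rarer allele among the non-missing calls."""
    counts = [call["alt"] for call in calls if call is not None]
    if not counts:
        return 0.0
    alt_freq = sum(counts) / (2 * len(counts))
    return min(alt_freq, 1.0 - alt_freq)


def filter_rare_variants(genotypes: Genotypes) -> Genotypes:
    """Apply the sample and variant filters; samples first (an assumption)."""
    masked = mask_low_quality(genotypes)
    samples = sorted({s for calls in masked.values() for s in calls})
    kept_samples = [
        s for s in samples
        if call_rate(masked[v].get(s) for v in masked) >= MIN_CALL_RATE
    ]
    filtered: Genotypes = {}
    for variant, calls in masked.items():
        kept_calls = {s: calls.get(s) for s in kept_samples}
        if (call_rate(kept_calls.values()) >= MIN_CALL_RATE
                and minor_allele_frequency(kept_calls.values()) <= MAX_MAF):
            filtered[variant] = kept_calls
    return filtered
```

Note that this sketch keeps monomorphic sites (MAF of 0); a real analysis would typically carry forward only sites at which at least one minor allele is observed.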
Genetic analyses
Following previously published criteria [18], we limited our analyses to gene SNP sets with 4 or more SNPs available. For our discovery analysis we conducted a SKAT analysis in the amnestic AD cohort from UCSF. Our replication analysis in the ADSP amnestic AD cohort used the same testing parameters and techniques. We repeated the aforementioned analysis in a subset of the ADSP cohort that had pathologically confirmed AD. Finally, to test whether rare variation in TREM2 is associated with clinically heterogeneous AD, we ran a SKAT analysis in a subset of the cohort which included amnestic AD, early-onset AD, executive (frontal) AD, and the logopenic variant of primary progressive aphasia (lvPPA).
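The SNP-set rule (missense and nonsense variants grouped by gene, with at least 4 SNPs per set) can be sketched as follows. The tuple layout, consequence labels and function name are hypothetical and are used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

MIN_SNPS_PER_SET = 4                          # sets with fewer SNPs are not tested
CODING_CONSEQUENCES = {"missense", "nonsense"}


def build_snp_sets(
    annotations: Iterable[Tuple[str, str, str]],
) -> Dict[str, List[str]]:
    """Group variants by gene and keep genes with enough qualifying SNPs.

    Each annotation is a (variant_id, gene, consequence) tuple.
    """
    sets: Dict[str, List[str]] = defaultdict(list)
    for variant_id, gene, consequence in annotations:
        if consequence in CODING_CONSEQUENCES:
            sets[gene].append(variant_id)
    return {gene: ids for gene, ids in sets.items()
            if len(ids) >= MIN_SNPS_PER_SET}
```

A gene with three missense variants and one synonymous variant, for example, would be excluded under this rule because only three of its variants qualify.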
Antibodies
The HA.11 monoclonal antibody used to detect HA-TREM2 was from Covance, and the clathrin heavy chain monoclonal antibody was from BD Transduction Laboratories.
Molecular biology
The human TREM2 cDNA was obtained from R&D Systems, amplified by PCR and inserted into the pEGFP-N1 vector after first removing the EGFP coding sequence. To facilitate detection of TREM2, an HA epitope tag and linker sequence identical to that used in Kleinberger et al. [14] were inserted after the TREM2 signal peptide using the Phusion high-fidelity DNA polymerase (NEB) system for site-directed mutagenesis. All TREM2 variants were similarly generated using Phusion, with the HA-TREM2 construct serving as template DNA. All constructs were verified by sequencing at the UC Berkeley DNA Sequencing Facility.
Cell culture
HEK-293T cells were maintained at the UC Berkeley Cell Culture Facility under standard conditions. Cells were transiently transfected using Lipofectamine 2000 (ThermoFisher) according to the manufacturer’s instructions. Culture medium was typically changed 4 h after transfection, and experiments were carried out the following day.
Immunoblotting
Cells were harvested on ice by washing with cold PBS followed by lysing in a buffer containing 100 mM NaCl, 10 mM Tris-Cl, pH 7.6, 1 % (v/v) Triton X-100 and Complete protease inhibitor cocktail (Roche). Triton-insoluble material was sedimented by centrifugation at 20,000 g for 10 min at 4 °C. Supernatants were mixed with 5X SDS-PAGE sample buffer supplemented with DTT, then heated at 55 °C for 10 min prior to running in 4–20 % acrylamide gradient gels (Life Technologies and Bio-Rad). After SDS-PAGE, proteins were transferred onto PVDF membranes (EMD Millipore), blocked in 5 % non-fat milk (dissolved in PBS containing 0.1 % Tween-20), and probed with HA and CHC antibodies at 1:2,500 and 1:10,000, respectively. Blots were developed using enhanced chemiluminescence and imaged on a ChemiDoc digital imager (Bio-Rad). Protein signals were quantified using ImageJ (NIH). For overall TREM2 expression analysis, the TREM2 signals derived from cell lysates were first normalized to the corresponding CHC signal, then calculated as a fraction of the WT signal.
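The two-step normalization described above (TREM2 signal divided by the CHC loading control, then expressed as a fraction of WT) amounts to the following arithmetic. This is a minimal sketch; the function name and the dictionary layout are hypothetical, and any values passed to it would come from densitometry.

```python
from typing import Dict


def relative_expression(
    trem2_signal: Dict[str, float],
    chc_signal: Dict[str, float],
    reference: str = "WT",
) -> Dict[str, float]:
    """Normalize each TREM2 band to its CHC band, then to the WT lane."""
    normalized = {}
    for lane, trem2 in trem2_signal.items():
        chc = chc_signal[lane]
        if chc <= 0:
            raise ValueError(f"non-positive loading control signal in lane {lane!r}")
        normalized[lane] = trem2 / chc      # step 1: correct for loading
    wt = normalized[reference]
    if wt <= 0:
        raise ValueError("reference lane has no TREM2 signal")
    return {lane: value / wt for lane, value in normalized.items()}  # step 2
```

By construction the WT lane has a value of 1, so a variant with a value of 0.25 would correspond to a quarter of the WT expression level.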
Cell surface biotinylation
Cell surface biotinylation was carried out in a manner similar to that performed in Kleinberger et al. [14]. Briefly, cells were washed at room temperature (RT) with PBS and labeled with the EZ-Link Sulfo-NHS-SS-Biotin reagent (ThermoFisher) at 1 mg/ml in PBS for 15 min. Cells were then placed on ice, washed with cold Tris-buffered saline to quench the biotin reagent, then washed with cold PBS and finally lysed and clarified as described above. To capture biotinylated proteins, Strep-Tactin resin (iba) was added to the clarified lysates and the mixtures rotated at 4 °C for 1 h. The resin was then pelleted and washed multiple times with lysis buffer. Finally, 2X SDS-PAGE sample buffer supplemented with DTT was added to the washed resin, and the samples were vortexed, heated and prepared for immunoblotting as described above. For the analysis of surface-labeled TREM2, we quantified the entire surface-labeled signal (including mature and immature bands) by densitometry and normalized the signal of individual variants to the WT signal.
Statistical analysis
We used the “SKAT” package [18] in R to conduct all gene-based association tests. The SKAT package allows users to conduct sequence kernel association tests, which are powerful when a portion of the variants in a region are noncausal or variant effects are in different directions. All genetic analyses in the UCSF MAC and ADSP cohorts were completed using R.
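To make the idea of a variance-component test concrete, the following Python sketch computes a SKAT-style statistic, Q = Σ_j (w_j S_j)², where S_j is the score for variant j and w_j is a Beta(1, 25) density evaluated at its MAF. It is an illustration only and not the SKAT package: it assumes an intercept-only null model (no covariates) and estimates the p-value by permutation, whereas the package fits a null model that can include covariates and derives p-values analytically from a mixture of chi-square distributions. All function names are hypothetical.

```python
import random
from typing import List, Sequence


def beta_weight(maf: float) -> float:
    """Beta(1, 25) density at the MAF, which simplifies to 25 * (1 - maf)**24."""
    return 25.0 * (1.0 - maf) ** 24


def skat_q(genotypes: Sequence[Sequence[int]], phenotypes: Sequence[int]) -> float:
    """SKAT-style statistic with a weighted linear kernel.

    genotypes[i][j] is the minor-allele count of subject i at variant j;
    phenotypes[i] is 1 for a case and 0 for a control.
    """
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n                 # intercept-only null model
    residuals = [y - mean_y for y in phenotypes]
    q = 0.0
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        maf = sum(column) / (2 * n)
        score = sum(g * r for g, r in zip(column, residuals))
        # Squaring the score makes the statistic insensitive to effect direction.
        q += (beta_weight(maf) * score) ** 2
    return q


def skat_permutation_p(
    genotypes: Sequence[Sequence[int]],
    phenotypes: Sequence[int],
    n_perm: int = 2000,
    seed: int = 0,
) -> float:
    """Permutation p-value: share of label shuffles with Q >= the observed Q."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotypes)
    shuffled: List[int] = list(phenotypes)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Because each variant's score is squared before summation, risk and protective variants in the same gene do not cancel, and variants with no effect contribute little to Q. This is the property referred to above.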
Protein expression analyses and plots were completed using GraphPad Prism 6 (La Jolla, CA). Protein expression differences were established with ANOVA tests. We used the Holm-Sidak method for our post hoc testing.
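For readers unfamiliar with the Holm-Sidak correction, the step-down adjustment can be sketched as follows. This is a generic implementation of the method applied to a list of comparison p-values, not a reproduction of the GraphPad Prism routine used here; the function name is hypothetical.

```python
from typing import List, Sequence


def holm_sidak_adjust(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Sidak adjusted p-values in the original order.

    The k-th smallest p-value (k = 1..m) is adjusted to 1 - (1 - p)**(m - k + 1),
    and adjusted values are forced to be non-decreasing along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):          # rank 0 is the smallest p-value
        candidate = 1.0 - (1.0 - p_values[idx]) ** (m - rank)
        running_max = max(running_max, candidate)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted
```

A comparison is then declared significant when its adjusted p-value falls below the chosen significance level.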
Discussion
We confirmed association of aggregate rare variation in TREM2 with AD in two independent cohorts, including in a subset of individuals with pathologically confirmed AD. Two of the variants identified in AD, S31F and R47C, have not, to our knowledge, been described before. In addition, the R136Q variant identified in an atypical form of AD has not been previously characterized at the protein level. Using heterologous expression, we found that these three variants show a significant reduction in cell surface expression relative to WT TREM2.
Rare homozygous or compound heterozygous mutations in TREM2 cause NHD or an early-onset FTD syndrome without bone involvement [8, 34]. These include missense mutations such as Y38C, T66M and D86V that occur within the Ig-like domain of TREM2 [32, 35, 36]. Variants Y38C and T66M have been shown to have impaired cell surface expression [14, 37], and we now demonstrate that the novel variants S31F and R47C, which also localize to the Ig-like domain, show reduced surface expression. Thus, it is possible that modestly reduced TREM2 cell surface expression in heterozygotes increases risk for late-onset neurodegeneration, while severely reduced surface expression in homozygotes leads to early-onset FTD or NHD. By extension, we hypothesize that homozygous carriers of S31F, R47C or R136Q, if identified, might be at greater risk for AD neurodegeneration, relative to heterozygotes. In contrast to the above variants, variant A28V, which was also identified in an AD case, showed significantly increased surface expression. It is thus currently unclear if this variant is impaired in another way (e.g., defective ligand binding) or if it contributes risk for disease.
In all of our surface expression analyses, we observed that the immature form of TREM2 was capable of reaching the cell surface. Although this was reported previously for disease-causing variants Y38C and T66M, it was not observed for the WT protein [14]. We speculate that this discrepancy may be due to the different expression systems used (transient expression in this paper vs. stable expression in [14]). Importantly, however, we confirmed the strong reduction in surface expression for variant Y38C that was reported previously in Kleinberger et al. [14] and Park et al. [37], indicating the suitability of our expression method for cell surface labeling.
TREM2 variant R136Q was identified in one patient with a language-predominant form of AD, the logopenic variant of primary progressive aphasia (lvPPA [24]). We also identified R136Q and R136W in one control each, underscoring the point that these variants do not appear to be causative for disease. However, others have previously reported both of these variants in amnestic AD [2, 5], and they have MAFs of 0.001278 and 0.0001381, respectively (2 observations in 1,564 alleles and 1 observation in 7,240 alleles, respectively) in the Exome Aggregation Consortium (ExAC) database [38]. The reported MAF of R47H is about 20-fold greater than that reported for R136Q and is consistent with our observation of only one case harboring this variant.
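The MAFs quoted above follow directly from the reported allele counts, as this small worked example shows. The function name is hypothetical; the counts are those given in the text.

```python
def allele_frequency(minor_allele_count: int, total_alleles: int) -> float:
    """Minor allele frequency as observed minor alleles over total alleles."""
    if total_alleles <= 0:
        raise ValueError("total_alleles must be positive")
    return minor_allele_count / total_alleles


# Counts reported in the text for the ExAC database.
r136q_maf = allele_frequency(2, 1564)   # about 0.00128
r136w_maf = allele_frequency(1, 7240)   # about 0.000138
```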
We utilized an aggregate variant burden test implemented in the SKAT program to assess the effects of variation across multiple genes, including TREM2, on risk for AD. Advantages of this package are that it makes no assumption about the direction or magnitude of an effect and that it can accommodate both a large fraction of noncausal variants and causal variants whose effects are in different directions. A limitation is that requiring a minimum number of SNPs in each set excludes some candidate genes. Analysis of larger cohorts with deep resequencing data will be required to expand coverage of rare variation across more genes.
Our finding that some variants do not alter cell-surface expression does not preclude these variants from altering AD risk via other mechanisms. For instance, the V27M and E151K variants did not show significantly reduced surface expression, but may be defective for ligand binding, as has been shown recently for R47H and other variants [16, 39, 40]. Variant A28V, identified in an AD case and showing increased surface expression, may increase risk for disease by adversely affecting ligand binding, or, alternatively, may not affect risk for disease. Future functional studies such as lipoprotein binding and uptake assays will be required to further characterize the effects of the identified variants. We also identified several variants in controls that will require further genetic and functional characterization to determine whether they are likely to alter disease risk. For example, the D87N variant, identified in both cases and controls in our cohorts, has recently been shown to display a defect in ligand binding [16] and may thus represent an AD risk variant.
Our study benefits from the analysis of multiple cohorts representing both amnestic and atypical forms of AD, pathological confirmation in a subset of individuals from the replication cohort, and the ability to assess biochemically the effect of select variants on protein expression and cell surface expression. Caveats of the study include a limited number of patients in the discovery cohort—particularly of atypical AD syndromes—and, as mentioned above, the limited scope of genes analyzed.
Acknowledgements
We thank contributors who collected samples used in this study, as well as patients and their families, whose help and participation made this work possible.