Introduction
Amyotrophic lateral sclerosis (ALS) is a rare, progressive neurodegenerative disease resulting in the loss of motor neurons in the brain and spinal cord. Death usually occurs within 3–5 years of symptom onset, and there are currently no treatments that cure the disease or halt its progression. The identification of pathogenic variants in genes such as
SOD1,
FUS,
TARDBP and
C9ORF72, and of a range of additional genetic variants associated with disease risk, has informed our understanding of the mechanisms of ALS development [
1]. Several processes, such as oxidative stress, mitochondrial dysfunction, protein aggregation, inflammation and RNA processing and toxicity, are thought to be involved in disease pathogenesis [
2]. However, the exact cause of the neurodegeneration occurring in ALS is still to be determined. One area of study that aims to further understand the pathogenesis of the disease is the dysregulation and expression of retrotransposons, a type of mobile DNA contributing to nearly half of the human genome [
3].
Retrotransposons are divided into two classes, those with long terminal repeats (LTRs) and those without (non-LTRs), both of which mobilise through a ‘copy and paste’ mechanism involving the reverse transcription of an RNA intermediate. Initial evidence for the potential involvement of retrotransposons in ALS was the detection of elevated reverse transcriptase activity in the sera of ALS patients compared to unaffected controls that was not attributed to an exogenous retroviral infection [
4,
5]. Endogenous retroviruses are part of the LTR class of elements. Elevated levels of the human endogenous retrovirus-K (HERV-K) were detected in the CNS of individuals with ALS compared to controls, and overexpression of a HERV-K protein in a mouse model led to motor neuron degeneration and motor dysfunction [
6,
7]. However, this increased expression of HERV-K has not been observed in all studies [
8,
9].
Using RNA sequencing data, the expression of multiple classes and families of retrotransposons can be analysed, and this approach has identified increased expression of both LTR and non-LTR elements in subsets of individuals with ALS [
10‐
12]. Prudencio et al. analysed the expression of repetitive elements identifying an increase in several of these elements, including LTR and non-LTR retrotransposons, in the frontal cortex of individuals with ALS who were carriers of the
C9ORF72 expansion when compared to those without the expansion and healthy controls [
10]. This was further confirmed by Pereira et al. [
12]. Tam et al. used machine learning to identify three subtypes of ALS based on transcriptomic data from the frontal and motor cortices, one of which was characterised by the activation of retrotransposons and TAR DNA-binding protein 43 (TDP-43) dysfunction [
11]. Cytoplasmic accumulation of TDP-43, encoded by the gene
TARDBP in which mutations can cause ALS, is a hallmark of the majority of ALS cases [
13]. There is evidence in human tissues and cell lines that TDP-43 binds to retrotransposon transcripts, which is thought to aid in the repression of these elements [
14]. Analysis of the relationship between TDP-43 and transposable elements in the SH-SY5Y cell line showed that TDP-43 binds to the RNA of multiple families of retrotransposons and that knock-down of TDP-43 led to the upregulation of these retrotransposon targets [
11]. In addition, the loss of nuclear TDP-43 from neurons results in chromatin decondensation over long interspersed element-1 (LINE/L1) elements that belong to the non-LTR class of retrotransposons [
15].
L1s are the only elements that are still able to mobilise autonomously in the human genome and, although more than 1 million L1s are annotated in hg38, only 80–100 are considered retrotransposition competent (RC) [
16,
17]. RC-L1s encode a functional ORF1 protein that binds RNA and an ORF2 protein with reverse transcriptase and endonuclease activity, which are required for retrotransposition [
18‐
20]. The ability of RC-L1s to retrotranspose can be demonstrated in either a cellular retrotransposition assay or by identifying the source element for germline or somatic L1 insertions [
16,
21,
22]. These features of RC-L1s, along with the increased L1 expression detected in certain neurological conditions, have led to the hypothesis that they are involved in disease processes through multiple mechanisms such as somatic retrotransposition, the triggering of neuro-inflammation and DNA damage from the endonuclease activity of the ORF2 protein [
23,
24].
Our previous work focussing on highly active RC-L1s in neurodegenerative diseases has identified a reduction in the methylation of selected RC-L1s in the motor cortex of individuals with ALS compared to healthy controls [
25] and found that an increased burden of these elements in the germline is associated with Parkinson’s disease risk and progression [
26]. Although several studies have evaluated the expression profile of specific families of retrotransposons in ALS, there are limited data on locus-specific expression, which could provide additional insight into the potential consequences of their expression. Here we utilise a software tool called L1EM [
27] to evaluate locus-specific L1 Homo sapiens (L1HS) expression and identify where in the genome these elements are expressed from. L1HS is the youngest L1 subfamily and contains those elements able to mobilise; we can therefore determine whether the expressed L1s encode functional proteins and have the potential to retrotranspose. Using RNA sequencing data from the Target ALS cohort provided by the New York Genome Center (
https://www.nygenome.org/) we were able to characterise the L1 expression profile in multiple tissues from individuals with or without disease and identify significant differences in L1 expression.
Discussion
The expression of transposable elements, including several families of retrotransposons, has been characterised in ALS, identifying an activation and upregulation of these elements in the CNS of individuals with the disease [
6,
7,
10,
11]. Only a limited number of studies have examined the specific retrotransposon loci that are expressed. The predominant focus in ALS has been on characterising locus-specific HERV expression, with studies differing on whether HERV loci are significantly upregulated [
6,
7,
9]. One recent study profiling the locus-specific expression of HERVs identified one HERV locus (HML6_3p21.31c) that was consistently upregulated in the motor cortex and cerebellum of individuals with ALS compared to controls [
31]. Therefore, we sought to address the question of locus-specific L1 expression in ALS utilising RNA-sequencing data from multiple tissues obtained from the New York Genome Center Target ALS cohort. The L1EM tool was used to profile L1HS expression in a total of 518 samples from the following tissues: motor cortex (107), frontal cortex (136), cerebellum (147) and cervical spinal cord (128). L1s are the only autonomous retrotransposons that have retained the ability to retrotranspose in the human genome and generate new insertions. Further, only a small number of these L1s (RC-L1s) can mobilise; identifying the specific L1 loci that are expressed will therefore indicate whether those L1s encode functional ORF1 and ORF2 proteins and could potentially retrotranspose. Here we identified a general reduction in the total expression of intact L1s (those encoding functional proteins) in two brain regions (motor cortex and cerebellum) of individuals with ALS compared to controls. This could suggest that L1 expression from those intact elements is potentially beneficial, as it is higher in individuals not suffering from a neurological condition. For example, L1s have been implicated in memory and learning, as long-term memory formation was impaired in mouse models using L1 inhibitors [
32]. However, there is the caveat that the window of L1 expression captured in post-mortem tissues may not reflect L1 expression throughout disease development. This observed reduction in L1 expression contrasts with studies addressing retrotransposons at the subfamily level, for example L1HS, that identified an upregulation of these elements in ALS [
11]. We analysed a very specific subset of L1s, which could account for the different results observed in our study compared to previous publications. However, a handful of individuals with ALS or ALSND in our study showed a large upregulation of L1s compared to the rest of the cohort, which may be in line with Tam et al. [
11] who demonstrated that it is a subset of ALS that is characterised by the activation of retrotransposons rather than being a feature in all cases of the disease.
When comparing the pattern of expression from specific intact L1 loci we noted that the three brain regions of the controls clustered together as did those of the individuals with ALS or ALSND (Fig.
3a). However, this was not the case when comparing the non-intact L1s. This suggests that the pattern of intact L1 loci expression could be disease related, at least in tissues related to the brain. There are only a limited number of studies characterising human locus-specific L1 expression for comparison with our study; furthermore, the majority of the literature describes their expression in cell lines [
33‐
36]. In the latter, L1 expression is often restricted to a small number of elements and can be cell line dependent [
33,
36]. McKerrow et al., who developed the L1EM tool used in this analysis, analysed L1 expression in over 120 datasets that included cell lines and tissues from the ENCODE database [
27]. The authors showed L1 expression in multiple cancer cell lines, embryonic stem cells and several lines derived from the embryonic cell lineage, with limited expression detected in the tissues analysed from the ENCODE database. A sizable proportion (17%) of the intact L1 expression across all the samples in this previous publication was from a single L1 locus at chr22:28,663,283–28,669,315, which is responsible for the most somatic L1 insertions in tumours [
22,
37]. Interestingly, we did not detect any expression from this L1 locus in the samples analysed here, providing evidence for tissue-specific expression of these elements. This could be related to regulation of L1s by CpG methylation; for example, a recent study showed that this L1 located on chr22 is hypomethylated in the liver compared to the heart and hippocampus [
38]. Our previous work analysing the methylation status of 6 highly active RC-L1s in the motor cortex identified a reduction in the DNA methylation over these elements in ALS brains compared to controls [
25]. Five of these elements located in the reference genome were also analysed in this study to quantify their expression levels; four of these L1s were expressed, although they were not part of the group of L1s responsible for the majority of the expression observed.
One of the significant characteristics of the 92 expressed L1s compared to the 308 L1s not expressed was the percentage that were in introns (69.6% vs 46.4%). We demonstrated that the level of expression of genes with an intronic expressed L1 was on average significantly higher than that of genes with an intronic L1 that was not expressed in all four tissues analysed (Fig.
6) and the level of the expressed L1 significantly positively correlated with the expression of the gene in the three brain regions analysed (Additional file
7: Figure S3). This is in agreement with the analysis by Philippe et al., who also showed that genes with an expressed L1 had higher expression than genes with an L1 that was not expressed [
36]. Although this suggests that an L1 located in the intron of a gene, particularly if that gene is highly expressed, is more likely to be expressed in that tissue, it is not a prerequisite for L1 expression, as there were 28 intergenic L1s that were expressed in our study. In addition to comparing L1 expression with the expression of the gene in which they were located, we also compared L1 expression with that of
TARDBP. Previous studies have shown that TDP-43 (encoded by the
TARDBP gene) regulates the expression of retrotransposons [
11,
14,
15] and that the subset of ALS characterised by retrotransposon activation also has lower levels of
TARDBP [
11]. We identified that in the motor and frontal cortices lower levels of
TARDBP are significantly correlated with increased L1 expression (Fig.
7). We were limited to analysing
TARDBP expression to look for potential relationships with L1 expression as opposed to characterising TDP-43 dysfunction in terms of protein aggregation or mislocalisation due to the type of data available.
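As a rough check on the intronic proportions reported above for expressed and non-expressed L1s, the following Python sketch rebuilds the corresponding 2×2 table. The counts are back-calculated from the reported percentages (69.6% of 92 and 46.4% of 308), and the use of a two-sided Fisher's exact test is an assumption made for illustration, since the test applied to this comparison is not stated in this section. All function and variable names are hypothetical.

```python
from math import comb


def count_from_percent(total, percent):
    """Return the integer count closest to `percent` of `total`."""
    return round(total * percent / 100)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over every table that is no more likely than the observed one.
    return sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-7)
    )


# Totals and percentages as reported in the text.
expressed_total, silent_total = 92, 308
expressed_intronic = count_from_percent(expressed_total, 69.6)
silent_intronic = count_from_percent(silent_total, 46.4)

# Remaining L1s in each group are those not located in introns.
expressed_other = expressed_total - expressed_intronic
silent_other = silent_total - silent_intronic

odds_ratio = (expressed_intronic * silent_other) / (expressed_other * silent_intronic)
p_value = fisher_exact_two_sided(
    expressed_intronic, expressed_other, silent_intronic, silent_other
)

print(f"expressed: {expressed_intronic} intronic, {expressed_other} other")
print(f"not expressed: {silent_intronic} intronic, {silent_other} other")
print(f"odds ratio = {odds_ratio:.2f}, Fisher exact p = {p_value:.2e}")
```

The back-calculated number of expressed L1s outside introns agrees with the 28 intergenic expressed L1s mentioned in the text, which suggests the reconstructed table is consistent with the reported figures.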
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (
http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.