  • Review Article
  • Published:

Evolution to the rescue: using comparative genomics to understand long non-coding RNAs

This article has been updated

Key Points

  • Long non-coding RNAs (lncRNAs) are emerging as important regulators in multiple key pathways. Thousands of lncRNA genes have now been identified in dozens of species, including animals, plants and single-celled organisms.

  • Unlike protein-coding genes and non-coding RNAs such as microRNAs, lncRNAs are rapidly lost and gained during evolution, raising questions about how many of them are functional.

  • Within conserved lncRNAs, exon–intron architectures and sequences are also rapidly turned over, with only short regions evolving under purifying selection.

  • Across long evolutionary distances, there are numerous lncRNAs that are found in syntenic regions, but exhibit no detectable sequence similarity. These can correspond to loci where only the act of transcription is important, or to lncRNAs that depend on very short sequence elements for their functions.

  • Evolutionary trajectories can be used to classify lncRNAs into groups with different characteristics and probably different modes of action.

  • In most cases tested so far, lncRNA function was maintained across large evolutionary distances even when the lncRNA sequence substantially diverged.
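
The distinction drawn above between lncRNAs with detectable sequence similarity and those conserved only in position can be made concrete with a small sketch. It is an illustration under stated assumptions, not the classification scheme used in the article: the class names, the `LncRNAComparison` record and its two boolean fields are hypothetical, and in practice they would be filled in by an upstream synteny search and a sequence-alignment step.

```python
from dataclasses import dataclass
from enum import Enum


class ConservationClass(Enum):
    # Hypothetical labels chosen for this sketch only.
    SEQUENCE_CONSERVED = "detectable sequence similarity"
    POSITION_ONLY = "syntenic lncRNA without detectable sequence similarity"
    NO_DETECTED_HOMOLOGUE = "no counterpart detected in the other species"


@dataclass
class LncRNAComparison:
    """Result of comparing one lncRNA against a second species (hypothetical record)."""
    name: str
    has_syntenic_lncrna: bool      # a lncRNA lies between orthologous flanking genes
    has_sequence_similarity: bool  # a significant alignment to a lncRNA was found


def classify(c: LncRNAComparison) -> ConservationClass:
    # Sequence similarity is the strongest evidence, so it is checked first.
    if c.has_sequence_similarity:
        return ConservationClass.SEQUENCE_CONSERVED
    # Same genomic neighbourhood, but no alignable sequence.
    if c.has_syntenic_lncrna:
        return ConservationClass.POSITION_ONLY
    return ConservationClass.NO_DETECTED_HOMOLOGUE


# Made-up example records; the names do not refer to real genes.
examples = [
    LncRNAComparison("lnc-A", has_syntenic_lncrna=True, has_sequence_similarity=True),
    LncRNAComparison("lnc-B", has_syntenic_lncrna=True, has_sequence_similarity=False),
    LncRNAComparison("lnc-C", has_syntenic_lncrna=False, has_sequence_similarity=False),
]
labels = {e.name: classify(e) for e in examples}
```

A lncRNA that lands in the position-only class is a candidate for either of the two explanations given above: a locus where only the act of transcription matters, or a transcript that depends on sequence elements too short to be picked up by the alignment step.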

Abstract

Long non-coding RNAs (lncRNAs) have emerged in recent years as major players in a multitude of pathways across species, but it remains challenging to understand which of them are important and how their functions are performed. Comparative sequence analysis has been instrumental for studying proteins and small RNAs, but the rapid evolution of lncRNAs poses new challenges that demand new approaches. Here, I review the lessons learned so far from genome-wide mapping and comparisons of lncRNAs across different species. I also discuss how comparative analyses can help us to understand lncRNA function and provide practical considerations for examining functional conservation of lncRNA genes.

Figure 1: A generic pipeline for the identification of lncRNAs from RNA-seq data.
Figure 2: Classes of lncRNA conservation.
Figure 3: Pathways for origination and diversification of lncRNA loci.
Figure 4: Manifestations of conserved functionality in lncRNA genes.

Change history

  • 06 September 2016

    In the original version of this article, the sentence “A study using a different background model recently reported more than 4 million regions that are evolving under selection to preserve secondary structure” (section ‘Secondary structure and its conservation’) was missing a citation of reference 65 (Smith, M. A., Gesell, T., Stadler, P. F. & Mattick, J. S. Widespread purifying selection on RNA structure in mammals. Nucleic Acids Res. 41, 8220–8236 (2013)). This citation dropped out during journal typesetting of the article and has now been reinstated. The editors apologize for this error.

References

  1. Iyer, M. K. et al. The landscape of long noncoding RNAs in the human transcriptome. Nat. Genet. 47, 199–208 (2015).

  2. Hezroni, H. et al. Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species. Cell Rep. 11, 1110–1122 (2015). This study compares features and loci of lncRNAs across various vertebrates and shows rapid lncRNA turnover combined with conservation of expression patterns, and positional conservation without sequence conservation across large evolutionary distances.

  3. Cabili, M. N. et al. Localization and abundance analysis of human lncRNAs at single-cell and single-molecule resolution. Genome Biol. 16, 20 (2015).

  4. Cabili, M. N. et al. Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927 (2011). This study provides the first comprehensive RNA-seq-based catalogue of human lncRNAs and characterizes their features.

  5. Gong, J., Liu, W., Zhang, J., Miao, X. & Guo, A. Y. lncRNASNP: a database of SNPs in lncRNAs and their potential functions in human and mouse. Nucleic Acids Res. 43, D181–D186 (2015).

  6. Wapinski, O. & Chang, H. Y. Long noncoding RNAs and human disease. Trends Cell Biol. 21, 354–361 (2011).

  7. Pasquinelli, A. E. et al. Conservation of the sequence and temporal expression of let-7 heterochronic regulatory RNA. Nature 408, 86–89 (2000).

  8. Auyeung, V. C., Ulitsky, I., McGeary, S. E. & Bartel, D. P. Beyond secondary structure: primary-sequence determinants license pri-miRNA hairpins for processing. Cell 152, 844–858 (2013).

  9. Bartel, D. P. MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233 (2009).

  10. Berezikov, E. Evolution of microRNA diversity and regulation in animals. Nat. Rev. Genet. 12, 846–860 (2011).

  11. Yang, Z. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15, 568–573 (1998).

  12. Necsulea, A. et al. The evolution of lncRNA repertoires and expression patterns in tetrapods. Nature 505, 635–640 (2014).

  13. Washietl, S., Kellis, M. & Garber, M. Evolutionary dynamics and tissue specificity of human long noncoding RNAs in six mammals. Genome Res. 24, 616–628 (2014). References 12 and 13 are studies that comprehensively compare lncRNA sequence and expression evolution in various tetrapods.

  14. Bu, D. et al. Evolutionary annotation of conserved long non-coding RNAs in major mammalian species. Sci. China Life Sci. 58, 787–798 (2015).

  15. Ulitsky, I. & Bartel, D. P. lincRNAs: genomics, evolution, and mechanisms. Cell 154, 26–46 (2013).

  16. Jenkins, A. M., Waterhouse, R. M. & Muskavitch, M. A. Long non-coding RNA discovery across the genus Anopheles reveals conserved secondary structures within and beyond the Gambiae complex. BMC Genomics 16, 337 (2015).

  17. Liu, J. et al. Genome-wide analysis uncovers regulation of long intergenic noncoding RNAs in Arabidopsis. Plant Cell 24, 4333–4345 (2012).

  18. Brown, J. B. et al. Diversity and dynamics of the Drosophila transcriptome. Nature 512, 393–399 (2014).

  19. Ravasi, T. et al. Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome Res. 16, 11–19 (2006).

  20. Adiconis, X. et al. Comparative analysis of RNA sequencing methods for degraded or low-input samples. Nat. Methods 10, 623–629 (2013).

  21. Zhao, W. et al. Comparison of RNA-seq by poly (A) capture, ribosomal RNA depletion, and DNA microarray for expression profiling. BMC Genomics 15, 419 (2014).

  22. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-seq. Bioinformatics 25, 1105–1111 (2009).

  23. Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

  24. Kim, D., Langmead, B. & Salzberg, S. L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods 12, 357–360 (2015).

  25. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

  26. Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177–1184 (2013).

  27. Engstrom, P. G. et al. Systematic evaluation of spliced alignment programs for RNA-seq data. Nat. Methods 10, 1185–1191 (2013).

  28. Housman, G. & Ulitsky, I. Methods for distinguishing between protein-coding and long noncoding RNAs and the elusive biological purpose of translation of long noncoding RNAs. Biochim. Biophys. Acta 1859, 31–40 (2015).

  29. Kanitz, A. et al. Comparative assessment of methods for the computational inference of transcript isoform abundance from RNA-seq data. Genome Biol. 16, 150 (2015).

  30. Chen, J. et al. Evolutionary analysis across mammals reveals distinct classes of long non-coding RNAs. Genome Biol. 17, 19 (2016). This study demonstrates a new methodology for detailed comparison of lncRNAs expressed in pluripotent stem cells in several species and suggests a classification of lncRNAs into groups based on their evolutionary histories.

  31. Jayakodi, M. et al. Genome-wide characterization of long intergenic non-coding RNAs (lincRNAs) provides new insight into viral diseases in honey bees Apis cerana and Apis mellifera. BMC Genomics 16, 680 (2015).

  32. Mohammadin, S., Edger, P. P., Pires, J. C. & Schranz, M. E. Positionally-conserved but sequence-diverged: identification of long non-coding RNAs in the Brassicaceae and Cleomaceae. BMC Plant Biol. 15, 217 (2015).

  33. Wang, H. et al. Analysis of non-coding transcriptome in rice and maize uncovers roles of conserved lncRNAs associated with agriculture traits. Plant J. 84, 404–416 (2015).

  34. Paytuvi Gallart, A., Hermoso Pulido, A., Anzar Martinez de Lagran, I., Sanseverino, W. & Aiese Cigliano, R. GREENC: a Wiki-based database of plant lncRNAs. Nucleic Acids Res. 44, D1161–D1166 (2016).

  35. Bråte, J., Adamski, M., Neumann, R. S., Shalchian-Tabrizi, K. & Adamska, M. Regulatory RNA at the root of animals: dynamic expression of developmental lincRNAs in the calcisponge Sycon ciliatum. Proc. Biol. Sci. 282, 20151746 (2015).

  36. Gaiti, F. et al. Dynamic and widespread lncRNA expression in a sponge and the origin of animal complexity. Mol. Biol. Evol. 32, 2367–2382 (2015).

  37. Guttman, M. et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223–227 (2009). This is the first study to use chromatin marks to improve the identification of lncRNAs in mouse and provides a detailed description of a set of lncRNAs that were better conserved than background.

  38. Marques, A. C. & Ponting, C. P. Catalogues of mammalian long noncoding RNAs: modest conservation and incompleteness. Genome Biol. 10, R124 (2009).

  39. Gardner, P. P. et al. Conservation and losses of non-coding RNAs in avian genomes. PLoS ONE 10, e0121797 (2015).

  40. Haerty, W. & Ponting, C. P. Mutations within lncRNAs are effectively selected against in fruitfly but not in human. Genome Biol. 14, R49 (2013).

  41. Zhang, Y. C. et al. Genome-wide screening and functional analysis identify a large number of long noncoding RNAs involved in the sexual reproduction of rice. Genome Biol. 15, 512 (2014).

  42. Ponjavic, J., Ponting, C. P. & Lunter, G. Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res. 17, 556–565 (2007).

  43. Wang, J. et al. Mouse transcriptome: neutral evolution of 'non-coding' complementary DNAs. Nature http://dx.doi.org/10.1038/nature03016 (2004).

  44. Managadze, D., Rogozin, I. B., Chernikova, D., Shabalina, S. A. & Koonin, E. V. Negative correlation between expression level and evolutionary rate of long intergenic noncoding RNAs. Genome Biol. Evol. 3, 1390–1404 (2011).

  45. Kutter, C. et al. Rapid turnover of long noncoding RNAs and the evolution of gene expression. PLoS Genet. 8, e1002841 (2012). This study compares in detail lncRNAs that are expressed in the liver in three rodents and reports rapid evolutionary turnover of lncRNAs, even when the same tissue is compared across closely related species.

  46. Morán, I. et al. Human β cell transcriptome analysis uncovers lncRNAs that are tissue-specific, dynamically regulated, and abnormally expressed in type 2 diabetes. Cell Metab. 16, 435–448 (2012).

  47. Mustafi, D. et al. Evolutionarily conserved long intergenic non-coding RNAs in the eye. Hum. Mol. Genet. 22, 2992–3002 (2013).

  48. Tan, J. Y. et al. Extensive microRNA-mediated crosstalk between lncRNAs and mRNAs in mouse embryonic stem cells. Genome Res. 25, 655–666 (2015).

  49. Thomson, D. W. & Dinger, M. E. Endogenous microRNA sponges: evidence and controversy. Nat. Rev. Genet. 17, 272–283 (2016).

  50. Yang, J. R. & Zhang, J. Human long noncoding RNAs are substantially less folded than messenger RNAs. Mol. Biol. Evol. 32, 970–977 (2015).

  51. Spitale, R. C. et al. Structural imprints in vivo decode RNA regulatory mechanisms. Nature 519, 486–490 (2015).

  52. Wilusz, J. E. et al. A triple helix stabilizes the 3′ ends of long noncoding RNAs that lack poly(A) tails. Genes Dev. 26, 2392–2407 (2012).

  53. Ilik, I. A. et al. Tandem stem-loops in roX RNAs act together to mediate X chromosome dosage compensation in Drosophila. Mol. Cell 51, 156–173 (2013).

  54. Park, S. W., Kuroda, M. I. & Park, Y. Regulation of histone H4 Lys16 acetylation by predicted alternative secondary structures in roX noncoding RNAs. Mol. Cell. Biol. 28, 4952–4962 (2008).

  55. Zhao, J., Sun, B. K., Erwin, J. A., Song, J. J. & Lee, J. T. Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science 322, 750–756 (2008).

  56. Maenner, S. et al. 2D structure of the A region of Xist RNA and its implication for PRC2 association. PLoS Biol. 8, e1000276 (2010).

  57. Lu, Z. et al. RNA duplex map in living cells reveals higher-order transcriptome structure. Cell 165, 1267–1279 (2016).

  58. Torarinsson, E., Sawera, M., Havgaard, J. H., Fredholm, M. & Gorodkin, J. Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure. Genome Res. 16, 885–889 (2006).

  59. Miller, W. et al. 28-way vertebrate alignment and conservation track in the UCSC Genome Browser. Genome Res. 17, 1797–1808 (2007).

  60. Gorodkin, J. et al. De novo prediction of structured RNAs from genomic sequences. Trends Biotechnol. 28, 9–19 (2010).

  61. Stadler, P. F. in Advances in Bioinformatics and Computational Biology (eds Ferreira, C. E. et al.) 1–12 (Springer, 2010).

  62. Lee, S. et al. Noncoding RNA NORAD regulates genomic stability by sequestering PUMILIO proteins. Cell 164, 69–80 (2016).

  63. Tichon, A. et al. A conserved abundant cytoplasmic long noncoding RNA modulates repression by Pumilio proteins in human cells. Nat. Commun. 7, 12209 (2016).

  64. Nam, J. W. & Bartel, D. P. Long noncoding RNAs in C. elegans. Genome Res. 22, 2529–2540 (2012).

  65. Smith, M. A., Gesell, T., Stadler, P. F. & Mattick, J. S. Widespread purifying selection on RNA structure in mammals. Nucleic Acids Res. 41, 8220–8236 (2013).

  66. Somarowthu, S. et al. HOTAIR forms an intricate and modular secondary structure. Mol. Cell 58, 353–361 (2015).

  67. Rivas, E., Clements, J. & Eddy, S. R. Lack of evidence for conserved secondary structure in long noncoding RNAs. Preprint at http://eddylab.org/publications/RivasEddy16/RivasEddy16-preprint.pdf (2016).

  68. Quinn, J. J. et al. Rapid evolutionary turnover underlies conserved lncRNA-genome interactions. Genes Dev. 30, 191–207 (2016). This study uses a novel computational approach for the sensitive detection of lncRNA homologues in insects and vertebrates based on a combination of synteny, sequence and structural information, and includes the first comparison of genomic binding sites of lncRNAs across species.

  69. Tycowski, K. T., Shu, M. D., Borah, S., Shi, M. & Steitz, J. A. Conservation of a triple-helix-forming RNA stability element in noncoding and genomic RNAs of diverse viruses. Cell Rep. 2, 26–32 (2012). This study describes a sensitive approach for using a specific sequence-structure pattern to identify lncRNA homologues among extensively divergent viral genomes.

  70. Kornienko, A. E., Guenzl, P. M., Barlow, D. P. & Pauler, F. M. Gene regulation by the act of long non-coding RNA transcription. BMC Biol. 11, 59 (2013).

  71. Latos, P. A. et al. Airn transcriptional overlap, but not its lncRNA products, induces imprinted Igf2r silencing. Science 338, 1469–1472 (2012). This is the most comprehensive study to date of a lncRNA for which only the act of transcription, and not any particular part of the sequence, is important for function.

  72. Haerty, W. & Ponting, C. P. Unexpected selection to retain high GC content and splicing enhancers within exons of multiexonic lncRNA loci. RNA 21, 333–346 (2015).

  73. Ulitsky, I., Shkumatava, A., Jan, C. H., Sive, H. & Bartel, D. P. Conserved function of lincRNAs in vertebrate embryonic development despite rapid sequence evolution. Cell 147, 1537–1550 (2011).

  74. He, Y. et al. The conservation and signatures of lincRNAs in Marek's disease of chicken. Sci. Rep. 5, 15184 (2015).

  75. Jiang, W., Liu, Y., Liu, R., Zhang, K. & Zhang, Y. The lncRNA DEANR1 facilitates human endoderm differentiation by activating FOXA2 expression. Cell Rep. 11, 137–148 (2015).

  76. Sone, M. et al. The mRNA-like noncoding RNA Gomafu constitutes a novel nuclear domain in a subset of neurons. J. Cell Sci. 120, 2498–2506 (2007).

  77. Paralkar, V. R. et al. Unlinking an lncRNA from its associated cis element. Mol. Cell 62, 104–110 (2016).

  78. Yin, Y. et al. Opposing roles for the lncRNA Haunt and its genomic locus in regulating HOXA gene activation during embryonic stem cell differentiation. Cell Stem Cell 16, 504–516 (2015).

  79. Marques, A. C. et al. Chromatin signatures at transcriptional start sites separate two equally populated yet distinct classes of intergenic long noncoding RNAs. Genome Biol. 14, R131 (2013). This paper describes a classification of currently annotated lncRNAs into two groups (promoter-associated and enhancer-associated) with different features based on the chromatin signatures at their transcription start sites.

  80. Legeai, F. & Derrien, T. Identification of long non-coding RNAs in insects genomes. Curr. Opin. Insect Sci. 7, 37–44 (2015).

  81. Li, L. et al. Genome-wide discovery and characterization of maize long non-coding RNAs. Genome Biol. 15, R40 (2014).

  82. Wang, M. et al. Long noncoding RNAs and their proposed functions in fibre development of cotton (Gossypium spp.). New Phytol. 207, 1181–1197 (2015).

  83. Long, M., VanKuren, N. W., Chen, S. & Vibranovski, M. D. New gene evolution: little did we know. Annu. Rev. Genet. 47, 307–333 (2013).

  84. Kaessmann, H. Origins, evolution, and phenotypic impact of new genes. Genome Res. 20, 1313–1326 (2010).

  85. Derrien, T. et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 1775–1789 (2012). This article provides a comprehensive description of lncRNA features and subcellular localization based on the Encyclopedia of DNA Elements (ENCODE) project data.

  86. Duret, L., Chureau, C., Samain, S., Weissenbach, J. & Avner, P. The Xist RNA gene evolved in eutherians by pseudogenization of a protein-coding gene. Science 312, 1653–1655 (2006). This paper provides the first example of a lncRNA that evolved through the loss of coding potential of an ancestral protein-coding gene.

  87. Romito, A. & Rougeulle, C. Origin and evolution of the long non-coding genes in the X-inactivation center. Biochimie 93, 1935–1942 (2011).

  88. Cordaux, R. & Batzer, M. A. The impact of retrotransposons on human genome evolution. Nat. Rev. Genet. 10, 691–703 (2009).

  89. Kelley, D. R. & Rinn, J. L. Transposable elements reveal a stem cell specific class of long noncoding RNAs. Genome Biol. 13, R107 (2012).

  90. Kapusta, A. et al. Transposable elements are major contributors to the origin, diversification, and regulation of vertebrate long noncoding RNAs. PLoS Genet. 9, e1003470 (2013).

  91. Young, J. M. et al. DUX4 binding to retroelements creates promoters that are active in FSHD muscle and testis. PLoS Genet. 9, e1003947 (2013).

  92. Seila, A. C. et al. Divergent transcription from active promoters. Science 322, 1849–1851 (2008).

  93. Jensen, T. H., Jacquier, A. & Libri, D. Dealing with pervasive transcription. Mol. Cell 52, 473–484 (2013).

  94. Wu, X. & Sharp, P. A. Divergent transcription: a driving force for new gene origination? Cell 155, 990–996 (2013).

  95. Gotea, V., Petrykowska, H. M. & Elnitski, L. Bidirectional promoters as important drivers for the emergence of species-specific transcripts. PLoS ONE 8, e57323 (2013).

  96. Ruf, S. et al. Large-scale analysis of the regulatory architecture of the mouse genome with a transposon-associated sensor. Nat. Genet. 43, 379–386 (2011).

  97. Soumillon, M. et al. Cellular source and mechanisms of high transcriptome complexity in the mammalian testis. Cell Rep. 3, 2179–2190 (2013).

  98. Johnson, R. & Guigo, R. The RIDL hypothesis: transposable elements as functional domains of long noncoding RNAs. RNA 20, 959–976 (2014).

  99. Elisaphenko, E. A. et al. A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements. PLoS ONE 3, e2521 (2008).

  100. Carrieri, C. et al. Long non-coding antisense RNA controls Uchl1 translation through an embedded SINEB2 repeat. Nature 491, 454–457 (2012).

  101. Holdt, L. M. et al. Alu elements in ANRIL non-coding RNA at chromosome 9p21 modulate atherogenic cell functions through trans-regulation of gene networks. PLoS Genet. 9, e1003588 (2013).

  102. Hacisuleyman, E., Shukla, C. J., Weiner, C. L. & Rinn, J. L. Function and evolution of local repeats in the Firre locus. Nat. Commun. 7, 11021 (2016).

  103. Hacisuleyman, E. et al. Topological organization of multichromosomal regions by the long intergenic noncoding RNA Firre. Nat. Struct. Mol. Biol. 21, 198–206 (2014).

  104. Memczak, S. et al. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature 495, 333–338 (2013).

  105. Chodroff, R. A. et al. Long noncoding RNA genes: conservation of sequence and brain expression among diverse amniotes. Genome Biol. 11, R72 (2010).

  106. Bassett, A. R. et al. Considerations when investigating lncRNA function in vivo. eLife 3, e03058 (2014). This paper provides important practical guidelines for choosing methods for perturbing lncRNA functions and interpreting the results.

  107. Goto, T. & Monk, M. Regulation of X-chromosome inactivation in development in mice and humans. Microbiol. Mol. Biol. Rev. 62, 362–378 (1998).

  108. Sasaki, Y. T., Ideue, T., Sano, M., Mituyama, T. & Hirose, T. MENε/β noncoding RNAs are essential for structural integrity of nuclear paraspeckles. Proc. Natl Acad. Sci. USA 106, 2525–2530 (2009).

  109. Cornelis, G., Souquere, S., Vernochet, C., Heidmann, T. & Pierron, G. Functional conservation of the lncRNA NEAT1 in the ancestrally diverged marsupial lineage: evidence for NEAT1 expression and associated paraspeckle assembly during late gestation in the opossum Monodelphis domestica. RNA Biol. http://dx.doi.org/10.1080/15476286.2016.1197482 (2016).

  110. Ounzain, S. et al. CARMEN, a human super enhancer-associated long noncoding RNA controlling cardiac specification, differentiation and homeostasis. J. Mol. Cell Cardiol. 89, 98–112 (2015).

  111. Rinn, J. L. et al. Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs. Cell 129, 1311–1323 (2007).

  112. Schorderet, P. & Duboule, D. Structural and functional differences in the long non-coding RNA Hotair in mouse and human. PLoS Genet. 7, e1002071 (2011).

  113. Li, L. et al. Targeted disruption of Hotair leads to homeotic transformation and gene derepression. Cell Rep. 5, 3–12 (2013).

  114. Klattenhoff, C. A. et al. Braveheart, a long noncoding RNA required for cardiovascular lineage commitment. Cell 152, 570–583 (2013).

  115. Maamar, H., Cabili, M. N., Rinn, J. & Raj, A. linc-HOXA1 is a noncoding RNA that represses Hoxa1 transcription in cis. Genes Dev. 27, 1260–1271 (2013).

  116. Lipovich, L. et al. Activity-dependent human brain coding/noncoding gene regulatory networks. Genetics 192, 1133–1148 (2012).

  117. Durruthy-Durruthy, J. et al. The primate-specific noncoding RNA HPAT5 regulates pluripotency during human preimplantation development and nuclear reprogramming. Nat. Genet. 48, 44–52 (2016).

  118. Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).

  119. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  120. Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012).

  121. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  122. Fagerberg, L. et al. Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics. Mol. Cell Proteom. 13, 397–406 (2014).

  123. Lerch, J. K. et al. Isoform diversity and regulation in peripheral and central neurons revealed through RNA-seq. PLoS ONE 7, e30417 (2012).

  124. Pollard, K. S., Hubisz, M. J., Rosenbloom, K. R. & Siepel, A. Detection of nonneutral substitution rates on mammalian phylogenies. Genome Res. 20, 110–121 (2010).

  125. Schwartz, M. P. et al. Human pluripotent stem cell-derived neural constructs for predicting neural toxicity. Proc. Natl Acad. Sci. USA 112, 12516–12521 (2015).

  126. Bergmann, J. H. et al. Regulation of the ESC transcriptome by nuclear long noncoding RNAs. Genome Res. 25, 1336–1346 (2015).

  127. Migeon, B. R. et al. Human X inactivation center induces random X chromosome inactivation in male transgenic mice. Genomics 59, 113–121 (1999).

  128. Heard, E. et al. Human XIST yeast artificial chromosome transgenes show partial X inactivation center function in mouse embryonic stem cells. Proc. Natl Acad. Sci. USA 96, 6841–6846 (1999).

  129. Kurian, L. et al. Identification of novel long noncoding RNAs underlying vertebrate cardiovascular development. Circulation 131, 1278–1290 (2015).

  130. Gong, C. et al. A long non-coding RNA, LncMyoD, regulates skeletal muscle differentiation by blocking IMP2-mediated mRNA translation. Dev. Cell 34, 181–191 (2015).

  131. Wang, Y. et al. Arabidopsis noncoding RNA mediates control of photomorphogenesis by red light. Proc. Natl Acad. Sci. USA 111, 10359–10364 (2014).

  132. Grant, J. et al. Rsx is a metatherian RNA with Xist-like properties in X-chromosome inactivation. Nature 487, 254–258 (2012).

  133. Kok, F. O. et al. Reverse genetic screening reveals poor correlation between morpholino-induced and mutant phenotypes in zebrafish. Dev. Cell 32, 97–108 (2015).

Acknowledgements

The author thanks A. Shkumatava, A. Mallory, M. Garber, E. Hornstein, H. Hezroni and N. Gil for discussions and comments on the manuscript. I.U. is the Sygnet Career Development Chair for Bioinformatics and recipient of an Alon Fellowship from The Council for Higher Education of Israel. Work in the Ulitsky laboratory is supported by grants to I.U. from the European Research Council (Project lincSAFARI), the Israel Science Foundation (1242/14 and 1984/14), the Israeli Centers of Research Excellence (I-CORE) Program of the Planning and Budgeting Committee and the Israel Science Foundation (1796/12), the Minerva Foundation, the Fritz-Thyssen Foundation and by research grants from Lapon Raymond and the Abramson Family Center for Young Scientists.

Author information

Corresponding author

Correspondence to Igor Ulitsky.

Ethics declarations

Competing interests

The author declares no competing financial interests.

Glossary

Expressed sequence tag

(EST). Typically a 3′-biased Sanger sequencing read of approximately 700 nucleotides.

Full-length cDNA

A cDNA that ideally captures a full-length mRNA transcript from the 5′ cap to the 3′ polyadenylated tail; sequenced by multiple Sanger sequencing runs.

Homologues

A pair of genes that descended from a common ancestral gene.

Purifying selection

(Also called negative selection). Selective removal of deleterious alleles.

Effective population size

The size of an idealized population that would experience genetic drift in a similar way to the actual population.

Triplex

An RNA structure formed by three strands of RNA: two that form a Watson–Crick duplex and a third that binds in the major groove of the duplex, forming Hoogsteen and reverse Hoogsteen hydrogen bonds.

Syntenic

Preserving order and orientation of genes or other genomic elements between species.

Orthologous

Pertains to homologous genes in different species that have evolved from a common ancestral gene by speciation.

Trans-acting

Regulation that is not cis acting; for example, regulation by diffusible factors that can comparably regulate both homologous loci in a diploid organism.

Cis-acting

Acting from the same molecule, typically interpreted as regulation occurring on the same physical chromosome.

Paralogues

Homologous genes related by duplication within a genome.

Nonsense mutations

Mutations in which a codon encoding an amino acid is mutated into a stop codon.

Exaptation

Co-option of a functionally unrelated DNA sequence for a novel function.

About this article

Cite this article

Ulitsky, I. Evolution to the rescue: using comparative genomics to understand long non-coding RNAs. Nat Rev Genet 17, 601–614 (2016). https://doi.org/10.1038/nrg.2016.85

  • DOI: https://doi.org/10.1038/nrg.2016.85
