Background
Principle for novel ncRNA discovery
Chromatin signatures for novel ncRNA discovery
Principles for evaluating coding potential
Classification | Techniques | Short description | Strengths of the approach | Weakness | Ref |
---|---|---|---|---|---|
Microarrays | Tiling arrays | A probe-based method for discovering transcripts from specific genomic regions. | Provides in-depth analysis of transcripts from target regions of the genome. | Suffers from potential noise caused by weak binding or cross-hybridization of transcripts to probes. | [56] |
Microarrays | Microarrays | A method that uses a large number of oligonucleotide probes for quick global or parallel expression analysis of the transcriptome. | Small size and high-throughput capability. | This method cannot discover novel transcripts. | [57] |
RNA-seq | RNA-seq | Currently the most widespread sequencing technology for both detecting RNA expression and discovering novel RNAs. | Provides global, high-throughput detection and identification of RNAs longer than 200 nt. | The standard procedure is not suitable for detecting RNAs shorter than 200 nt. It also suffers from sequence errors at the reverse-transcription step or from primer bias. | [58] |
RNA-seq | RNA capture sequencing | A derivative technology combining RNA-seq with tiling arrays. | Specifically increases the sequencing depth of target regions. | Suffers from the disadvantages of both tiling arrays and RNA-seq. | [59] |
scRNA-seq | Smart-seq | A scRNA-seq method based on a full-length cDNA amplification strategy. | Provides full-length cDNA amplification of polyadenylated RNAs. | Limitations include the lack of strand-specific identification, the inability to read transcripts longer than 4 kb, and restriction to polyadenylated RNAs. | [60] |
scRNA-seq | DP-seq | A scRNA-seq method using heptamer primers. | Suitable for smaller samples or transcripts longer than 4 kb. This approach also suppresses highly expressed rRNAs in the cDNA library. | Captured RNAs are limited to polyadenylated RNAs. | [61] |
scRNA-seq | Quartz-seq | A scRNA-seq method that reduces background noise. | Reduces background noise by using special suppression PCR primers to reduce side products. | The method is limited to detecting polyadenylated RNAs. | [62] |
scRNA-seq | SUPeR-seq | A single-cell universal polyadenylated tail-independent RNA sequencing method. | Detects polyadenylated and nonpolyadenylated RNAs. Minimal rRNA contamination. | Relatively low sensitivity for nonpolyadenylated RNAs. | [63] |
scRNA-seq | RamDA-seq | A full-length total RNA-sequencing method for analyzing single cells. | High sensitivity for nonpolyadenylated RNAs. It can also uncover the dynamics of recursive splicing. | Unknown | [64] |
Small RNA-seq | Small RNA-seq | A type of RNA-seq that separates small RNAs from larger RNAs to better evaluate and discover novel small RNAs. | Specifically detects and discovers small or intermediate-sized RNAs of target sizes. | Adapter ligation bias leads to reverse transcription bias or amplification bias. | [65] |
Single-cell small-RNA sequencing | Small-seq | A method that detects small RNAs in a single cell. | The method can detect small RNAs in a single cell. | The limitations may be similar to those of small RNA-seq. | [66] |
Nascent RNA-seq | GRO-seq | A method that labels nascent RNAs with 5Br-UTP and immunoprecipitates the labeled RNAs for sequencing. | Detects nascent RNAs and provides a genome-wide view of the location, orientation, and density of Pol II-engaged transcripts. | The method is confounded by contamination due to nonspecific binding, which could result in experimental bias. | [67] |
Nascent RNA-seq | SLAM-seq | A method that distinguishes nascent RNA from total RNA via s4U-to-C conversion induced by nucleophilic substitution chemistry. | An enrichment-free method that avoids contamination introduced by affinity purification. | The oxidation conditions can cause some oxidative damage to guanine, which may affect sequencing accuracy. | [68] |
Nascent RNA-seq | TimeLapse-seq | A method that distinguishes nascent RNA from total RNA via s4U-to-C conversion induced by an oxidative nucleophilic aromatic substitution reaction. | An enrichment-free method that avoids contamination introduced by affinity purification. | The oxidation conditions can cause some oxidative damage to guanine, which may affect sequencing accuracy. | [69] |
Nascent RNA-seq | AMUC-seq | A method that distinguishes nascent RNA from total RNA by transforming s4U into a cytidine derivative using acrylonitrile. | More efficient and reliable because it has minimal influence on the base pairing of other nucleosides. | Unknown | [70] |
Identification of RNA-chromatin interaction | GRID-seq | A method that aims to comprehensively detect and localize all potential chromatin-interacting RNAs. | Uses a bivalent linker to ligate RNA to DNA in situ and provides exact profiles of the RNA-chromatin interactome. | The usable sequence length for mapping RNA is 18–23 bp, and such short sequences can result in mapping ambiguity. | [71] |
Identification of RNA-chromatin interaction | iMARGI | A method providing in situ mapping of the RNA-genome interactome. | iMARGI requires fewer input cells and is suitable for paired-end sequencing. | Unknown | [72] |
Identification of RNA-chromatin interaction | ChAR-seq | A chromatin-associated RNA sequencing method that maps genome-wide RNA-to-DNA contacts. | Uncovers chromosome-specific dosage compensation ncRNAs and genome-wide trans-associated RNAs. | The method needs more than 100 million input cells. | [73] |
Identification of RNA-RNA interaction | CLASH | A relatively early method that uses UV cross-linking to capture direct RNA-RNA hybridization. | Avoids noise from interactions mediated by protein intermediates. | This method only detects RNA-RNA interactions based on proteins. | [74] |
Identification of RNA-RNA interaction | RIPPLiT | A transcriptome-wide method for probing the 3D conformations of RNAs stably associated with defined proteins. | The method can capture 3D RNP structural information independent of base pairing. | This method only detects RNA-RNA interactions based on proteins. | [75] |
Identification of RNA-RNA interaction | MARIO | A method that identifies RNA-RNA interactions in the vicinity of all RNA-binding proteins using a biotin-linked reagent. | This method can identify RNA-RNA interactions in the vicinity of all RNA-binding proteins. | This method only detects RNA-RNA interactions based on proteins. | [76] |
Identification of RNA-RNA interaction | PARIS | Psoralen analysis of RNA interactions and structures with high throughput and resolution. | Directly measures RNA-RNA interactions independent of proteins in living cells. | Unknown | [77] |
Identification of RNA-RNA interaction | LIGR-seq | A method for the global-scale mapping of RNA-RNA interactions in vivo. | Provides global-scale mapping of RNA-RNA interactions independent of proteins in vivo. | Unknown | [78] |
Identification of RNA-RNA interaction | SPLASH | A method providing pairwise RNA-RNA partnering information genome-wide. | Maps pairwise RNA interactions in vivo with high sensitivity and specificity, genome-wide. | Unknown | [79] |
Identification of RNA-RNA interaction | RIC-seq | RNA in situ conformation sequencing technology for the global mapping of intra- and intermolecular RNA-RNA interactions. | The method performs RNA proximity ligation in situ and can facilitate the generation of 3D RNA interaction maps. | Unknown | [80] |
Identification of RNA-RNA interaction | RNA proximity sequencing | A method based on massive-throughput RNA barcoding of particles in water-in-oil emulsion droplets. | This method can detect multiple RNAs in proximity to each other without ligation and is well suited to studying the spatial organization of RNAs in the nucleus. | Unknown | [81] |
RNAs in protein complexes or subcellular structures | FISSEQ | A method that offers in situ information on RNAs at high throughput. | Provides information on RNAs at high throughput. Visualization. | Unknown | [82] |
RNAs in protein complexes or subcellular structures | CeFra-seq | A method that physically isolates subcellular compartments and identifies their RNAs. | The method has high sensitivity for low-abundance transcripts. | The method is limited by the isolation protocols and the purity of the resulting isolates. | [83] |
RNAs in protein complexes or subcellular structures | APEX-RIP | A method that can map organelle-associated RNAs in living cells via proximity biotinylation combined with protein-RNA crosslinking. | The technique can offer high specificity and sensitivity in targeting the transcriptome of membrane-bound organelles. | Unknown | [84] |
Characteristics of known ncRNAs
Principle and strategy for identification of novel ncRNAs
Approaches for discovering ncRNAs
Tiling arrays and microarrays
RNA-seq
Small RNA-seq and single-cell small-RNA sequencing
Single-cell RNA sequencing (scRNA-seq)
Nascent RNA-seq
Innovative techniques based on RNA location and interactome for functional ncRNA discovery
RNA-chromatin interaction
RNA-RNA spatial interactions
RNAs in protein complexes or subcellular structures
NcRNA database
Cancer or basis | Database | Species | Website | Short description | Ref |
---|---|---|---|---|---|
Cancer | Lnc2Cancer v2.0 | lncRNA |  | An updated database that provides comprehensive experimentally supported associations between lncRNAs and human cancers. | [180] |
Cancer | TANRIC | lncRNA |  | This database characterizes the expression profiles of lncRNAs in large patient cohorts of 20 cancer types, including TCGA and independent datasets (> 8000 samples overall). | [179] |
Cancer | lnCaNet | lncRNA |  | This database provides a comprehensive co-expression data resource that reveals the interactions between lncRNAs and non-neighbouring cancer genes. | [181] |
Cancer | LncRNADisease 2.0 | lncRNA |  | A database integrating comprehensive experimentally supported and predicted lncRNA-disease associations. | [182] |
Cancer | The Cancer LncRNome Atlas | lncRNA |  | An academic research database for exploring lncRNA alterations across multiple human cancer types. | [194] |
Cancer | SELER | lncRNA |  | A database of super-enhancer-associated lncRNA-directed transcriptional regulation in human cancers. | [195] |
Cancer | CSCD | circRNA |  | A database that focuses on distinguishing cancer-specific circRNAs from noncancerous circRNAs, and reports predicted cellular location, RBP sites, and ORFs. | [183] |
Cancer | Circ2Traits | circRNA |  | Provides circRNA-disease associations based on the interaction of circRNAs with disease-related miRNAs and on SNPs mapped to circRNA loci. | [184] |
Cancer | CircR2Disease | circRNA |  | Provides a comprehensive resource for circRNA deregulation in various diseases, containing 725 associations between 661 circRNAs and 100 diseases. | [185] |
Cancer | CircRNA disease | circRNA |  | A manually curated database of experimentally supported circRNA-disease associations. | [196] |
Cancer | MiOncoCirc | circRNA |  | CircRNA detection in 2093 clinical human cancer samples using exome capture sequencing. | [119] |
Cancer | CircRiC | circRNA |  | A database focusing on lineage-specific circRNAs in 935 cancer cell lines, including drug response. | [197] |
Cancer | miRCancer | miRNA |  | A database that currently documents more than 9000 relationships between 57,984 miRNAs and 196 human cancers. | [186] |
Cancer | SomamiR 2.0 | miRNA |  | A database of cancer somatic mutations in microRNAs (miRNA) and their target sites that potentially alter the interactions between miRNAs and competing endogenous RNAs (ceRNA). | [187] |
Cancer | OncomiR | miRNA |  | An online resource for exploring miRNA dysregulation in cancer. | [188] |
Cancer | miRCancerdb | miRNA |  | An easy-to-use database for investigating the microRNA-dependent regulation of target genes involved in the development of cancer. | [189] |
Cancer | miR2Disease | miRNA |  | A database that aims to provide a comprehensive resource of microRNA deregulation in various human diseases. | [198] |
Cancer | YM500v3 | small ncRNA |  | A database that contains more than 8000 small RNA-seq datasets and focuses on piRNAs, tRFs, snRNAs, snoRNAs, and miRNAs. | [191] |
Cancer | tRF2Cancer | small ncRNA |  | A web server to detect tRFs and their expression in multiple cancers. | [192] |
Cancer | MINTbase v2.0 | small ncRNA |  | A framework for the interactive exploration of mitochondrial and nuclear tRNA fragments. | [193] |
Basis | LNCipedia | lncRNA |  | A public database for lncRNA sequence and annotation. | [199] |
Basis | LNCediting | lncRNA |  | This database provides a comprehensive resource for the functional prediction of RNA editing in lncRNAs. | [200] |
Basis | lncRNAdb v2.0 | lncRNA |  | This database provides comprehensive annotations of eukaryotic lncRNAs. | [201] |
Basis | LncRNAWiki | lncRNA |  | This database is a publicly editable and open-content platform for community curation of human lncRNAs. | [202] |
Basis | LncBook | lncRNA |  | This database is a curated knowledgebase of human lncRNAs. | [203] |
Basis | MONOCLdb | lncRNA |  | A database of 20,728 mouse lncRNA genes. | [204] |
Basis | NONCODE | lncRNA |  | An interactive database that aims to present the most complete collection and annotation of ncRNAs, especially lncRNAs, from 17 species. | [205] |
Basis | CircAtlas | circRNA |  | An integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes. | [206] |
Basis | circBase | circRNA |  | A database containing thousands of recently identified circRNAs in eukaryotic cells. | [207] |
Basis | CIRCpedia v2 | circRNA |  | A database for comprehensive circRNA annotation from over 180 RNA-seq datasets across six different species. | [208] |
Basis | TSCD | circRNA |  | A tissue-specific circRNA database built from RNA-seq datasets that characterizes the features of circRNAs in human and mouse. | [209] |
Basis | starBase v2.0 | miRNA |  | A database decoding miRNA-ceRNA, miRNA-ncRNA, and protein–RNA interaction networks from large-scale CLIP-Seq data. | [210] |
Basis | miRTarBase | miRNA |  | A resource for experimentally validated microRNA-target interactions. | [211] |
Basis | miRmine | miRNA |  | A database of human miRNA expression profiles. | [212] |
Basis | EVmiRNA | miRNA |  | A database focusing on miRNA expression profiles in extracellular vesicles. | [213] |
Basis | miRGate | miRNA |  | A curated database of human, mouse, and rat miRNA–mRNA targets. | [214] |
Basis | miRBase | miRNA |  | A database containing microRNA sequences from 271 organisms: 38,589 hairpin precursors and 48,860 mature microRNAs. | [215] |
Basis | DIANA-TarBase v8 | miRNA |  | A reference database devoted to the indexing of experimentally supported miRNA targets. | [216] |
Basis | DASHR 2.0 | small ncRNA |  | A database that integrates human small ncRNA genes and mature products derived from all major RNA classes. | [217] |
Application of cancer-related ncRNA identification for diagnosis
Species | Name | Expression in cancer | Diseases | Application | Patent number |
---|---|---|---|---|---|
circRNA | hsa_circ_0028185 | Up | Hepatocellular carcinoma | Cancer auxiliary diagnosis | CN111004850A (2020) |
circRNA | hsa_circ_001477 | Up | Gastric cancer | Cancer diagnosis | CN110129324A (2019) |
circRNA | hsa_circRNA_012515 | Up | Non-small cell lung cancer | Cancer diagnosis | CN110592223A (2019) |
circRNA | hsa_circRNA_405124 or hsa_circ_0012152 | Up | Leukemia | Cancer early diagnosis | CN109593859A (2019) |
circRNA | circ_104075 | Up | Liver cancer | Cancer diagnosis | CN109161595A (2019) |
circRNA | circ3823 | Up | Colorectal cancer | Cancer early diagnosis | CN110592220A (2019) |
circRNA | hsa_circ_0021977 | Up | Breast cancer | Cancer diagnosis | CN109022583A (2018) |
circRNA | hsa_circ_0012755 | Up | Prostate cancer | Cancer diagnosis | CN108624688A (2018) |
circRNA | circ_0047921, circ_0007761 and circ_0056285 | Up | Non-small cell lung cancer | Cancer early diagnosis | CN108179190A (2018) |
circRNA | hsa-circRPL15-001 | Up | Chronic lymphocytic leukemia | Cancer diagnosis | CN109055564A (2018) |
circRNA | hsa_circ_0117909 | Up | Acute lymphoblastic leukemia | Cancer diagnosis | CN107937522A (2017) |
circRNA | hsa_circ_0005720 | Down | Acute lymphoblastic leukemia | Cancer diagnosis | CN107937522A (2017) |
circRNA | cRNA-ZFR | Up | Bladder cancer | Cancer diagnosis | CN106011139A (2016) |
lncRNA | lncRNA-AC006159.3 | Down | Colorectal cancer | Cetuximab-resistance diagnosis | CN108949993A (2018) |
lncRNA | lncRNAXLOC_004122, Linc00467 and lncRNAA1049452 | Up | Breast cancer | Cancer bone metastasis diagnosis | CN107699619A (2017) |
lncRNA | LncRNA GENE NO.9 | Up | Bladder cancer | Cancer diagnosis | CN107267636A (2017) |
lncRNA | LINC00516 | Up | Lung cancer | Cancer or cancer metastasis diagnosis | CN108998528A (2018) |
lncRNA | LSAMP-AS1 | Up | Gastric cancer | Cancer diagnosis | CN110628915A (2019) |
miRNA | miRNA-4692 | Down | Hepatocellular carcinoma | Cancer diagnosis | (2018) |
miRNA | miRNA-1266 | Up | Endometrial carcinoma | Cancer diagnosis | CN105907883A (2016) |
miRNA | miR-320 | Down | Cervical cancer | Cancer early diagnosis | CN105506076A (2016) |
miRNA | miRNA-2116 | Up | Lung adenocarcinoma | Cancer metastasis diagnosis | CN104774966A (2015) |
miRNA | miRNA-410 | Up | Prostate cancer | Cancer diagnosis | CN104651492A (2015) |
miRNA | miRNA-1262 | Up | Acute myeloid leukemia | Cancer diagnosis | CN105063052A (2015) |