An update on LNCipedia: a database for annotated human lncRNA sequences

Volders, P.J.; Verheggen, K.; Menschaert, G.; Vandepoele, K.; Martens, L.; Vandesompele, J.; Mestdagh, P.

doi:10.1093/nar/gkv295

Nucleic Acids Res. 2015 Jan;43(Database issue):D174–80. doi: 10.1093/nar/gku1060.

LNCipedia collects long non-coding RNA sequences and annotation from different sources. In version 3.0, over 90,000 new transcripts were added to the database. Of these, 6917 transcripts were obtained from RefSeq by filtering for accession prefix (NR_) and size (200 bp). This filtering strategy, however, is not restricted to long non-coding RNAs and also yields transcripts associated with protein-coding genes. For instance, transcripts with incomplete open reading frames that are subject to nonsense-mediated mRNA decay are also annotated with the accession prefix NR_. These transcripts are generally not considered true lncRNAs and typically exhibit a high coding potential score when assessed by PhyloCSF. The authors therefore chose to exclude these transcripts from the database and to confine their analysis to the RefSeq subset with keyword biomol_ncrna_lncrna, as suggested by RefSeq's Dr. Kim D. Pruitt. This change is reflected in LNCipedia.org update 3.1, and this corrigendum serves to explain the discrepancies in the article caused by this update.
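The difference between the two RefSeq selection strategies can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Record` structure, the accession numbers and the toy data are invented for illustration, and "size (200 bp)" is read here as a minimum transcript length of 200 nucleotides, which is an assumption.

```python
from dataclasses import dataclass

# Assumption: "size (200 bp)" is interpreted as a minimum length of 200 nt.
MIN_LENGTH = 200
LNCRNA_KEYWORD = "biomol_ncrna_lncrna"


@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified view of a RefSeq transcript record."""
    accession: str
    length: int
    keywords: frozenset = frozenset()


def original_filter(rec: Record) -> bool:
    """Version 3.0 strategy: accession prefix NR_ plus a size threshold."""
    return rec.accession.startswith("NR_") and rec.length >= MIN_LENGTH


def revised_filter(rec: Record) -> bool:
    """Version 3.1 strategy: keep only records tagged with the lncRNA keyword."""
    return LNCRNA_KEYWORD in rec.keywords


# Toy records with invented accessions.
records = [
    Record("NR_000001", 1500, frozenset({LNCRNA_KEYWORD})),
    # NR_ transcript of a protein-coding gene (e.g. an NMD candidate), no lncRNA keyword.
    Record("NR_000002", 2300),
    Record("NR_000003", 150),   # too short for the size filter
    Record("NM_000004", 3000),  # mRNA accession prefix
]

original_hits = [r.accession for r in records if original_filter(r)]
revised_hits = [r.accession for r in records if revised_filter(r)]
print(original_hits)
print(revised_hits)
```

In this toy data, the prefix-and-size filter also retains the NR_ transcript that belongs to a protein-coding gene, whereas the keyword filter does not, which mirrors the reason given for the change.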

LNCipedia.org has been updated to version 3.1. As a consequence, the following results need to be adjusted (new values are given first; original values follow in parentheses).

- Under Protein-coding potential

‘When applying our pre-computed cutoff, these transcripts add up to about 25% (26%) of the collection’

- Under HIGH-CONFIDENCE SET

‘3406 (4127) lncRNA transcripts containing at least one TIS are thus withdrawn’

‘As such, 26 633 (27 293) transcripts with a PhyloCSF score higher than 41 are discarded. Finally, the 1624 (2040) PSM containing transcripts from the PRIDE reprocessing pipeline are excluded as well. The resulting set of 79 769 (80 216) transcripts (71% of LNCipedia 3.1) representing 47 877 (48 028) genes (77% (76%)) is referred to as “high-confidence set” and is available for download on the LNCipedia website’

- Table 1

Source	Version	Number of transcripts
Ensembl (52)	75	23 498
RefSeq (43)	December (March) 2014	4774 (6917)
Nielsen et al., 2014 (45)		7656
Hangauer et al., 2013 (46)		5339
NONCODE (44)	4	93 164
LNCipedia (41)	1.0	21 504
Total number of unique transcripts		111 685 (113 513)

‘Overview of data sources contributing to lncRNA content in LNCipedia 3.1 (3.0). In the case of RefSeq, only entries with property “biomol_ncrna_lncrna” were considered’

The following Figures 1 and 4 should be considered in place of the original figures:

Figure 1.

LNCipedia has grown substantially since its first release. The first version (41) was based on sequences and annotation from three different sources and was made available to the public in 2012. For the 2013 release of LNCipedia (unpublished), no additional sources were used, but the existing sources were updated to their most recent versions. For version 3.1 (3.0) of LNCipedia, new sources were added and existing sources were updated.

Figure 4.

Transcripts with a likely coding potential are removed in the definition of a high-confidence set. Transcripts containing small open reading frames (25), translation initiation sites (24), PhyloCSF score greater than 41 or PSMs with an identification confidence higher than 90% are excluded.
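The exclusion criteria in this caption can be sketched as a simple predicate. This is an illustrative sketch under stated assumptions, not the LNCipedia implementation: the `Transcript` fields, the transcript names and the example values are hypothetical, and the thresholds are taken literally from the caption (PhyloCSF score greater than 41, PSM identification confidence higher than 90%).

```python
from dataclasses import dataclass

PHYLOCSF_CUTOFF = 41.0        # from the caption: score greater than 41 is excluded
PSM_CONFIDENCE_CUTOFF = 0.90  # from the caption: confidence higher than 90% is excluded


@dataclass(frozen=True)
class Transcript:
    """Hypothetical per-transcript summary of the coding-potential evidence."""
    name: str
    has_small_orf: bool         # contains a small open reading frame
    has_tis: bool               # contains a translation initiation site
    phylocsf_score: float
    best_psm_confidence: float  # 0.0 when no PSM maps to the transcript


def is_high_confidence(t: Transcript) -> bool:
    """A transcript is kept only if none of the four exclusion criteria applies."""
    return not (
        t.has_small_orf
        or t.has_tis
        or t.phylocsf_score > PHYLOCSF_CUTOFF
        or t.best_psm_confidence > PSM_CONFIDENCE_CUTOFF
    )


# Toy transcripts with invented names and values.
transcripts = [
    Transcript("tx_a", False, False, 12.0, 0.0),   # no coding evidence
    Transcript("tx_b", False, True, 5.0, 0.0),     # contains a TIS
    Transcript("tx_c", False, False, 41.0, 0.0),   # exactly at the cutoff, not above it
    Transcript("tx_d", False, False, 80.5, 0.0),   # PhyloCSF score above the cutoff
    Transcript("tx_e", False, False, 3.0, 0.95),   # confident PSM
]

high_confidence = [t.name for t in transcripts if is_high_confidence(t)]
print(high_confidence)
```

Because the criteria are combined with a logical OR, a transcript flagged by more than one line of evidence is removed only once, so the per-criterion counts reported above need not sum to the total number of excluded transcripts.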

The conclusions of the article are not affected and remain valid. The Authors apologise to Readers for these errors and inconvenience caused.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
