To determine the amounts and structures of the noncanonical bovine coronavirus (BCoV) transcripts in HRT-18 cells, nanopore direct RNA sequencing was employed. Quantification by read counts revealed that BCoV transcripts accounted for ~30% of the total cellular RNA (Fig. 1B). The BCoV transcripts consisted of one or more genome fragments (Fig.
1C), and the sequences of the fragments were identical to those of different portion(s) of the full-length genome. In addition, based on differences in sequence elements and potential synthesis mechanisms, the BCoV transcripts could be divided into two main categories, canonical transcripts (Fig. 1E) and noncanonical transcripts (Fig. 1F), which were present in different quantities (Fig. 1D). The canonical transcripts were the well-established coronavirus sgmRNAs, which carry a leader sequence and are derived from well-defined canonical TRS-Bs (cTRS, Fig. 1A) located immediately upstream of each structural and accessory protein-encoding gene; they are thus defined as canonical sgmRNAs (Fig. 1E). Accordingly, TRS-Bs that shared sequence homology with canonical TRS-Bs but were not located immediately upstream of a structural or accessory protein-encoding gene are defined as noncanonical TRS-Bs (ncTRS, Fig.
1A). In addition, it has been suggested that during coronavirus replication, defective viral genomes (DVGs), which are truncated versions of the genome, can be synthesized irrespective of TRSs [12]. Consequently, based on whether their structures are related to TRSs, noncanonical transcripts were categorized into 2 subcategories: noncanonical sgmRNAs and DVGs (Fig. 1F). The method used for this classification is explained in Additional file 1: Figure S1 and the associated figure legend. In the noncanonical sgmRNA subcategory (Fig.
1F, upper panel), based on whether the sgmRNAs had a leader and whether they were derived from canonical or noncanonical TRS-Bs, the sgmRNAs were further divided into three populations: sgmRNAs without a leader but with a 5′ sequence identical to the flanking sequence of a canonical TRS-B (1. ΔL_cTRS_sgm), sgmRNAs with a leader but derived from a noncanonical TRS-B (2. L_ncTRS_sgm) and sgmRNAs without a leader but with a 5′ sequence identical to the flanking sequence of a noncanonical TRS-B (3. ΔL_ncTRS_sgm). In the DVG subcategory [25] (Fig. 1F, lower panel), DVGs were further divided into four populations based on the presence of sequence elements from the 3′ UTR and/or the 5′ UTR: DVGs with sequence elements from both the 3′ UTR and the 5′ UTR (1. 5′3′DVG), DVGs with a sequence element from the 5′ UTR only (2. Δ3′DVG), DVGs with a sequence element from the 3′ UTR only (3. Δ5′DVG) and DVGs lacking sequence elements from both the 3′ UTR and the 5′ UTR (4. Δ5′3′DVG). Consequently, based on this classification, the RNA transcripts consisting of 1 fragment from one part of the genome shown in Fig. 1C included 5′3′DVG, Δ5′DVG, Δ3′DVG, Δ5′3′DVG, ΔL_cTRS_sgm and ΔL_ncTRS_sgm; the RNA transcripts consisting of 2 fragments from two different parts of the genome included 5′3′DVG, Δ5′DVG, Δ3′DVG, Δ5′3′DVG, canonical sgmRNA and L_ncTRS_sgm; and the RNA transcripts consisting of more than 2 fragments (3, 4, 5 and ≥ 6 in Fig.
1C) from different parts of the genome included 5′3′DVG, Δ5′DVG, Δ3′DVG and Δ5′3′DVG.
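The classification described above can be summarized as a small decision tree. The following Python sketch is illustrative only and is not the pipeline used in this study (that method is described in Additional file 1: Figure S1). It assumes that each read has already been annotated with four hypothetical features: whether it carries the leader, whether its 5′ end or junction maps to a canonical or noncanonical TRS-B (or to neither), and whether it contains elements from the 5′ UTR and the 3′ UTR. The `Read` type, its field names and the `classify` and `tally` helpers are assumptions introduced for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional

# Population labels as named in the text.
CANONICAL_SGM = "canonical sgmRNA"
DL_CTRS_SGM = "ΔL_cTRS_sgm"
L_NCTRS_SGM = "L_ncTRS_sgm"
DL_NCTRS_SGM = "ΔL_ncTRS_sgm"
DVG_5_3 = "5′3′DVG"
DVG_D3 = "Δ3′DVG"
DVG_D5 = "Δ5′DVG"
DVG_D5_3 = "Δ5′3′DVG"


@dataclass(frozen=True)
class Read:
    """Hypothetical per-read annotation (not the study's data format)."""
    has_leader: bool       # leader sequence present at the 5′ end
    trs: Optional[str]     # "cTRS", "ncTRS", or None if TRS-independent
    has_5utr: bool         # contains a sequence element from the 5′ UTR
    has_3utr: bool         # contains a sequence element from the 3′ UTR


def classify(read: Read) -> str:
    """Assign a read to one of the eight populations named in the text."""
    if read.trs is not None:
        if read.trs not in ("cTRS", "ncTRS"):
            raise ValueError(f"unknown TRS type: {read.trs!r}")
        # TRS-related structures are treated as sgmRNAs.
        if read.has_leader:
            return CANONICAL_SGM if read.trs == "cTRS" else L_NCTRS_SGM
        return DL_CTRS_SGM if read.trs == "cTRS" else DL_NCTRS_SGM
    # TRS-independent structures are treated as DVGs, split by UTR content.
    if read.has_5utr and read.has_3utr:
        return DVG_5_3
    if read.has_5utr:
        return DVG_D3
    if read.has_3utr:
        return DVG_D5
    return DVG_D5_3


def tally(reads: Iterable[Read]) -> Counter:
    """Count reads per population, e.g. for quantification by read counts."""
    return Counter(classify(r) for r in reads)
```

In this sketch, the TRS test is applied before the UTR test, so a read is only considered a DVG when its structure is unrelated to any TRS-B. This ordering is a simplifying assumption that mirrors the order in which the subcategories are introduced in the text.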
Based on the results derived from nanopore direct RNA sequencing, it is concluded that (i) in addition to the previously well-defined genomes (Fig.
1A) and canonical sgmRNAs (Fig.
1E), noncanonical transcripts are also synthesized (Fig.
1F) at high abundance (Fig.
1D), and (ii) noncanonical transcripts can be further divided into 2 subcategories: noncanonical sgmRNAs (3 populations, Fig.
1F, upper panel) and DVGs (4 populations, Fig.
1F, lower panel).