Viruses are essentially small protein capsids enclosing relatively simple genetic material. Some viruses hijack the host splicing machinery to produce viral proteins and sustain their life cycle. Human immunodeficiency virus type 1 (HIV-1), the etiologic agent of acquired immunodeficiency syndrome (AIDS), has long served as a model for studying the role of AS in the viral life cycle. HIV-1 usurps the host splicing machinery to generate over 40 different spliced mRNAs from a single full-length unspliced primary transcript, which are then translated into diverse products, including structural proteins and regulatory factors [19]. This sophisticated process depends on the cooperation of multiple positive and negative factors, such as cis-regulatory elements in HIV-1 RNA and trans-acting cellular and viral proteins. Over the past several decades, considerable progress has been made in understanding how HIV-1 regulates its RNA splicing. Because the high mutation frequency of HIV-1 RNA leads to drug resistance, antiviral strategies targeting HIV-1 splicing have become a promising approach to curbing AIDS [20]. Apart from HIV-1, many other viruses depend on AS to complete their life cycles. Here, we take several typical viruses as examples to show how AS is usurped to maximize the coding potential of the viral genome (Table 1).
Table 1
Types of AS identified in AdV, HPV, and IAV
| Virus | Pre-mRNA | Type of AS | Regulatory factors | Outcome |
| --- | --- | --- | --- | --- |
| AdV | E1A RNA | Alternative 5’ splice site and alternative 3’ splice site | Cellular SR proteins | Generate 9S, 10S, 11S, 12S, and 13S viral mRNA |
| AdV | L1 RNA | Alternative 3’ splice site | Cellular SR proteins and viral E4-ORF4; viral cis-acting elements 3RE and 3VDE | Generate 52/55K and IIIa viral mRNA |
| HPV | E6/E7 RNA | Alternative 3’ splice site | Cellular hnRNP and SR proteins; viral cis-acting element SA409 | Generate E6 and E7 mRNA |
| IAV | M RNA | Alternative 5’ splice site | Cellular SR proteins, hnRNP protein family, and NS1-BP; viral polymerase complex and NS1 | Generate M2 mRNA |
| IAV | NS RNA | Alternative 5’ splice site | Viral NS1 | Generate NS2 mRNA |
Adenovirus
The adenovirus (AdV) genome is compact, with few nucleotides that are neither transcribed nor serve regulatory functions. It is generally divided into transcription units according to their phase of expression: the early genes (E1-E4) and the late genes (L1-L5) [21]. Among these, E1A pre-mRNA is well known to undergo splicing through alternative 5’ and 3’ splice site usage, excising introns spanning nucleotides (nt) 1112 to 1225; 974 to 1225; 637 to 852 and 1112 to 1225; 637 to 852 and 974 to 1225; or 637 to 1225 to generate the 13S, 12S, 11S, 10S, or 9S mRNA, respectively [22, 23]. During lytic infection, the 13S and 9S forms are the most abundant in the early and late phases, respectively, and the shift from 13S to 9S depends on SR splicing factors under appropriate ionic conditions [24, 25]. Notably, AS of E1A pre-mRNA is highly sensitive to changes in various parameters; it has therefore been used successfully as a model substrate to characterize the function of SR proteins [26]. Studies have demonstrated that ASF/SF2 (especially its second RNA-binding domain) and SC35 enhance proximal 13S mRNA splicing [27, 28], SRp20 enhances 12S mRNA splicing [29], and SRp54 enhances 9S mRNA splicing [30]. The distinct trans-acting properties of SR proteins might be due to differences in the ability of their arginine/serine-rich domains to bind U1 snRNP [31].
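To illustrate how the listed intron coordinates relate to the five E1A isoforms, the following minimal Python sketch excises each set of introns from a placeholder transcript. The coordinates are those given above. Treating both boundaries as inclusive, 1-based positions is an assumption, as are the function names and the dummy sequence, none of which come from the cited studies.

```python
# Intron coordinates (nt) for each E1A mRNA isoform, as listed in the text.
# Assumption: both coordinates are 1-based and inclusive.
E1A_INTRONS = {
    "13S": [(1112, 1225)],
    "12S": [(974, 1225)],
    "11S": [(637, 852), (1112, 1225)],
    "10S": [(637, 852), (974, 1225)],
    "9S": [(637, 1225)],
}


def removed_length(introns):
    """Total number of nucleotides excised for one isoform."""
    return sum(end - start + 1 for start, end in introns)


def splice(pre_mrna, introns):
    """Return the sequence left after removing the given intron spans."""
    kept = []
    cursor = 1  # 1-based position of the next nucleotide to keep
    for start, end in sorted(introns):
        kept.append(pre_mrna[cursor - 1:start - 1])
        cursor = end + 1
    kept.append(pre_mrna[cursor - 1:])
    return "".join(kept)


# Hypothetical placeholder transcript; the real E1A sequence is not used here.
dummy_pre_mrna = "N" * 1300

for isoform, introns in E1A_INTRONS.items():
    spliced = splice(dummy_pre_mrna, introns)
    print(isoform, removed_length(introns), len(spliced))
```

Under this convention, the amount of excised sequence increases steadily from 13S to 9S, in line with the decreasing size implied by the isoform names.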
Besides the early genes, the adenoviral major late transcription unit (MLTU) is elaborately regulated by AS to generate approximately 20 mRNAs. The MLTU produces a primary transcript of ~28,000 nt, which is polyadenylated at one of five positions to yield five mRNA families (L1-L5), each with co-terminal 3’ ends. Among these, L1 is an alternatively spliced gene in which the last intron is spliced using a common 5’ splice site and two competing 3’ splice sites (11,040 nt and 12,308 nt) to generate two cytoplasmic mRNAs, 52/55K and IIIa, respectively [32, 33]. The 52/55K protein is indispensable for viral genome encapsidation [34], and the best characterized function of the IIIa protein is as a structural component of the capsid [35]. Intriguingly, the proximal 3’ splice site at 11,040 nt is activated in the early phase of infection, resulting in exclusive production of 52/55K, whereas the distal 3’ splice site at 12,308 nt becomes active in the late phase, yielding almost equal amounts of 52/55K and IIIa [36, 37]. Further studies found that IIIa splicing is tightly controlled by two cis-acting viral elements, the 49 nt IIIa repressor element (3RE) and the 28 nt IIIa virus infection-dependent splicing enhancer (3VDE). The 3RE binds the hyperphosphorylated form of SR proteins to inhibit spliceosome assembly on the IIIa 3’ splice site [38, 39], thereby blocking IIIa expression in the early stage of infection. This inhibition is relieved by viral E4-ORF4, which induces dephosphorylation of SR proteins and thereby allows U2 snRNP to be recruited to the branch point [40]. The other element, 3VDE, which consists of the IIIa branch point sequence, pyrimidine tract, and AG dinucleotide, is necessary to activate IIIa splicing in nuclear extracts from AdV-infected HeLa cells (HeLa-NE). 3VDE acts in a U2AF-independent manner, and L4-33K has been identified as an AdV-encoded alternative RNA splicing factor that activates IIIa expression [41, 42]. These results indicate that viruses can not only “steal” the host splicing machinery but also plant an “inside man” within it to regulate viral protein expression.
Apart from the regulatory factors mentioned above, RNA modification and the control of dsRNA production play pivotal roles in efficient splicing of AdV RNAs. N6-methyladenosine (m6A), the most prevalent modification in cellular RNAs, has been found in both early and late adenoviral transcripts [43]. Depletion of the m6A writer methyltransferase-like 3 (METTL3) specifically impairs viral late transcripts by reducing their splicing efficiency, and this biased effect extends to all multiply spliced AdV late RNAs [44]. Moreover, AdV mutants lacking virus-directed ubiquitin ligase activity, but not wild-type viruses, produce abundant dsRNA in the nucleus of infected cells, formed by intron/exon base pairing between top- and bottom-strand transcripts. Consequently, the cytoplasmic dsRNA sensor PKR translocates to the nucleus, triggering the host innate immune response and blocking AS of viral RNAs [45]. Therefore, m6A modification and the prevention of dsRNA formation are necessary for avoiding restriction by host immune sensors and promoting efficient splicing of viral RNAs.
Human papillomavirus
Similarly, the human papillomavirus (HPV) genome can be divided into exclusively early genes (E6 and E7), early and late genes (E1, E2, E4, and E5), and exclusively late genes (L1 and L2). Transcription from promoters p97 and p670 generates pre-mRNAs encoding all the early and late genes, respectively [46]. Subsequently, the 5’ and 3’ splice sites are directly recognized by splicing factors, such as hnRNP or SR proteins, which either repress or stimulate the use of a specific splice site, thereby initiating the splicing process that produces early and late proteins [47, 48]. For instance, E2 inactivates the early polyadenylation signal pAE, causing a switch from early to late gene expression [49]. A splicing enhancer on E2 mRNA interacts with amino acids 236–286 of the cellular RNA-binding protein hnRNP G, promoting specific splicing at the 3’ splice site SA2709 to generate E2 protein [50].
Besides E2, AS of E6 and E7 deserves mention, since increased expression of these two oncoproteins strongly facilitates HPV-associated tumorigenesis [51, 52]. E6 and E7 target the tumor suppressors p53 and pRB, respectively, and inactivate them through proteasome-mediated degradation [53]. Notably, E6 and E7 are derived from the same polycistronic transcript, which contains three exons and two introns, with three 3’ splice sites in intron 1. AS of intron 1 produces four different mRNA isoforms: full-length E6 (E6fl), E6*I, E6*II, and E6*X (also called E6^E7) [54]. The three putative E6* proteins share the N-terminal 44 amino acids of E6fl, with C-terminal truncations or frameshifts into the E7 open reading frame [55]. Among them, E6*I, the most abundant isoform in HPV-related cancers, has been suggested to encode E7 [56–58]. E6/E7 splicing is precisely regulated by the interaction of cis-acting elements, including the branch point sequence (BPS) and splicing silencers, with trans-acting factors. Short stretches of consecutive nucleotides within the E6-coding region upstream of the 3’ splice site SA409, such as AACAAAC in HPV16 and AACUAAC in HPV18, have been identified as the BPS; they are closely related to the efficiency of E6*I splicing and thereby affect the production of E7 [59]. A point mutation at a crucial position can disrupt binding of the BPS to U2 snRNP, causing inefficient splicing and impaired production of E7 protein. Additionally, splicing silencers have been mapped that interact with hnRNP A1/A2, thereby reducing the expression of E6*I and E7 [60, 61]. Other trans-acting factors, such as hnRNP G and SRSF2, can also disrupt the balance between E6 and E7 and further cause apoptosis of infected cells [50, 62]. Since E6 and E7 are important for HPV tumorigenicity, regulating AS to manipulate their expression might be a promising therapeutic strategy against viral carcinogenesis.
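The BPS motifs named above can also be located computationally. The following Python sketch, offered only as an illustration, scans a window upstream of a 3’ splice site for the HPV16 or HPV18 motif and shows how a single point mutation abolishes the match. The toy sequence, the 50 nt window, and the function name are hypothetical and are not taken from the HPV genome or the cited work. Exact string matching is a deliberate simplification of how U2 snRNP recognizes the branch point.

```python
# BPS motifs as given in the text (RNA alphabet).
BPS_MOTIFS = {"HPV16": "AACAAAC", "HPV18": "AACUAAC"}


def find_bps(rna, motif, splice_site_index, window=50):
    """Return 0-based start positions of `motif` found entirely within
    `window` nt upstream of `splice_site_index`.

    `splice_site_index` is the 0-based index of the first exon nucleotide
    downstream of the 3' splice site. The window size is an assumption.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    lower = max(0, splice_site_index - window)
    region = rna[lower:splice_site_index]
    hits = []
    pos = region.find(motif)
    while pos != -1:
        hits.append(lower + pos)
        pos = region.find(motif, pos + 1)
    return hits


# Hypothetical toy intron: motif, pyrimidine-rich stretch, AG, then exon.
toy_rna = "GGCU" + "AACAAAC" + "CGUUGUGUACUUUCUUUUCAG" + "GACCC"
toy_splice_site = len(toy_rna) - len("GACCC")

# Same sequence with one A-to-G substitution inside the motif.
toy_mutant = toy_rna[:9] + "G" + toy_rna[10:]

print(find_bps(toy_rna, BPS_MOTIFS["HPV16"], toy_splice_site))
print(find_bps(toy_mutant, BPS_MOTIFS["HPV16"], toy_splice_site))
```

In this toy case the intact sequence yields one hit and the mutant yields none, mirroring in a very reduced form the idea that a point mutation in the BPS can impair splicing.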
Influenza virus
In addition to DNA viruses, RNA viruses have been reported to usurp the host splicing machinery to expand the coding capacity of their limited genomes [63]. The genome of influenza A virus (IAV) consists of eight negative-sense RNA segments, and both the M and NS genes are well known to express differentially spliced transcripts. Segment 7 gives rise to four differentially spliced isoforms: M1, M2, M3, and M4. M1 and M2 are essential for viral nuclear export, virion packaging, and progeny budding [64, 65], whereas no function has yet been found for M3 and M4 [66]. M42, an M2-related protein, is expressed from M4 mRNA using an alternative start codon and is hypothesized to be a novel ion channel protein that can replace the function of M2 [67]. Shih et al. reported that the viral polymerase complex and the cellular splicing factor SF2/ASF jointly regulate the use of alternative 5’ splice sites in M pre-mRNA and control M2 expression during infection [68, 69]. Another study found that the cellular hnRNP K and NS1-BP proteins direct M segment splicing by binding the 5’ splice site of M2 mRNA. Mutation of either or both of the hnRNP K- and NS1-BP-binding sites results in M segment mis-splicing and attenuated IAV replication [70]. Liu et al. further identified another cellular factor, SRSF5, that is directly involved in M2 production. SRSF5 binds the crucial sites 163/709/712 in M pre-mRNA via its RRM2 domain and recruits U1 snRNP by interacting with U1A to increase M2 expression, subsequently enhancing virus replication in A549 cells and pathogenicity in mice [71]. Apart from the polymerase complex and cellular splicing factors, NS1 has been demonstrated to participate in M2 expression [72, 73]. Although deleting the NS1 gene (DelNS1) usually leads to severe attenuation of IAV in interferon-competent cells, A14U, an adaptive mutation in the 3’ noncoding region of the M segment, can rescue the replication of DelNS1 virus by restoring M2 expression [74]. These data suggest that NS1 contributes to IAV replication by modulating the splicing of M transcripts. Intriguingly, Calderon et al. showed that the avian IAV M segment is spliced with enhanced efficiency when transcribed in mammalian cells, producing excessive M2 protein. The aberrantly high levels of the M2 proton channel prevent fusion of autophagic vesicles with lysosomes, which in turn reduces the efficiency of viral replication and limits the zoonotic potential of avian IAVs [75]. These data provide supporting evidence for the species barrier of avian IAV; however, the exact role of mammalian IAV M2 in host adaptation still needs further study.
Splicing of segment 8 creates mRNAs that encode nonstructural protein 1 (NS1), NS2, and NS3. Full-length NS1 is an RNA-binding protein that is essential for efficient IAV replication and virulence owing to its roles in counteracting the host immune response and regulating viral protein expression [76]. Two different 5’ splice sites are used to generate the truncated NS2 and NS3 [77], which also play important roles in the viral life cycle: NS2 facilitates virus budding and antagonizes the production of interferon (IFN) [78, 79], and NS3 stimulates cytokines to promote pathogenicity [80]. NS1 also plays critical roles in the splicing of viral genes: in a transient replication/transcription system, it has been reported to block, through its N-terminal region, the splicing and nucleocytoplasmic transport of its own mRNA but not of other mRNAs, suggesting that NS1 might maximize its function by suppressing the splicing rate [81–84].