SARS-CoV-2 is a positive-sense single-stranded RNA virus whose genome is relatively unstable and therefore prone to accumulating mutations, at a rate of approximately 9.8 × 10⁻⁴ substitutions/site/year [3‐7]. Structurally, SARS-CoV-2 comprises two groups of proteins: structural proteins (SPs) and non-structural proteins (NSPs). SPs are encoded by four genes: E (envelope), M (membrane), S (spike) and N (nucleocapsid) [8]. NSPs are mostly enzymes or functional proteins that play a role in viral replication and methylation and may induce host responses to infection. They are encoded in several groups, namely ORF1a (NSP1-11), ORF1b (NSP12-16), ORF3a, ORF6, ORF7a, ORF7b, ORF8 and ORF10.
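To give a feel for the substitution rate quoted above, the following minimal sketch converts it into an expected number of substitutions per genome. The genome length of 29,903 nt (the Wuhan-Hu-1 reference) is an assumption that is not stated in the text, and the function name is illustrative only.

```python
# Substitution rate quoted in the text (substitutions per site per year).
RATE_PER_SITE_PER_YEAR = 9.8e-4

# ASSUMPTION: genome length of 29,903 nt (Wuhan-Hu-1 reference); not given in the text.
GENOME_LENGTH_NT = 29_903


def expected_substitutions(rate: float, genome_length: int, years: float = 1.0) -> float:
    """Expected number of substitutions per genome over `years`,
    assuming a constant rate applied uniformly to every site."""
    return rate * genome_length * years


per_year = expected_substitutions(RATE_PER_SITE_PER_YEAR, GENOME_LENGTH_NT)
per_month = expected_substitutions(RATE_PER_SITE_PER_YEAR, GENOME_LENGTH_NT, years=1 / 12)

print(f"Expected substitutions per genome per year:  {per_year:.1f}")
print(f"Expected substitutions per genome per month: {per_month:.1f}")
```

Under these assumptions the rate corresponds to roughly 29 substitutions per genome per year, or about two to three per month. This is a back-of-the-envelope figure that ignores rate variation between sites and lineages.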
A variant can be as simple as a virus bearing a single mutation, or as complex as a combination of mutations that alters the phenotype significantly relative to the original genome. By the beginning of May 2021, more than 1.4 million sequences had been reported, among them 3913 major representative variant genomes identified and included in the global SARS-CoV-2 sequence database operated by the Global Initiative on Sharing All Influenza Data (GISAID) [9]. However, not all genetic mutations lead to variation in major proteins and/or alter virus infectivity. Spike gene mutations account for most of the clinically influential variants of concern (VOCs), while the ORF1a frame of the genome serves as a key region for NSP mutations.
We focus our discussion here on the VOCs that have had major global health impacts since the fourth quarter of 2020, including the Alpha (B.1.1.7), Beta (B.1.351), Gamma (P.1), and Kappa and Delta (B.1.617.1 and B.1.617.2) variants (Table 1).
Table 1
Molecular and clinical characteristics of SARS-CoV-2 variants of concern
| Category | Characteristic | Alpha (B.1.1.7) | Beta (B.1.351) | Gamma (P.1) | Kappa/Delta (B.1.617.1/B.1.617.2) |
|---|---|---|---|---|---|
| Epidemiology | First identified | September 2020, UK | October 2020, South Africa | December 2020, Japan & Brazil | December 2020, India |
| | Global frequency** | 48% | 7% | 7% | 14% |
| | Major geographic distribution | Worldwide | South Africa | South America | Asia |
| Predominant mutations | Spike RBD mutations | N501Y | K417N, E484K, N501Y | K417T, E484K, N501Y | L452R, E484Q, T478K (Delta) |
| | Spike non-RBD mutations | D614G, P681H | D614G | D614G | D614G, P681R |
| Clinical considerations*** | Transmissibility | ↑**** | ↑? | ↑? | ↑? |
| | Virulence | ↑? | ↑? | ↑? | ↑? |
| | Host immune response | ↓ | ↓ | ↓? | ↓? |
| | Diagnostic tools | ↔ | ↔ | ↔ | ↔ |
| Therapeutic considerations*** | Vaccine effectiveness | | | | |
| | mRNA-based | ↔ | ↓ | ↔ | ? |
| | Adenovirus-based | ↓ | ↓ | ↓ | ? |
| | Recombinant protein-based | ↓ | ↓↓ | ? | ? |
| | Inactivated virus-based | ↔ | ↓ | ↔ | ? |
| | Potential therapeutic strategies | | | | |
| | S1 RBD-targeted therapeutics: soluble human recombinant ACE2, anti-RBD nanobodies. Endosomal formation interruption: TMPRSS2 inhibitors (e.g., Camostat), ADAM17 inhibitors. Viral replication-oriented therapies: RdRp inhibitors (e.g., Remdesivir, GS-441524), Cas13d-based PAC-MAN | | | | |
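The predominant spike mutations listed in Table 1 can be treated as sets, which makes it easy to see which substitutions are shared between variants. The sketch below assumes that the four data columns of Table 1 correspond, in order, to Alpha, Beta, Gamma and Kappa/Delta, as the variants are listed in the text. The dictionary and function names are illustrative only.

```python
from collections import defaultdict

# Spike mutations per variant, transcribed from Table 1 (RBD and non-RBD rows).
# ASSUMPTION: column order is Alpha, Beta, Gamma, Kappa/Delta.
# Note: Table 1 marks T478K as specific to Delta within the Kappa/Delta column.
SPIKE_MUTATIONS = {
    "Alpha": {"N501Y", "D614G", "P681H"},
    "Beta": {"K417N", "E484K", "N501Y", "D614G"},
    "Gamma": {"K417T", "E484K", "N501Y", "D614G"},
    "Kappa/Delta": {"L452R", "E484Q", "T478K", "D614G", "P681R"},
}


def shared_mutations(variants):
    """Return the mutations carried by every variant in `variants`."""
    return set.intersection(*(SPIKE_MUTATIONS[v] for v in variants))


def carriers():
    """Invert the table: map each mutation to the variants that carry it."""
    index = defaultdict(list)
    for variant, mutations in SPIKE_MUTATIONS.items():
        for mutation in mutations:
            index[mutation].append(variant)
    return dict(index)


print("Shared by all four columns:", sorted(shared_mutations(list(SPIKE_MUTATIONS))))
print("Shared by Alpha, Beta, Gamma:", sorted(shared_mutations(["Alpha", "Beta", "Gamma"])))
print("Variants carrying E484K:", carriers()["E484K"])
```

With the table as transcribed, D614G is the only mutation common to all four columns, while N501Y is additionally shared by Alpha, Beta and Gamma.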
Spike mutations
The spike protein mediates virus attachment to the human cell-surface angiotensin-converting enzyme 2 (ACE2) receptor, thus facilitating viral entry during infection [10‐12]. It is split into two subunits, S1 and S2. The S1 subunit possesses the receptor-binding domain (RBD), which binds directly to the ACE2 receptor and is also the dominant target of neutralizing antibodies (Abs) against SARS-CoV-2. S1 is thus considered a hotspot for mutations that may have high clinical relevance in terms of virulence, transmissibility, and host immune evasion [13‐16] (Table 2).
Table 2
Major spike mutations
| Residue | Mutation | Location | | | | | Alpha | Beta | Gamma | Kappa/Delta |
|---|---|---|---|---|---|---|---|---|---|---|
| K417 | K417N | RBD | ↑ | ↑ | ↑ | ↑ | | √ | | |
| K417 | K417T | RBD | ↑ | ↑ | ↑ | ↑? | | | √ | |
| L452 | L452R | RBD | ↑ | ↑ | ↑ | ↑ | | | | √ |
| T478 | T478K | RBD | ↑ | ↑ | ↑ | ↑? | | | | √ (Delta) |
| E484 | E484K | RBD | ↑ | ↑ | ↑? | ↑? | √ (partially) | √ | √ | |
| E484 | E484Q | RBD | ↑ | ↑ | ↑? | ↑? | | | | √ |
| N501 | N501Y | RBD | ↑ | ↑ | ↑ | ↑ | √ | √ | √ | |
| D614 | D614G | non-RBD | ↑ | ↑ | ↑ | ↔ | √ | √ | √ | √ |
| P681 | P681H | S1/S2 furin cleavage site | ↔ | ↑? | ↑ | ↑? | √ | | | |
| P681 | P681R | S1/S2 furin cleavage site | ↔ | ↑? | ↑ | ↑? | | | | √ |
The Alpha variant has an N501Y mutation: at residue 501, asparagine (N) has been replaced with tyrosine (Y) [9]. An emerging variant derived from B.1.1.7 also carries the E484K mutation, in which glutamic acid (E) is replaced with lysine (K) [9]. Both the Beta and Gamma variants carry substitutions in addition to N501Y [9]. The Beta variant has E484K and K417N (lysine K replaced with asparagine N), while the Gamma variant has E484K and K417T (lysine K replaced with threonine T) [9]. The latest major variants, Delta and Kappa, which share two mutations, E484Q (glutamic acid E replaced by glutamine Q) and L452R (leucine L replaced by arginine R), were identified in India’s second COVID-19 wave. In addition to these two mutations, Delta harbours a unique mutation, T478K (threonine T replaced by lysine K) [9].
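The substitution labels used above follow a simple pattern: the one-letter code of the original amino acid, the residue position, and the one-letter code of the replacement. As an illustration only, a minimal parser for this notation might look like the sketch below. The names `Substitution`, `parse_substitution` and `describe` are hypothetical, and deletions such as ΔY144 are deliberately out of scope.

```python
import re
from typing import NamedTuple

# Standard one-letter codes for the 20 canonical amino acids.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


class Substitution(NamedTuple):
    ref: str       # original amino acid (one-letter code)
    position: int  # residue number in the protein
    alt: str       # replacement amino acid (one-letter code)


# One letter, one or more digits, one letter (e.g. "N501Y").
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'N501Y' into (ref, position, alt)."""
    match = _PATTERN.match(label.strip())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    ref, position, alt = match.groups()
    if ref not in AMINO_ACIDS or alt not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid code in {label!r}")
    return Substitution(ref, int(position), alt)


def describe(label: str) -> str:
    """Spell out a substitution label in words."""
    sub = parse_substitution(label)
    return (f"{AMINO_ACIDS[sub.ref]} ({sub.ref}) at residue {sub.position} "
            f"replaced by {AMINO_ACIDS[sub.alt]} ({sub.alt})")


for label in ("N501Y", "E484K", "L452R", "T478K"):
    print(label, "->", describe(label))
```

For example, `describe("T478K")` spells out the Delta-specific mutation as a threonine-to-lysine replacement at residue 478, matching the description in the text.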
The S1 mutations significantly increase the binding affinity to ACE2 while lowering the affinity to neutralizing antibodies [17‐21], suggesting a possible explanation for the higher transmissibility and virulence of these variants [22, 23].
Another mutation, D614G, located outside the RBD, is the most widespread mutation, carried by over 99% of prevalent variants since early 2020 [23, 24]. This mutation does not change the virion's binding affinity to ACE2 or to neutralizing Abs, yet it may increase spike density by preserving spike integrity and preventing S1 shedding [25]. With more functional spikes available, D614G variants show increased infectivity and hence increased replication in vitro and earlier transmission in vivo [23, 25, 26]. Recently, a growing number of deletions have been observed in neutralizing Ab-recognizing domains, termed recurrent deletion regions (RDRs), in the N-terminus of the S1 subunit [27]. Deletions in RDRs eliminate epitopes, helping the virus evade host immune surveillance and potentially defeating certain neutralizing Abs or vaccines. A majority of Alpha-derived variants (ΔRDR1, S: ΔHV 69–70, and ΔRDR2, S: ΔY144), Beta-derived variants (ΔRDR4, S: ΔLAL 242–244) and B.1.36 (ΔRDR3, S: ΔI210) carry this kind of mutation [27].
NSP mutations
Two mutation hotspots, NSP1 of ORF1a/ORF1ab and ORF8, have been linked to virulence and transmissibility. NSP1 is a key protein that antagonizes type I interferon induction in the host and thereby benefits replication of the virus [28, 29]. ORF8 is known as an immune-evasive protein that downregulates major histocompatibility complex class I (MHC-I) in host cells [30, 31]. Recently, the Alpha variant, identified from a single immunocompromised individual, was shown to contain a premature stop codon at position 27 of ORF8 [32].
Variants with partial deletions of NSP1 and ORF8 have been identified (e.g., the NSP1: Δ500-532 variant in Sichuan, China, and the ORF8: Δ382 variant in Singapore) [29, 31]. Although truncated NSP1 and ORF8 both contribute to milder infections [29‐31] and variants carrying them account for less than 5% of infections worldwide, they have become the major variants in Africa since late 2020 [9].