Introduction

Single-cell sequencing is expected to yield a novel understanding of genomic stability at the level of individual cell cycles in various cell types [1]. However, sequencing requires a sufficient amount of input DNA, and in many clinical or biological applications this requirement cannot be met because the starting material is limited (e.g., in pre-implantation genetic diagnosis (PGD) or analysis of circulating tumor cells). Whole-genome amplification (WGA) can be used to obtain sufficient material for the genetic analysis of DNA isolated from single cells.

The amplification procedure must ensure high genomic coverage, maintain the inherent sequence composition, and allow reliable quantification of copy-number variation (CNV) [2]. At present, multiple methods of WGA have been developed, each with specific strengths and weaknesses, but with a common defect in that each method distorts the initial template DNA during the course of amplification [3]. The most appropriate method should be selected based on its intended application, for example in the PGD of β-thalassemia.

In this article, we compare two commercially available WGA methods: multiple displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). At present, MDA is the classic and most widely used WGA method; it generates abundant long DNA fragments and has been reported to offer high coverage with little amplification bias [4,5,6]. These features are essential for the simultaneous diagnosis of different hereditary diseases. MDA has been used in PGD of many genetic diseases since 2004 [7,8,9]. The allele dropout (ADO) rate of MDA has been reported to range from 10.0% to 38.9% [10]. MALBAC was developed for WGA of individual cells and relies on quasilinear rather than exponential amplification [11]. It has been reported that MALBAC offers better genomic coverage with less amplification bias, exhibits a high level of specificity and reproducibility, and thus represents an improvement over MDA [12,13,14]. Since the first publication of the MALBAC method in December 2012, studies have applied it to different systems: a cultured cell line, single human sperm [15], single human oocytes [16], and circulating tumor cells. In some studies, detailed comparisons of technical and biological replicates were carried out to confirm the reproducibility of these systems [17].

β-Thalassemia is a group of recessively inherited monogenic disorders, with carrier rates as high as 8.53% in southeast China [18, 19]. Preimplantation genetic diagnosis is used to prevent unnecessary termination of pregnancy in couples at elevated risk of having offspring affected by β-thalassemia, and WGA is the foundation of PGD. Although data sets from different WGA methods have been compared to a limited extent, a systematic evaluation of the strengths and limitations of each approach using next-generation sequencing (NGS) and Sanger sequencing together in the context of β-thalassemia has been lacking.

The capacity to assess genomic CNVs from a single cell by NGS is an important advance for both basic research and clinical diagnostics, with relevance for multiple fields including cancer genetics, PGD, and non-invasive prenatal fetal genome analysis [13, 20,21,22]. In this study, we used Sanger sequencing and NGS to assess the performance of MALBAC and MDA for β-thalassemia genotyping and SNP/CNV detection, starting from single peripheral blood lymphocytes, five-cell samples from homozygous or compound heterozygous cell lines, and discarded embryos. These results can help in selecting the appropriate method for PGD of β-thalassemia and lay the foundation for PGD of other single-gene diseases.

Materials and Methods

Experimental design

The general strategies are summarized in Fig. 1a. The single cells were subjected to WGA using one of the two methods (MALBAC and MDA). Then, a small portion of the WGA product was reamplified using PCR primers for β-globin enrichment, and the product was sequenced by Sanger sequencing. The rest of the WGA product was sequenced by NGS at a low sequencing depth, which is sufficient for measuring CNVs and SNPs in the same run (Fig. 1b).

Fig. 1

a Experimental design. We collected peripheral blood, cell lines, and discarded embryos, and carried out MALBAC and MDA at both the single-cell level and the five-cell level. We detected and statistically analyzed the amplification efficiency, positive predictive value, sensitivity, ADO, SNPs, and CV values between the two methods. b Evaluation and testing flow for each sample. A small portion of the WGA products were reamplified by PCR primers for β-globin enrichment, and the product was sequenced by Sanger sequencing. The rest of the WGA product was sequenced by NGS at a low sequencing depth, which is sufficient for measuring CNVs and SNPs in the same run. (colour figure online)

Sample collection and cell isolation

Eleven peripheral blood samples from β-thalassemia carriers, including carriers of the most common mutation in the Chinese population, were confirmed by the reverse dot blot method (Table S1) [23]. Seven compound heterozygous or homozygous cell lines were immortalized with Epstein-Barr virus (EBV) (Table S2), and their genotypes were confirmed by melting curve analysis [24]. Single lymphocytes were isolated by micromanipulation as previously described [25]. The 15 discarded embryos, at the cleavage and blastocyst stages, were required to come from couples in which at least one parent was a carrier of β-thalassemia. Prior to treatment, we obtained ethical approval from the ethics committee of the cooperating unit and written informed consent from every participant. The blastomeres were biopsied using hyaluronidase and the blastocysts were biopsied using a laser. Each cell was washed three times and transferred into a thin-walled 0.2-mL PCR tube containing lysis buffer. After the cells were placed into the tubes, a trace of washing buffer was aspirated to serve as a negative control.

WGA of single cells and product purification

Multiple displacement amplification was performed using the Repli-g single cell Kit (Qiagen, Germany), while MALBAC was performed using the MALBAC single cell WGA Kit (YK001B). For details, please refer to the instructions supplied with these kits. Amplified DNA products were purified with quick DNA products purification kits and eluted into 50 μL PE buffer.

Examination of mutations in the β-globin gene

We used primers to detect β-thalassemia mutations. Each 50 μL reaction contained 25 μL of 2× GoldStar best MasterMix (Beijing ComWin Biotech), 40 ng of WGA product or genomic DNA, and primers at 10 pmol/μL. The PCR products were subsequently submitted for Sanger sequencing.

Selection and detection of SNPs

The criteria for selecting SNPs were as follows: the carrier rate in the Asian population was >0.1; the SNP could be efficiently amplified from whole-genome amplification products; and the SNP was located within the 1 Mb region around the point mutation. On this basis, 30 SNPs were selected upstream and downstream of each mutation site. A portion of the SNP loci and the corresponding primer sequences are shown in the supplementary information (Table S4). Multiplex PCR with the GoldStar best MasterMix was used to amplify these SNPs simultaneously. The PCR products were pooled with the WGA products for NGS.
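The three selection criteria can be expressed as a simple filter. The following is a minimal sketch and not the authors' pipeline: the `Snp` record, its field names, and all example coordinates and frequencies are hypothetical, the carrier rate is treated as a population frequency, and the 1 Mb region is interpreted as at most 1 Mb on either side of the mutation, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str          # hypothetical identifier
    position: int      # coordinate on the same chromosome as the mutation
    asian_freq: float  # assumed: carrier rate in the Asian population
    amplifies: bool    # assumed flag: amplifies efficiently from WGA product


def select_snps(candidates, mutation_pos, window=1_000_000, min_freq=0.1):
    """Return SNPs that pass all three criteria, nearest to the mutation first."""
    kept = [
        s for s in candidates
        if s.asian_freq > min_freq
        and s.amplifies
        and abs(s.position - mutation_pos) <= window
    ]
    return sorted(kept, key=lambda s: abs(s.position - mutation_pos))


# Hypothetical candidates around a made-up mutation coordinate.
candidates = [
    Snp("snp_a", 5_300_000, 0.25, True),
    Snp("snp_b", 5_900_000, 0.05, True),   # fails the frequency criterion
    Snp("snp_c", 7_500_000, 0.40, True),   # lies outside the window
    Snp("snp_d", 5_100_000, 0.30, False),  # does not amplify efficiently
    Snp("snp_e", 4_800_000, 0.15, True),
]
selected = select_snps(candidates, mutation_pos=5_250_000)
print([s.rsid for s in selected])
```

Sorting by distance is a design choice of this sketch; it makes it easy to take the closest loci on each side when a fixed number of SNPs per mutation site is wanted.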

Detection of CNVs

Libraries were constructed for the Illumina HiSeq2500 sequencer. Reads were aligned to the reference genome hg19. We used a hidden Markov model to determine likely diploid regions by comparing the coverage normalized by total reads. The coefficient of variation (CV) was used to evaluate the degree of CNV dispersion and was calculated as the ratio of the standard deviation to the mean of the data.
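As an illustration of the CV calculation, the sketch below computes the ratio of the standard deviation to the mean for a list of values. It assumes the input is a list of normalized read depths per genomic bin and uses the sample standard deviation; the text does not state either choice, and the example values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV = standard deviation / mean (sample standard deviation assumed)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean is zero; CV is undefined")
    return statistics.stdev(values) / mean


# Hypothetical normalized read depths per bin for two amplified samples.
uniform_bins = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
noisy_bins = [1.0, 1.6, 0.4, 1.3, 0.7, 1.0]

cv_uniform = coefficient_of_variation(uniform_bins)
cv_noisy = coefficient_of_variation(noisy_bins)
print(f"uniform CV = {cv_uniform:.4f}, noisy CV = {cv_noisy:.4f}")
```

A lower CV indicates more even coverage across bins, which is the sense in which the CV is used to compare amplification uniformity.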

Calculation of the ADO rate

ADO is defined as the failure of one of the two parental alleles to be amplified. For example, if single-cell whole-genome amplification (MDA or MALBAC) is performed on blood samples from nine cases, and m cases are diagnosed accurately while n cases are misdiagnosed compared with their genomic DNA, then there are n cases of ADO and the ADO rate for mutation sites is n/9. Similarly, the ADO rate for SNP detection is the number of misdiagnosed SNPs divided by the total number of SNPs.
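The ADO calculation can be sketched as a comparison between genotypes from genomic DNA and genotypes called after WGA. This is an illustrative sketch under stated assumptions: genotypes are written as two-letter strings, a dropout is counted when a site that is heterozygous in genomic DNA is called homozygous for one of its two alleles after WGA, and only heterozygous reference sites enter the denominator. The genotype data are hypothetical.

```python
def is_dropout(reference, call):
    """True if a heterozygous reference site is called homozygous after WGA."""
    heterozygous_reference = len(set(reference)) == 2
    homozygous_call = len(set(call)) == 1
    return heterozygous_reference and homozygous_call and call[0] in reference


def ado_rate(pairs):
    """Fraction of heterozygous reference sites that show allele dropout."""
    informative = [(ref, call) for ref, call in pairs if len(set(ref)) == 2]
    if not informative:
        raise ValueError("no heterozygous reference sites to evaluate")
    dropouts = sum(is_dropout(ref, call) for ref, call in informative)
    return dropouts / len(informative)


# Nine hypothetical cases: (genotype from genomic DNA, genotype after WGA).
cases = [
    ("AG", "AG"), ("AG", "AA"), ("CT", "CT"),
    ("CT", "CT"), ("AG", "AG"), ("CT", "TT"),
    ("AG", "AG"), ("CT", "CT"), ("AG", "AG"),
]
rate = ado_rate(cases)
print(f"ADO rate = {rate:.3f}")  # two dropouts out of nine cases
```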

Sensitivity and positive predictive value

Sensitivity was defined as the number of true positive calls divided by the number of all true cases (true positives + false negatives); the higher the sensitivity, the lower the chance of a missed diagnosis. The positive predictive value (PPV) was defined as the number of true positive calls divided by the total number of positive calls (true positives + false positives); PPV is the indicator that is most informative for clinicians [20].
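The two definitions translate directly into code. The sketch below is a minimal illustration; the counts are hypothetical and are not taken from Table 1.

```python
def sensitivity(true_pos, false_neg):
    """True positives divided by all true cases (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """True positives divided by all positive calls (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Hypothetical call counts for one WGA method.
tp, fp, fn = 45, 5, 5
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"PPV = {positive_predictive_value(tp, fp):.1%}")
```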

Statistical methods

Statistical analysis was performed using GraphPad Prism 5 software. Rates from two samples (amplification efficiency, PCR efficiency, and ADO) were compared with the χ² test. The mean difference in CV values between MDA and MALBAC was analyzed with the paired two-sample t test. A P value <0.05 was considered statistically significant.
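For readers who want to reproduce a rate comparison without GraphPad, the sketch below implements a Pearson χ² test for a 2×2 table in plain Python. It is an assumption that no continuity (Yates) correction is applied; the text does not say which variant was used, so the value may differ slightly from the software output. The example uses the single-cell amplification counts reported in the Results (88 of 99 for MALBAC and 90 of 99 for MDA).

```python
import math


def chi_square_2x2(success_a, total_a, success_b, total_b):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)."""
    a, b = success_a, total_a - success_a
    c, d = success_b, total_b - success_b
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p_value = chi_square_2x2(88, 99, 90, 99)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

With these counts the P value is well above 0.05, in line with the reported absence of a significant difference in single-cell amplification efficiency.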

Results

Amplification rate and ADO rate for β-thalassemia genotyping at the single-cell level

In total, DNA from 99 single lymphocytes carrying different β-globin gene mutation types from peripheral blood (Figure S1a) was amplified by both MALBAC and MDA. For MALBAC, 88 samples (88.9%) produced a diffuse smear of about 300–2000 bp and were successfully reamplified for the β-globin gene (Fig. 2a), whereas the remaining 11 lymphocyte samples failed to amplify. The PCR products were analyzed by Sanger sequencing, and 62 (70.5%) were correctly diagnosed compared with genomic DNA (Fig. 2b). The PPV and sensitivity were 87.5% and 75.7%, respectively (Table 1). The ADO rate for the MALBAC-amplified lymphocyte DNA was 29.5% (Fig. 2c). For MDA, the corresponding values were 90 (90.9%) for amplification efficiency, 73 (81.2%) for successful detection of mutation sites, and 17 (18.8%) for ADO (Fig. 2a–c). There was no significant difference between the two methods at the single-cell level. However, the sensitivity of MDA (95.7%) was higher than that of MALBAC (75.7%), while the PPV values were similar (85.7% vs 87.5%) (Table 1).

Fig. 2

A comparison of key parameters between MALBAC and MDA, using data generated from both the single-cell level and the five-cell level. a Amplification success rate, calculated as the ratio of the successful amplification of the β-globin gene to the total number of samples. b Mutation site detection rate, or the number of correct genotype samples divided by the number of successfully amplified samples. c, d Allele dropout, defined as the percentage of homozygous sites in the single-cell samples where the bulk sample is heterozygous at the same nucleotide site. c is only representative of the mutation sites of the β-globin gene; d represents the total number of SNPs

Table 1 The positive predictive value and sensitivity

Amplification rate and ADO rate for β-thalassemia genotyping at the five-cell level

For each homozygous or compound heterozygous lymphocyte cell line (Figure S1b), approximately five cells were aspirated by micromanipulation, and this was repeated nine times for both MALBAC and MDA. All detection and evaluation procedures were the same as those used at the single-cell level. The amplification efficiency and the rate of successful detection of mutation sites were markedly higher for both methods than at the single-cell level. For MALBAC, 96 of 99 samples (96.97%) were successfully amplified for the β-globin gene, and 77 (80.21%) had correctly detected mutation sites, giving an ADO rate of 19.79% (Fig. 2a–c). The PPV and sensitivity were 88.5% and 90.6%, respectively (Table 1). For MDA, 97 of 99 samples (97.98%) were successfully amplified for the β-globin gene, and 90 (92.78%) had correctly detected mutation sites, giving an ADO rate of 7.22% (Fig. 2a–c); the PPV was 97.80% and the sensitivity was 94.68% (Table 1). The two methods differed significantly in accuracy and ADO rate (P < 0.05).

SNPs and ADO

ADO was also assessed at SNP loci upstream and downstream of the mutation sites. At the single-cell level, ADO occurred at 84 SNPs (14.1%) with MALBAC and at 58 SNPs (9.74%) with MDA (Fig. 2d), a significant difference (P < 0.05). Similarly, at the five-cell level, the ADO rate was lower in the MDA group (2.4%) than in the MALBAC group (5.87%) (Fig. 2d). Thus, MDA has higher accuracy for SNP analysis.

CNV detection and CV value

Whole-genome sequencing of WGA products can identify chromosomal abnormalities efficiently at low sequencing depths (<0.1×), and amplification uniformity can be evaluated through the CV value. Figure 3a shows that MALBAC was more stable and homogeneous than MDA at the single-cell level: the mean CV value was 0.1526 for MALBAC and 0.1878 for MDA, a significant difference (P < 0.01). As the cell number increased, the difference between the two approaches for CNV detection decreased (Fig. 3b). At the five-cell level, the mean CV value was 0.1038 for MALBAC and 0.1359 for MDA, which was not a significant difference, although MALBAC still tended to be more stable and homogeneous than MDA.

Fig. 3

a Comparison of CNVs at the single-cell level. b Comparison of CNVs at the five-cell level. c A comparison of CNVs between the single-cell and five-cell level for MALBAC. d A comparison of CNVs between the single-cell and five-cell level for MDA. The five-cell amplification was better than single-cell amplification for both the MALBAC and MDA methods. (*P < 0.05; **P < 0.01; ***P < 0.001)

Comparison of various indicators between the single-cell and five-cell level for each method

With an increase in the initial number of cells, the amplification efficiency (from 90.9% to 97.98%) and the rate of mutation detection (from 81.2% to 92.78%) improved significantly for the MDA method. Additionally, the ADO rate dropped markedly from 18.8% to 7.22% (Table 2). A similar trend was observed with the MALBAC method, with a significant difference between the single-cell and five-cell experiments.

Table 2 The comparison of various indicators between SC and 5C for each method

At the same time, we compared the CV values between the single-cell and five-cell levels for each method. Figure 3c, d show that five-cell amplification was more uniform than single-cell amplification for both MALBAC and MDA (P < 0.001).

Evaluation of discarded embryos from the cleavage or blastula stage

The embryos were derived from in vitro fertilization and had sperm surrounding the zona pellucida. We therefore dissolved the zona pellucida with acidic Tyrode solution (Millipore) to prevent sperm contamination. Figure 4a, b show a six-cell cleavage-stage embryo before and after dissolution. The embryo was washed three times with PBS and then divided into six single cells in lysis buffer, with three cells used for MALBAC and three for MDA. Figure 4c displays the agarose gel electrophoresis of the WGA products. A small portion of the purified products was used as a template to amplify the β-globin gene and SNPs (Fig. 4d, e). The β-globin gene products were subsequently submitted for Sanger sequencing. Figure 4f shows partial Sanger sequencing results, in which all samples had normal genotypes and the results of the two methods were consistent with each other; the ADO rate was 0 for these results. The SNPs were reamplified by multiplex PCR, and the products were mixed with the WGA product for NGS. Table 3 displays the SNP detection results, and the loci highlighted in red represent instances of ADO. At the same time, we conducted low-depth NGS to exclude chromosomal abnormalities (Fig. 4g). The CNV results differed considerably among the embryonic cells. The subjective morphological score was consistent with the objective chromosome test, both of which indicated an abnormal embryo.

Fig. 4

Evaluation of discarded cleavage-stage embryos. a, b Six-cell cleavage-stage embryo before and after dissolution of the zona pellucida. c Agarose gel electrophoresis of the WGA products; the diffuse smears of 300–2000 bp are MALBAC products (3–1, 3–2, 3–3), with 3–0 as the negative control (PBS); the single bright bands are MDA products (3–1b, 3–2b, 3–3b), with 3–0b as the negative control (PBS). DM2000 and 1 kb ladders (Beijing ComWin Biotech Co., China) were used for the two groups, respectively. d, e A small portion of the purified products was used as a template to reamplify the β-globin gene and SNPs. The β-globin gene fragment was 590 bp long and the SNP fragment was 180 bp long. The DM2000 ladder was used for agarose gel electrophoresis. f Partial Sanger sequencing results, showing that all samples had normal genotypes and that the results of the two methods were consistent with each other. The parents are carriers of β41–42βN. g CNV results from low-depth NGS (0.04×). (colour figure online)

Table 3 The condition of SNP through NGS

Comparing cleavage-stage and blastocyst-stage biopsies (Table 4), we found no significant differences in amplification efficiency, success rate of mutation detection, or ADO rate between biopsy stages for either method. SNP analysis showed a higher incidence of ADO with MALBAC than with MDA (15.6% vs 10.0% at the cleavage stage and 11.8% vs 10.3% at the blastocyst stage); these differences were not significant, but their direction is consistent with the cell-line and lymphocyte results. The single-cell biopsy performed better than the multiple-cell biopsy on the different indexes, which contrasts with the cell-line and lymphocyte results and may be related to the embryonic state and biopsy method. The proportion of invalid values was lower for MDA than for MALBAC (5.7% and 15.2%, respectively), a significant difference (P < 0.05). Discarded embryos are not fully representative of the actual situation but lay the foundation for PGD.

Table 4 Comparison of MALBAC and MDA in discarded embryos

Discussion

In this study, we provide a comprehensive comparison of the performance of the MALBAC and MDA methods. The results indicate that each method has its own advantages and disadvantages, and the choice of WGA method should be determined by the actual clinical application. In PGD, embryos that are free of monogenic disorders should be selected. Here, β-thalassemia was used as the disease model to evaluate genotyping and SNP/CNV detection with the two single-cell amplification methods. A previous study showed that MALBAC was advantageous for SNP and CNV detection when compared with MDA [11]. However, we found that MDA had lower ADO rates, which suggests that MDA outperforms MALBAC for SNP/mutation calling; MDA enabled us to accurately detect β-thalassemia genotypes and SNPs linked to disease-associated alleles. This can be attributed to the characteristics of the phi29 polymerase, which has high replication fidelity because of its 3′−5′ exonuclease proofreading activity. In addition, the MDA fragments were up to 10 kb long, whereas the MALBAC fragments were only 300–2000 bp, which made MALBAC more prone to allele dropout. Furthermore, the difference in ADO between the two methods became more pronounced as the cell number increased, because genomic coverage increased faster with cell number for MDA than for MALBAC. We also used low-depth NGS to compare the performance of MALBAC and MDA for CNV detection. In agreement with previous observations [11, 26], we found that MALBAC leads to lower CV values, a property that makes the method particularly well suited for the detection of CNVs. An additional feature unique to MALBAC is its quasilinear amplification, which reduces the sequence-dependent bias exacerbated by exponential amplification.

Polar bodies, blastomeres, and trophectoderm tissue can all be biopsied for PGD. However, polar bodies and blastomeres are more prone to amplification failure, ADO, and contamination with extraneous DNA because they contain a limited amount of material for genetic analysis. Misdiagnoses have been reported by the ESHRE consortium [27], and a misdiagnosis of β-thalassemia has specifically been reported [28]. In our study, for MDA-amplified DNA, the sensitivity of the SNP/mutation analysis increased with the number of pooled cells, together with a decrease in ADO rates. A similar trend was detected with MALBAC-amplified DNA, although the effects were not significant. Similarly, CNVs detected from single cells or five cells showed a trend toward increasing correlation with calls made from unamplified DNA. A previous study reported a comprehensive assessment of MDA products from single cells or small numbers of cells using a SNP array; in that study, ADO decreased dramatically from 17.9% with a single cell to 0.1% with 10 cells [29]. At the blastocyst stage, about 10 cells can be obtained without a significant negative effect on the embryo. Hence, trophectoderm biopsy can be chosen to reduce these risks.

High-quality DNA samples from embryos are also extremely important for the accuracy of these results. A limitation of this study is that we were unable to collect high-quality donated embryos and instead used discarded cleavage-stage and blastocyst-stage embryos. Some of the results from the embryo group were not consistent with the findings from the cell lines and peripheral blood samples, which we suspect is related to the embryonic state. The φ29 polymerase used in the MDA method has low efficiency with fragmented and/or cross-linked DNA, since it requires average genomic DNA fragment sizes of ~2 kb to amplify DNA without introducing bias [30, 31]. MALBAC is generally more tolerant of damaged DNA, which may explain the better efficiency of the PCR-based approach compared with MDA-based WGA.

Previous studies suggested that trophectoderm biopsies and blastocyst transfers may be more advantageous than cleavage-stage biopsies with respect to the outcome of PGD for monogenic diseases [32,33,34]. Additionally, discarded embryos are not fully representative of the actual situation but lay the foundation for PGD. Diagnosis in trophectoderm cells destined for the placenta may be considered the earliest form of prenatal diagnosis. In the future, trophectoderm cells could potentially provide sufficient material for multiple genetic tests and allow the simultaneous diagnosis of more than one genetic defect.

We conclude that MALBAC is the better option for CNV research, while MDA is better suited for SNP detection. This is because MALBAC provides higher uniformity and stability in CV value analysis, while MDA has higher accuracy in mutation and variant detection. In addition, we recommend using MDA to establish a system for PGD of β-thalassemia. This would allow the simultaneous diagnosis of β-thalassemia and chromosomal diseases through DNA sequencing, and would also reduce misdiagnoses caused by ADO.