Background
The discovery of cell-free fetal DNA (cffDNA) in maternal plasma by Lo et al. in 1997 has inspired various non-invasive prenatal screening (NIPS) applications [1], which avoid the approximately 1-in-100 risk of miscarriage associated with invasive sampling. At present, NIPS for common fetal aneuploidies, based on analysis of cffDNA in maternal plasma, is increasingly used as a first-tier aneuploidy screening strategy in clinical practice [2, 3]. Previous large-scale clinical studies have shown that NIPS is highly accurate in screening for trisomies 21, 18 and 13, with sensitivity and specificity higher than 95% [4–6].
Importantly, the reliability of NIPS depends largely on the assumption that the tested samples contain sufficient fetal DNA [7]. Fetal fraction, the percentage of cell-free DNA (cfDNA) in maternal peripheral blood that is of fetal origin, is generally in the range of 3–30%, with an average of about 13% [4]. cffDNA levels are determined by multiple factors, including gestational age, maternal weight and extraction method [8, 9]. In addition, fetal fraction could decrease further during sample transportation or laboratory work-up in NIPS. Previous research has shown that the extent of chromosomal abnormalities detected in the plasma of women with aneuploid pregnancies is linearly correlated with the cffDNA fraction; thus, the test accuracy of NIPS relies largely on the fetal fraction [10]. Most current NIPS protocols use 4% as the lower fetal fraction cutoff to ensure a reliable result. However, for pregnancies at an early gestational age (GA) or for obese women who require NIPS, low fetal fraction is the major issue to overcome [8]. In addition, NIPS misses about 1% of chromosomal aneuploidy cases, and the most common factor associated with these false negative results is low fetal fraction [11, 12]. For these reasons, it is critical to increase the fetal fraction in order to achieve reliable NIPS results.
cfDNA consists of DNA fragments that are generated from apoptotic cells and released into the circulation after rapid DNA degradation. The size distribution of these DNA fragments has peaks corresponding to nucleosomes (~143 bp) and chromatosomes (nucleosome + linker histone; ~166 bp) [13, 14]. In pregnant women, cffDNA in maternal peripheral blood mainly originates from placental trophoblasts and crosses the placental barrier [1]. In 2010, Lo et al. found that cffDNA exhibits a different length distribution compared with cfDNA from maternal cells, with a reduced proportion of molecules of 166 bp and an increased proportion of molecules shorter than 150 bp in maternal plasma, possibly caused by differential nucleosomal packaging during apoptosis or by differences in the strength of nucleosome binding [15]. Based on these findings, it is theoretically possible to enrich cffDNA fragments from total cfDNA in maternal peripheral blood by size selection.
In the present study, we developed an experimental method for cffDNA enrichment that increased the mean cffDNA fraction by 1.5–4 times while retaining cfDNA complexity fully adequate for NIPS. Moreover, we retrospectively tested 1415 clinical samples, including 1404 routine clinical NIPS samples and 11 false negative NIPS samples, using this new method. Our results demonstrated that the cffDNA enrichment strategy can improve the overall performance of NIPS by reducing false negative results as well as the test failure rate.
Methods
DNA extraction and library construction
Five to 10 mL of peripheral venous blood was collected from each participating pregnant woman in EDTA-containing tubes or Streck blood collection tubes. The blood samples were first centrifuged at 1600×g for 10 min at 4 °C to separate the plasma from peripheral blood cells. The plasma portion was carefully transferred to a polypropylene tube and centrifuged at 16,000×g for 10 min at 4 °C to pellet the remaining cells. Cell-free DNA was extracted from 600 μL of maternal plasma using the QIAamp DSP DNA Blood Mini Kit (Qiagen) following the blood and body fluid protocol. End repair, adaptor ligation and PCR amplification were performed using the Ion Plus Fragment Library Kit (Life Technologies).
Cell-free fetal DNA enrichment
DNA enrichment was performed after end repair and before adaptor ligation during NIPS library construction. Magnetic beads with an average particle size of 1 μm were used to size-select end-repaired DNA fragments smaller than 160 bp. To achieve the highest efficiency, we optimized this step by testing a series of bead concentrations. Magnetic beads were added to the end-repaired DNA fragments, and the tubes were vibrated for at least 3 s. The tubes were then left to stand for 5 min and transferred to a magnetic rack. The supernatant containing the size-selected DNA fragments was then transferred to another tube for adaptor ligation.
Sequencing
DNA library concentration was determined by Qubit and RT-PCR. For DNA sequencing, 15–20 libraries were pooled and sequenced on a JingXin BioelectronSeq 4000 System semiconductor sequencer (CFDA registration permit No. 20153400309) in single-end mode with 400 flows, producing raw sequencing reads of up to 200 bp, with read counts of at least 3.5 million.
Data analysis
Reads were trimmed from the 3′ end using a sequencing quality value threshold of 15, and reads shorter than 50 bp were discarded. The remaining reads were aligned to the human reference genome (hg19) using BWA [16]. Reads that were unmapped or had multiple primary alignment records were filtered out by the FLAG field in the alignment file, using an in-house Perl script. Duplicate reads were identified by Picard (http://picard.sourceforge.net/). The remaining reads were considered unique reads for further analysis. To eliminate the effect of GC bias, we counted the unique reads in each 20-kb bin and then applied an integrated GC correction method consisting of three steps: locally weighted scatterplot smoothing (LOESS) regression [17], intrarun normalization [18], and linear model regression [19]. LOESS regression was performed in R with default parameters. We derived a z score for each chromosome in a test sample by subtracting the mean chromosome ratio in a reference set of euploid control pregnancies from the chromosome ratio in the test case and dividing by the SD of the chromosome ratio in the reference set, according to the following equation:
$$ z\text{ score} = \frac{\text{chromosome ratio}_{\text{test}} - \text{mean chromosome ratio}_{\text{reference}}}{\text{SD of chromosome ratio}_{\text{reference}}} $$
A cutoff value of z score > 3 was used to determine whether the ratio of the chromosome was increased and hence fetal trisomy was present.
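The z score rule described above can be illustrated with a minimal Python sketch. The function names and the use of the sample standard deviation are assumptions made for illustration; the text specifies only "SD" and does not describe the authors' actual implementation. The chromosome ratios are assumed to be GC-corrected already.

```python
from statistics import mean, stdev


def chromosome_z_score(test_ratio, reference_ratios):
    """z score of one chromosome ratio against a euploid reference set.

    test_ratio: chromosome ratio in the test sample (assumed GC-corrected).
    reference_ratios: ratios of the same chromosome in euploid controls.
    Assumption: sample SD (n - 1 denominator); the source only says "SD".
    """
    sd = stdev(reference_ratios)
    if sd == 0:
        raise ValueError("reference set has zero variance")
    return (test_ratio - mean(reference_ratios)) / sd


def is_trisomy_call(z_score, cutoff=3.0):
    """Apply the z score > 3 cutoff stated in the text."""
    return z_score > cutoff
```

A call such as `is_trisomy_call(chromosome_z_score(ratio, reference))` would then flag a chromosome whose ratio is more than 3 SD above the reference mean.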
Estimation of cffDNA fraction
Two types of methods were used to calculate the fetal DNA fraction in maternal plasma. In the first method, the cffDNA fraction for a pregnancy with a male fetus was estimated from the proportion of reads mapped to the Y chromosome. In the second method, the cffDNA fraction was estimated from the length distribution of cfDNA. Because fetal DNA is generally shorter than maternal DNA, plasma samples with a higher fetal DNA fraction have a higher proportion of short plasma DNA fragments (~130–140 bp; region A) and a lower proportion of long plasma DNA fragments (~155–175 bp; region B). LOESS regression was applied to fit the fetal fraction against the read ratio in features A and B, yielding the LOESS fit-predicted fetal fraction PA for feature A and PB for feature B. Because both PA and PB predict the fetal DNA fraction, they should correlate closely. Therefore, we used reference samples to compare PA and PB and thus identify instances of poor agreement. If Pdiff = (PA − PB) × 2/(PA + PB) is larger than 0.40 (larger than in 99% of normal samples), PA and PB are considered inconsistent and the fetal fraction is considered unpredictable. Otherwise, the final predicted fetal fraction was calculated as P = (PA + PB)/2.
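The final step of the size-based estimate, combining PA and PB with the Pdiff consistency check, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, PA and PB are taken as already predicted by the LOESS fits (which are not reproduced here), and Pdiff is computed exactly as written in the text, without an absolute value.

```python
def combine_fetal_fraction(pa, pb, max_pdiff=0.40):
    """Combine two LOESS-predicted fetal fractions into one estimate.

    pa, pb: fetal fractions predicted from features A and B (as proportions).
    Returns (PA + PB) / 2, or None when the two predictions are
    inconsistent and the fetal fraction is considered unpredictable.
    """
    if pa + pb <= 0:
        raise ValueError("PA + PB must be positive")
    # Pdiff as given in the text: (PA - PB) x 2 / (PA + PB)
    pdiff = (pa - pb) * 2 / (pa + pb)
    if pdiff > max_pdiff:
        return None
    return (pa + pb) / 2
```

Returning `None` rather than a number keeps an unpredictable fetal fraction from being mistaken for a low one in downstream checks.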
DNA complexity calculation
To evaluate DNA loss during the enrichment procedure, the amounts of sequencing libraries prepared with or without enrichment were measured using qPCR and the KAPA Library Quantification Kit. qPCR with the primers 5′-CCTCTCTATGGGCAGTCGGTG-3′ and 5′-CCTGCGTGTCTCCGACTCAG-3′ was performed using SYBR Green Realtime PCR Master Mix (TOYOBO). Besides the amount, DNA complexity is a major factor for PCR amplification and sequencing. Lower DNA complexity results in a higher percentage of duplicate reads during sequencing, leading to NIPS test failures. Thus, we applied capture sequencing covering 300 SNPs to libraries constructed with and without bead enrichment. Capture probes (data not provided) were designed and synthesized by Agilent. DNA hybridization and sequencing were performed according to the manufacturer’s instructions. After read mapping and SNP calling, DNA complexity was calculated using the following equation:
$$ \text{DNA complexity} = \frac{\text{unique reads covering SNPs in the design}}{\text{number of SNPs in the panel}} $$
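As a worked illustration of this ratio, the short Python sketch below reads the numerator as the total number of unique (de-duplicated) reads covering the designed SNPs, so that the result is the mean unique depth per SNP. That reading, the function name and the input layout are assumptions; the source does not give further detail.

```python
def dna_complexity(unique_reads_per_snp, panel_size=300):
    """Unique reads covering designed SNPs divided by the panel SNP count.

    unique_reads_per_snp: mapping of SNP identifier -> number of unique
    reads covering that SNP; SNPs with no coverage may simply be absent.
    panel_size: number of SNPs in the capture panel (300 in the text).
    """
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    return sum(unique_reads_per_snp.values()) / panel_size
```

Comparing this value between libraries built with and without bead enrichment gives a simple measure of how much complexity the size selection preserves.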
NIPS sample cohort
Data from 10,000 NIPS tests, 1404 routine NIPS cases and 11 false negative cases were included in this study. All participants gave written informed consent before blood collection. This study was approved by the institutional review board of the Affiliated Obstetrics and Gynecology Hospital of Nanjing Medical University.
Confirmation of original NIPS results
Women with positive NIPS results were recommended confirmatory invasive prenatal diagnosis by amniocentesis followed by karyotyping and/or chromosomal microarray analysis (CMA). Women with negative NIPS results were interviewed 3 months after delivery to record follow-up information, including ultrasound examination findings, pregnancy outcomes, newborn physical examination results, and neonatal/fetal cytogenetic analysis.
Statistical analysis
Statistical comparisons between groups were performed using a chi-square (χ²) test or Fisher’s exact test, and P values of ≤ 0.01 were considered statistically significant.
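For readers unfamiliar with Fisher’s exact test, the following self-contained Python sketch computes a two-sided P value for a 2×2 table from the hypergeometric distribution. It is an illustration only, not the software used in this study; the function name is hypothetical and the two-sided rule (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    col2 = n - col1

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(col1, x) * comb(col2, row1 - x)

    lowest = max(0, row1 - col2)
    highest = min(row1, col1)
    observed = weight(a)
    # Sum the tables that are no more probable than the observed table.
    extreme = sum(
        weight(x) for x in range(lowest, highest + 1) if weight(x) <= observed
    )
    return extreme / comb(n, row1)
```

With the threshold stated above, a comparison would be called significant when `fisher_exact_two_sided(a, b, c, d) <= 0.01`, where the four counts come from the groups being compared.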
Discussion
In this study, we developed a working experimental method for cffDNA enrichment that effectively increases the mean cffDNA fraction while yielding cfDNA fully adequate for NIPS. By retrospectively applying the new method to 1415 clinical samples, including 1404 routine clinical NIPS samples and 11 false negative NIPS samples, we were able to improve test quality by reducing the test-failure rate and the false negative rate.
To date, several approaches to fetal cfDNA enrichment in NIPS have been reported. Qiwei Yang et al. reported a PCR-based enrichment method to selectively amplify fetal cfDNA [20]. Joaquim Vong et al. reported a single-stranded DNA library preparation method to enrich short cfDNA in maternal plasma [21]. Moreover, Stephanie Yu et al. recently reported a size-based rather than count-based method for NIPS, which also showed promise in detecting common trisomies [22]. These works all exploited the size difference of cffDNA in maternal plasma to advance NIPS. However, the feasibility of these methods still needs to be validated in sufficient clinical samples. In this study, to ensure reliable results, clinical samples with detailed follow-up information, including fetal karyotype/CMA results or postnatal interviews, were used to validate our new NIPS method.
The existence of discordant results, including false positives and false negatives, has been regarded as one of the major limitations of NIPS [23]. In clinical practice, invasive confirmatory diagnosis is recommended for all NIPS-positive cases to minimize the adverse effects of false positive results, whereas false negative cases often have harmful consequences. Therefore, it is critical to uncover the underlying mechanisms of these false negative results and improve the testing method accordingly. It is widely accepted that false negative NIPS results are closely associated with low fetal fraction and true fetal mosaicism (TFM) [23]. Although most current NIPS protocols set a fetal fraction threshold of 4% for a reliable result, there is still a chance of a false negative when the fetal fraction is low, for example only slightly higher than 4%. As suggested in the previous literature, all false negative NIPS cases are recommended to undergo fetal cfDNA enrichment, which can help to identify low fetal fraction as the potential cause [24]. Our results showed that about half of the false negative cases (5/11) could be avoided by the new NIPS method. Interestingly, the 5 corrected false negative samples had lower fetal fractions (4.1–6.4%) than the other 6 samples (7.1–12.4%), which also indicates that the false negative results in these 5 cases could be due to low fetal fraction. The other 6 false negative cases may be associated with other causes, such as TFM. However, placental tissue, which was not available in these cases, is required to confirm TFM.
By using our method with cffDNA enrichment, we avoided 9 test-failure results compared with the ordinary NIPS method in the 1404 clinical samples. Importantly, our clinical follow-up information confirmed that the cases with original test-failure results were all true negatives. In clinical practice, women with a ‘no-call’ result are recommended to have a blood re-draw and a repeat test, but a successful result cannot be guaranteed. Avoiding test failures results in a shorter turnaround time and a lower test cost, which should also cause less anxiety for the women.
One downside of our strategy is the elevated overall false positive rate. In this study, there were 2 additional false positive cases among the 1404 clinical samples. As confined placental mosaicism is considered the major factor causing false positives [23], placental chromosomal abnormalities could be amplified by cffDNA enrichment. However, our results indicated that the decrease in specificity (99.8% vs 99.6%) and the increase in false positive rate (0.6% vs 0.7%) in the test cohort were limited and still comparable to other studies [25]. A larger-scale validation of this method would be desirable in the future. Another disadvantage is that the original fetal fraction information is lost with the new method. It has been reported that the level of fetal fraction could be related to some pregnancy complications, such as spontaneous preterm delivery, intrauterine growth retardation (IUGR) and pre-eclampsia in asymptomatic pregnant women, suggesting its potential diagnostic value [26–28]. After fetal cfDNA enrichment, however, this information is no longer available. Although this is beyond the scope of this study, further research could determine whether there is any association between the original fetal fraction, the enriched fetal fraction and pregnancy complications.
It is important to accurately estimate the fetal fraction in NIPS. Previously, cffDNA levels were examined by qPCR [1]. When massively parallel sequencing is performed on both maternal and fetal cfDNA in a given plasma sample, the fetal fraction can be determined by examining genetic elements that differ between maternal and fetal DNA [29], including Y chromosomal markers [30], polymorphic markers [31], DNA methylation markers [32], and size-, count- and nucleosome profile-based methods [22, 33]. Genes on the Y chromosome are the most commonly used distinguishing marker [11, 12], but this method can only be used in pregnancies with male fetuses. Of note, cfDNA in maternal plasma is readily digested into small fragments by natural processes. Because of the small size of these fragments, no additional shearing is required before sequencing. Meanwhile, fetal- and maternal-derived DNA fragments differ in the distribution of their size peaks [15]. Based on these characteristics, size-based estimation of the fetal DNA fraction was established for pregnancies with both male and female fetuses, and was also used in this study and in our previous paper [34].
Although expanding the scope of NIPS remains controversial at present, technologies that use cfDNA to detect fetal copy number variations (CNVs) and single-gene disorders have been under development for years. Several groups have used whole-genome sequencing, SNP-based sequencing and targeted sequencing of maternal plasma DNA and shown their great potential for the detection of fetal microdeletion/microduplication syndromes [35–38]. However, the core statistical procedure is to compare the read dosage in the target region between the test sample and normal controls, which is highly dependent on the cffDNA fraction. Previous studies showed that the minimal cffDNA fraction required for this purpose was 10% [34]. In addition, cfDNA-based NIPS for single-gene disorders is much more challenging, because fetal cfDNA is generally a minor population in maternal plasma, hampering reliable deduction of the maternal inheritance of pathogenic variants at single-nucleotide resolution. Technologically, relative haplotype dosage analysis (RHDO), which uses information on parental haplotypes flanking the variants of interest, has been demonstrated to greatly improve the accuracy of single-gene disorder detection [39]. Therefore, our method of cffDNA enrichment could contribute to expanding NIPS to the prenatal detection of fetal CNVs and single-gene disorders in the future.
Authors’ contributions
PH, DL and ZX designed the experiments. PH, DL, YL, HLi and YC carried out the experiments. DL and HL wrote the manuscript. HL and ZX supervised the project. DL and YL analyzed the data. FQ, TW, CP and DLuo collected the samples. All authors provided critical comments and editorial modifications. All authors read and approved the final manuscript.