Abstract
Objectives: To explore the clinical value of circulating long non-coding RNAs (lncRNAs) as biomarkers to predict fetal congenital heart defects (CHD) in pregnant women. Methods: Differential expression of lncRNAs isolated from the plasma of pregnant women with typical fetal CHD or healthy controls was analyzed by microarray. Gene ontology (GO), pathway and network analyses were performed to study the function of the lncRNAs. Differentially expressed lncRNAs were validated in plasma samples from 62 pregnant women with typical fetal CHD and 62 matched controls by RT-PCR. The sensitivity and specificity of each lncRNA in the diagnosis of fetal CHD were determined by ROC curve analysis. Results: Microarray analysis identified 3694 up-regulated and 3919 down-regulated (fold change ≥2.0) lncRNAs. The top ten significantly differentially expressed, CHD-associated lncRNAs were validated by RT-PCR. Five significantly up-regulated or down-regulated lncRNAs were identified: ENST00000436681, ENST00000422826, AA584040, AA709223 and BX478947, with the AUC of the ROC curves calculated as 0.892, 0.817, 0.755, 0.882 and 0.886, respectively. Conclusions: Specific lncRNAs aberrantly expressed in the plasma of pregnant women with typical fetal CHD may play a key role in the development of CHD and may be used as novel biomarkers for prenatal diagnosis of fetal CHD.
Introduction
Congenital heart defects (CHDs) are the most common type of congenital malformation, accounting for approximately 28% of all cases. CHDs are structural abnormalities, including the failure of channels to close and improper development of the heart and large blood vessels during embryonic development. The main clinical manifestations include ventricular septal defect (VSD), atrial septal defect (ASD) and tetralogy of Fallot (TOF) [1,2]. The prevalence of neonatal CHD is approximately 8%, with a perinatal mortality rate as high as 40% and a mortality rate as high as 20% in the first month following birth [3]. Fetal mortality includes deaths from pulmonary hypertension, bacterial endocarditis and congestive heart failure, although complex CHD is the main cause of early death [4]. Early diagnosis of CHD is associated with an improved prognosis. Therefore, prenatal detection of fetal CHD is key to reducing mortality and improving the prognosis of CHD.
Although fetal echocardiography can be used as a diagnostic tool for CHD, the rate of misdiagnosis of cardiac abnormalities in routine antenatal care is very high, and the diagnostic efficiency is only 6-35%. In addition, results vary significantly between testing centers owing to a lack of standards for ultrasonic inspection and to differences in the quality of ultrasonic diagnostic equipment, the experience of the operator, and the policies and guidelines at different centers, all of which may influence the accuracy of ultrasonic diagnosis of fetal CHD. Recently, biomarkers such as increased acylated ghrelin [5], miR-19a [6] and beta human chorionic gonadotropin (HCGβ) [7], or decreased activity of BMP4 [8] and low levels of pregnancy-associated plasma protein A (PAPP-A) [7], were found to be associated with fetal CHD, although they are not specific enough for prenatal screening of fetal CHD.
LncRNAs are a type of RNA longer than 200 nucleotides that is not translated. LncRNAs are stable in plasma and urine, and show disease and tissue specificity. To date, more than 1000 lncRNAs have been confirmed to be involved in various biological processes, including cell growth, differentiation, proliferation and apoptosis [9,10]. A growing number of studies suggest that circulating plasma lncRNAs have great potential as new diagnostic and prognostic biomarkers and play important roles in evaluating the effectiveness of treatment in diseases such as cardiovascular disease and cancer [11,12]. The role of lncRNAs in heart development has been described. Specific fetal CHD-related lncRNAs can be found in the placental tissue in fetal CHD, which suggests that placenta-derived lncRNAs may be detected in the peripheral blood of pregnant women.
We hypothesize that circulating lncRNAs in pregnant women can be used as candidate biomarkers for predicting fetal CHD in early pregnancy. In the current study, maternal circulating lncRNAs were screened systematically by Arraystar lncRNA microarray and verified by RT-PCR using a large number of samples. At the same time, the clinical value of lncRNAs in the early diagnosis of fetal CHD was evaluated.
Materials and Methods
Study design
This study was approved by the hospital ethics committee. Each participant provided written informed consent.
Study participants
In this study, paired cases were used to confirm the predictive effects of circulating lncRNAs from pregnant women in the diagnosis of fetal CHD. Between March 2015 and September 2015, 152 out-patient cases were collected at the Nanjing Maternity and Child Care Center. Twenty-five cases with a history of heart or cardiovascular disease, pregnancy complications or multiple pregnancies were excluded. In addition, three cases that were later diagnosed with Down's syndrome were also excluded. In total, 124 cases were included in the study, comprising 62 cases of diagnosed fetal CHD as the disease group and 62 cases as healthy controls. The disease group included 30 cases of VSD, 18 cases of ASD and 14 cases of TOF, which were confirmed by fetal echocardiography. The maternal ages and gestational ages of the two groups were similar, in order to reduce error associated with heterogeneity between the groups.
Study stages
The study was divided into two stages. In the first stage, the lncRNAs from the circulating plasma of three pregnant women with typical fetal CHD and three healthy controls of similar age and gestational age were screened using an Arraystar microarray. Gene ontology (GO) analysis and signal pathway analysis were performed. By comparing the relative expression levels of lncRNAs in the two groups of pregnant women, we identified lncRNAs potentially involved in CHD based on biological information such as target genes. From there, we designed the corresponding primers for validation in the second stage. In the second stage, differential expression of lncRNAs was screened in pregnant women with typical fetal CHD and six healthy controls of similar age and gestational age. The top ten significantly up-regulated or down-regulated lncRNAs were chosen for further large-scale sample validation. Finally, the diagnostic sensitivity and specificity of the top five significantly up-regulated or down-regulated lncRNAs were evaluated using ROC curve analysis.
Plasma isolation and RNA extraction
Five milliliters of EDTA-anticoagulated blood was collected from each patient and centrifuged at 4000 rpm for 10 min. The plasma was stored at -80°C in 1.5 mL RNase-free microcentrifuge tubes for later use.
Total RNA, including lncRNAs, was extracted from 400 μL of plasma using the TRIZOL LS kit (Life Technologies, USA) according to the manufacturer's instructions. The purity and concentration of the total RNA were determined using a NanoDrop2000 (Thermo Scientific, USA). Total RNA was reverse transcribed into cDNA with a RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific, USA). The reaction was performed in a 60 μL volume containing 1.5 μg total RNA, 6 μL of 10 mM dNTP mixture, 3 μL of the Multiscribe RT enzyme, 12 μL of 5X RT buffer, 3 μL RNase inhibitor, 3 μL random primers and 33 μL water. The reverse transcription program was as follows: 42°C for 60 min, 70°C for 5 min and 4°C hold. The cDNA was preserved at -80°C for later use. All samples were handled and stored identically to reduce potential intra- and inter-assay error.
Arraystar microarray
The Arraystar LncRNA gene chip is capable of detecting 30586 lncRNAs and 26109 protein-coding transcripts. LncRNAs are selected from public transcription databases (RefSeq, UCSC Known Genes, GENCODE) or papers with a high impact factor. In the initial stage of the experiment, the lncRNAs in the circulating plasma of three pregnant women with typical fetal CHD and three matched healthy controls were screened using the Arraystar gene chip detection system. Briefly, an equal amount of RNA from each sample was transcribed to cRNA. Sample labeling and array hybridization were carried out following the protocols provided for the Arraystar RNA Flash Labeling Kit and Agilent SureHyb. The washed chip was scanned using an Agilent DNA Microarray Scanner and raw data were extracted using the Agilent Feature Extraction software (v11.0.1.1). Normalization of the raw data and subsequent data processing were performed using GeneSpring GX v12.1 software (Agilent Technologies). The raw data were filtered for high-quality probes (at least one of the probes was flagged as Present or Marginal). Differentially expressed lncRNAs between the two groups were screened by p-value/FDR filtering, and differentially expressed lncRNAs between two samples were screened by fold change. Finally, we performed GO analysis and signal pathway analysis and selected lncRNAs displaying significant changes for further examination.
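The fold-change screen described above can be illustrated with a minimal Python sketch. It assumes that the normalized intensities are on a log2 scale, which is not stated in the text, and it omits the p-value filter. The function name, the example intensities and the default threshold of 2.0 (taken from the fold change cut-off reported in the Abstract) are illustrative only and do not reproduce the GeneSpring workflow.

```python
from statistics import mean


def classify_fold_change(chd_log2, control_log2, threshold=2.0):
    """Classify one lncRNA by the fold change between group means.

    Assumption: inputs are log2-scale normalized intensities, so the
    difference of group means is a log2 fold change.
    Returns (label, fold), where fold is always >= 1.
    """
    log2_diff = mean(chd_log2) - mean(control_log2)
    fold = 2 ** abs(log2_diff)
    if fold < threshold:
        return "unchanged", fold
    return ("up" if log2_diff > 0 else "down"), fold


# Hypothetical intensities for three CHD and three control samples.
print(classify_fold_change([8.0, 8.0, 8.0], [6.0, 6.0, 6.0]))
print(classify_fold_change([5.0, 5.0, 5.0], [6.0, 6.0, 6.0]))
```

Reporting the fold as a value of at least 1 together with a direction label mirrors the way up-regulated and down-regulated transcripts are counted separately in the Results.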
GO analysis
The GO project provides a controlled vocabulary to describe gene and gene product attributes. The ontology covers three domains: Biological Process, Cellular Component and Molecular Function. Fisher's exact test is used to determine whether there is more overlap between the differential expression list and the GO annotation list than would be expected by chance. The p-value denotes the significance of GO term enrichment in the differentially expressed genes. The lower the p-value, the more significant the GO term (p-value ≤ 0.05).
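As an illustration of the enrichment test described above, the one-sided Fisher's exact test can be written as the upper tail of a hypergeometric distribution. The sketch below uses only the Python standard library; the function name, argument names and example counts are hypothetical and are not taken from the study.

```python
from math import comb


def go_enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact test for GO term over-representation.

    N: number of annotated background genes
    K: background genes annotated with the GO term
    n: differentially expressed genes
    k: differentially expressed genes annotated with the GO term
    Returns P(overlap >= k) under random sampling without replacement.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical counts: 40 of 200 differentially expressed genes carry a
# term that annotates 500 of 10000 background genes.
p = go_enrichment_p_value(k=40, n=200, K=500, N=10000)
print(f"p = {p:.3e}")
```

A small p-value indicates that the overlap between the differential expression list and the GO annotation list is larger than expected by chance, which is the criterion stated above.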
Quantitative PCR (qPCR)
qPCR was performed in triplicate for each plasma sample, with GAPDH as an internal reference, using Power SYBR Green PCR Master Mix (Life Technologies, USA) in a LifeProFlex PCR detection system (Life Technologies, USA). The reactions were performed in 10 μL volumes containing 1 μL cDNA, 3 μL DEPC-treated water, 5 μL SYBR Green master mix, and 0.5 μL each of forward and reverse primers. All primers were designed online and synthesized by Shanghai GENEray Biotech. Primers were validated by amplification of the target gene. The primer sequences and lncRNA IDs are shown in Table 1. The mixture was incubated at 50°C for 2 min, then real-time PCR was performed according to the manufacturer's protocol (Life Technologies, USA). Cycling parameters were as follows: 95°C for 10 min for initial denaturation, followed by 40 cycles of 95°C for 15 sec and 60°C for 1 min. The quantitative PCR results were normalized to the reference gene GAPDH. Amplification specificity was confirmed by melting curve analysis and gel electrophoresis to identify a single product. The relative expression of target lncRNAs was determined using the comparative cycle threshold (Ct) method (2^-ΔΔCt), where ΔΔCt was calculated as the mean ΔCt of CHD samples minus the mean ΔCt of controls, and ΔCt = Ct(lncRNA) − Ct(GAPDH).
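The comparative Ct calculation described above can be sketched in a few lines of Python. The function name and the Ct values are hypothetical; each pair holds the Ct of the target lncRNA and the Ct of GAPDH for one sample, already averaged over technical replicates.

```python
from statistics import mean


def relative_expression(chd_pairs, control_pairs):
    """Relative expression by the comparative Ct method.

    Each argument is a list of (Ct_lncRNA, Ct_GAPDH) tuples.
    dCt  = Ct_lncRNA - Ct_GAPDH for each sample
    ddCt = mean dCt of CHD samples - mean dCt of controls
    Returns 2 ** (-ddCt); values above 1 indicate up-regulation in CHD.
    """
    dct_chd = mean([target - ref for target, ref in chd_pairs])
    dct_control = mean([target - ref for target, ref in control_pairs])
    ddct = dct_chd - dct_control
    return 2 ** (-ddct)


# Hypothetical Ct values: the target amplifies two cycles earlier
# (relative to GAPDH) in the CHD group than in the controls.
chd = [(26.0, 18.0), (27.0, 19.0)]
controls = [(28.0, 18.0), (29.0, 19.0)]
print(relative_expression(chd, controls))
```

Because each cycle corresponds to an approximate doubling of product, a ΔΔCt of -2 in this example corresponds to a four-fold higher relative expression in the CHD group.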
Statistical Analysis
All data are presented as means ± standard deviation (SD) and were analyzed using the SPSS 21.0 statistical software package (SPSS Inc., USA). The relative expression levels of lncRNAs in the two groups were compared using Student's t-test, where p<0.05 was considered statistically significant and all tests were two-sided. The efficiency of lncRNAs in the diagnosis of fetal CHD was evaluated with ROC curves. The sensitivity and specificity of each lncRNA were assessed by the area under the ROC curve (AUC) and its 95% confidence interval (CI).
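The study computed ROC curves in SPSS. For readers who want to see what the AUC measures, the following Python sketch uses the rank-based equivalence: the AUC equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The function name and the data are hypothetical, and the sketch assumes that higher values indicate disease. It does not compute the 95% CI.

```python
def roc_auc(case_values, control_values):
    """Area under the ROC curve from pairwise comparisons.

    Assumption: higher marker values indicate disease. For a
    down-regulated marker, negate the values before calling.
    """
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical relative expression values.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
print(round(roc_auc(cases, controls), 3))
```

An AUC of 0.5 corresponds to no discrimination and an AUC of 1.0 to perfect separation of the two groups.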
Results
Patient Data
One hundred and twenty-four cases were examined in this study. Sixty-two subjects were pregnant women with fetal CHD while 62 cases were healthy controls. The disease group consisted of 30 cases of VSD, 18 cases of ASD and 14 cases of TOF. All the fetal CHD cases were diagnosed by fetal echocardiography. There were no significant differences in age or gestational weeks between the two groups (Table 2).
Differentially expressed lncRNAs
Expression profiling detected 17603 expressed lncRNAs in the plasma of pregnant women with typical fetal CHD (Fig. 1). Of these, 3694 lncRNAs were significantly up-regulated (fold change ≥2) and 3919 lncRNAs were significantly down-regulated in the CHD group.
GO analysis
GO analysis was carried out to describe the lncRNAs in terms of biological processes, cellular components and molecular functions. Fisher's exact test was used to determine whether there was more overlap between the list of differentially expressed lncRNAs and the GO annotation list than would be expected by chance. The p-value denotes the significance of GO term enrichment in the differentially expressed genes: the lower the p-value, the more significant the GO term (p<0.05). The enriched GO terms spanned biological processes, cellular components and molecular functions. The most highly enriched GO terms targeted by the down-regulated transcripts were single-organism process (Fig. 2A), organelle (Fig. 2B) and binding (Fig. 2C), while the most highly enriched GO terms for the up-regulated transcripts were regulation of cellular process (Fig. 2D), cell or cell part (Fig. 2E) and binding (Fig. 2F).
Pathway analysis
The top ten signaling pathways involving down-regulated transcripts are shown in Fig. 3A. The most highly enriched signaling pathway is “hematopoietic cell lineage”. The top ten signaling pathways involving up-regulated transcripts are shown in Fig. 3B, where the most highly enriched signaling pathway is “adherens junction”. Amongst these pathways, the RAS signaling pathway has been reported to be involved in hypertrophic cardiomyopathy [13]. The category “morphine addiction” has been reported in patients experiencing heart failure [14]. The category “dilated cardiomyopathy” has been associated with ventricular wall stress and cardiac function [15,16]. The category “Wnt signaling pathway” (Fig. 3C) has been reported in heart valve development. Embryonic myocardial cell proliferation and apoptosis can be promoted through this signaling pathway [17,18].
Discovery of biomarkers
The aim of this study was to identify lncRNAs in the circulating plasma of pregnant women that can serve as biomarkers for predicting fetal CHD. Differential expression analysis of lncRNAs between three pregnant women with fetal CHD and three matched healthy controls was performed using the Arraystar gene chip. The average expression levels and the fold change of each lncRNA were analyzed. A comparison of the relative expression of lncRNAs resulted in the identification of 26 lncRNAs closely related to CHD, based on GO analysis. To further assess the 26 lncRNAs, we designed primers to analyze their expression. From a preliminary screening of the lncRNAs in six pregnant women with typical fetal CHD and three healthy controls, we identified the ten most highly differentially expressed transcripts. Four up-regulated lncRNAs (ENST00000436681, ENST00000422826, NR-037608 and NR-046160) and six down-regulated lncRNAs (AA584040, AA709223, BX478947, chr13:104871700-104904225+, AI808306 and AK055084R) were chosen for further large-sample validation. These results are shown in Table 3.
Validation of lncRNAs
The ten candidate circulating plasma lncRNAs were validated by RT-PCR with samples obtained from 62 pregnant women with fetal CHD and 62 healthy controls. The quantitative PCR results were normalized to the reference gene GAPDH, and show that five of the ten lncRNAs (ENST00000436681, ENST00000422826, AA584040, AA709223 and BX478947) were significantly differentially expressed in the disease group as compared to the healthy controls (p-values of 0.011, 0.013, 0.002, <0.001 and <0.001, respectively; Table 4 and Fig. 4), while the expression levels of the other five lncRNAs were not significantly different between the two groups (p-values of 0.284, 0.198, 0.118, 0.671 and 0.406; Table 4). As shown in Table 4, the SD of the mean expression levels of the five differentially expressed lncRNAs is relatively large, indicating a high degree of dispersion, and several expression values are very high. A sub-analysis was performed excluding extreme values, and the differences remained significant. VSD, ASD and TOF samples were compared with their matched normal controls in order to identify lncRNAs up-regulated or down-regulated in these specific defects. Four lncRNAs were significantly changed in VSD (ENST00000436681, AA584040, AA709223 and BX478947) and two in ASD (AA709223 and BX478947). Four lncRNAs showed significant differential expression in TOF (ENST00000436681, ENST00000422826, AA709223 and BX478947) (Table 5).
Diagnostic value of maternal plasma lncRNAs for fetal CHD
The sensitivity and specificity of the individual maternal plasma lncRNAs for the prediction of fetal CHD were assessed using ROC curve analysis. The AUC of each lncRNA (ENST00000436681, ENST00000422826, AA584040, AA709223 and BX478947) is shown in Fig. 5. The optimal diagnostic point was defined as the point with the largest Youden's index (sensitivity + specificity - 1), which gave cut-off values for the five lncRNAs of 0.4174, 0.7907, 1.1502, 1.9147 and 1.4227, respectively. The sensitivity, specificity and AUC of the five lncRNAs are shown in Table 6 and Fig. 5. The results show that the five differentially expressed lncRNAs can be used as effective biomarkers for predicting fetal CHD.
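The selection of a cut-off by Youden's index can be sketched as follows. This is an illustration under stated assumptions rather than the SPSS procedure used in the study: a sample is called positive when its value is at or above the cut-off, every observed value is tried as a candidate cut-off, and the function name and data are hypothetical.

```python
def youden_cutoff(case_values, control_values):
    """Return (J, cutoff, sensitivity, specificity) maximizing Youden's J.

    Assumption: a value >= cutoff is called positive. For a
    down-regulated marker, negate the values before calling.
    On ties in J, the lowest cut-off is kept.
    """
    best = (-1.0, None, None, None)
    for cutoff in sorted(set(case_values) | set(control_values)):
        sensitivity = sum(v >= cutoff for v in case_values) / len(case_values)
        specificity = sum(v < cutoff for v in control_values) / len(control_values)
        j = sensitivity + specificity - 1
        if j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best


# Hypothetical relative expression values with perfect separation.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 2.5]
print(youden_cutoff(cases, controls))
```

Youden's index weights sensitivity and specificity equally, so the chosen cut-off is the point on the ROC curve farthest above the diagonal of no discrimination.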
Discussion
CHD is one of the most common types of birth defects arising during embryonic development, with an incidence in newborns of approximately 4%-5%. It is the most common non-infectious cause of death in infants and young children [1]. VSD, accounting for about 25%-50% of CHDs, plays an important role in birth defects [19]. To date, there is no gold standard for the diagnosis of fetal CHD, forcing a reliance on fetal echocardiography. Although its diagnostic specificity is high, the rate of misdiagnosis is also quite high.
LncRNAs are a type of non-coding RNA with high tissue and disease specificity. Currently, lncRNAs are a focus of research in tumor and cardiovascular diseases and have become an effective tool for the diagnosis and treatment of cardiovascular diseases and tumors [20,21]. Although they have little protein-coding capacity, lncRNAs participate in many important regulatory mechanisms such as X-chromosome modification, chromosome silencing, genomic imprinting, transcriptional activation, transcriptional interference and nuclear transport [22]. They regulate gene expression at multiple levels, including epigenetic, transcriptional and post-transcriptional regulation. Understanding the mechanisms and functional roles of gene regulation is essential to a fundamental understanding of cardiac development. Gao et al. [23] reported that RNA editing, RNA splicing and non-coding RNAs contribute to the complexity of the cardiovascular transcriptome. LncRNAs in the heart add intricacy to the regulatory network of cardiac gene expression and reveal routes of potential perturbation in response to pathological stressors, which demonstrates the potential of these new insights for diagnostic and therapeutic applications. Yang et al. [24] reported that the myocardial transcriptome was regulated in advanced heart failure and that lncRNAs were markedly altered in response to left ventricular assist device (LVAD) support, which suggests that lncRNAs may play an important functional role in the pathogenesis of heart failure and in the reverse remodeling observed with mechanical support.
At present, it is believed that circulating plasma lncRNAs come from cellular apoptosis or necrosis, active release from cells or the splitting of circulating cells. Endogenous circulating lncRNA molecules are not in a free form and often combine with proteins, which protects them from degradation, so they can exist in plasma and urine without becoming targets of enzymatic degradation like other RNA molecules in the blood. They are stable even at room temperature for 24 h or through repeated freeze-thaw cycles [25]. Li et al. [26] reported that the expression of lncRNAs in the heart, whole blood and even normal plasma was negatively associated with their nucleic acid length, whereas their expression in plasma during heart failure was not correlated with length. This suggests that the presence of plasma lncRNAs may not be due to leakage or a passive response of the heart tissues to heart failure, but instead may be due to active secretion from blood cells or bone marrow hematopoietic stem cells. Therefore, the analysis of lncRNA content in plasma can be used as a non-invasive method for the diagnosis of diseases.
High-throughput screening methods, such as lncRNA chips or sequencing, make it possible to study the differential expression of lncRNAs in different diseases and provide a convenient platform for subsequent functional studies of lncRNAs or biomarker screening. Abnormalities in the transcription, expression, structure and regulation of lncRNAs and their associated binding proteins may be an important factor in the occurrence and development of many important diseases, including cancer and cardiovascular diseases [27,28]. In this study, plasma lncRNAs isolated from pregnant women with typical fetal CHD and normal healthy controls of similar age and gestational age were screened using an Arraystar gene chip, and lncRNAs that might be useful in predicting fetal CHD were examined. Using a microarray with abundant and varied probes covering 30586 lncRNAs and 26109 protein-coding transcripts, we identified 17603 expressed lncRNAs, among which 3694 were up-regulated and 3919 were down-regulated. The functions of most of these lncRNAs have not been characterized. GO analysis and pathway analysis were carried out and an expression network was constructed to explore the biological functions and potential mechanisms of these differentially expressed lncRNAs in the plasma of pregnant women with fetal CHD. The top ten significantly differentially expressed lncRNAs were ENST00000436681, ENST00000422826, NR-037608, NR-046160, AA584040, AA709223, BX478947, chr13:104871700-104904225+, AI808306 and AK055084R. They were validated by RT-PCR in all 62 paired cases, where we found five to be significantly up-regulated or down-regulated (ENST00000436681, ENST00000422826, AA584040, AA709223 and BX478947). Subsequently, we generated ROC curves using the SPSS 21.0 statistical analysis software. The ROC analysis indicates that these five lncRNAs have a high diagnostic value in predicting fetal CHD.
We further examined expression of the five lncRNAs in specific CHD subgroups and found that four lncRNAs (ENST00000436681, AA584040, AA709223 and BX478947) were associated with VSD. Two lncRNAs were associated with ASD (AA709223 and BX478947) and four lncRNAs were associated with TOF (ENST00000436681, ENST00000422826, AA709223 and BX478947). To our knowledge, this is the first exploration of the clinical value of lncRNA dynamic changes in pregnant women, which could be applied in disease screening, diagnosis, prognosis and follow-up treatment.
In conclusion, we have identified differentially expressed lncRNAs which have potential diagnostic value in predicting fetal CHD. We have provided evidence that circulating plasma lncRNAs may serve as novel biomarkers for the diagnosis of fetal CHD and provided a solid foundation for further exploration of the pathogenesis of fetal CHD using biomarkers. In a follow-up study, we will expand sample size to further evaluate the value of circulating plasma lncRNAs in the diagnosis of fetal CHD.
Acknowledgements
This study was supported by grants from the National Natural Science Foundation of China (grant no. 81470376), the National Natural Science Foundation of Jiangsu Province of China (No. BK20141077), and the Nanjing Medical Science and Technique Development Foundation (No. 201402025, YKK14123).
Disclosure Statement
We declare that there is no conflict of interest regarding the publication of this manuscript.
References
M. Gu and A. Zheng contributed equally to this work.