Background
Long QT syndrome (LQTS) is a heritable cardiac disorder characterised by a prolonged QT-interval detected on an electrocardiogram (ECG), episodes of syncope, and the risk of sudden death. The estimated number of affected people is 1 in 2,500 [
1,
2], and 13 causative genes have been identified. The majority of these genes encode cardiac ion channels (potassium, sodium and calcium). Loss-of-function mutations in the
KCNQ1 and
KCNH2 genes (LQT1 and LQT2, respectively) account for ~65% of all LQTS cases, and gain-of-function mutations in the
SCN5A gene (LQT3) account for 10% of all LQTS cases [
3,
4]. Genotype-phenotype correlation studies have established genotype-specific ECG patterns, arrhythmia triggers and outcomes [
5,
6]. Genetic diagnosis permits better risk stratification, clinical management and cascade family screening [
7].
As genetic testing becomes more widely available, an increasing number of variants of unknown/uncertain significance are being discovered. Before family cascade screening can be undertaken, it is essential to be confident of the pathogenicity of the variant. While nonsense and frameshift mutations, which cause premature termination of protein production, are usually pathogenic, missense mutations are very commonly benign. The interpretation of novel single nucleotide variants (SNVs) is difficult. SNVs represent ~75% of clinically positive LQTS test results [
8]. Approximately 1 in 25 healthy individuals are expected to have a rare, benign variant in one of the three major LQTS genes [
9,
10]. The best methods to assess the pathogenicity of novel mutations are to undertake phenotype-genotype family co-segregation studies, and functional/biophysical studies in an
in vitro system or animal models. However, many families are too small, and the latter two methods are expensive and time-consuming and unavailable to most diagnostic services. With the growing number of unclassified SNVs, many diagnostic laboratories use
in silico missense mutation prediction tools to determine whether a novel SNV is pathogenic in relation to the evolutionary conservation of specific amino acids, as well as protein structure and function.
The prediction tools can be broadly divided into three categories: sequence and evolutionary conservation-based methods, protein sequence and structure-based methods, and supervised learning methods (refer to [
3,
11-
13] for reviews of the different categories). The sequence and evolutionary conservation-based methods assess the pathogenicity of a mutation based on the conservation of a particular amino acid across different species [
14-
16]. Protein sequence and structure-based methods assess SNVs based on their location in the protein structure and how they may impact/disrupt the overall protein [
17]. The supervised learning methods are trained on large defined datasets so that they can “learn” to distinguish pathogenic mutations from benign variants [
15,
18,
19].
Several studies have investigated the use of multiple
in silico prediction tools to assess the predictive accuracy of these tools [
20-
28]. There are also consensus programs (metaservers) that combine the output from several
in silico prediction tools and produce a single consensus outcome, all of which have been reported to offer improved performance over individual tools [
29-
31]. However, no study has investigated the combination(s) of individual
in silico prediction tools with the best performance for LQTS genes or whether the use of metaservers is better.
The aim of the study described here was to assess the predictive accuracy of five
in silico programs (SIFT, PolyPhen-2, PROVEAN, SNPs&GO and SNAP), alone or in combination, and two metaservers (Meta-SNP and PredictSNP) in evaluating SNVs in the three major LQTS genes (
KCNQ1,
KCNH2 and
SCN5A). All mutations reported in the Inherited Arrhythmia Database [
32] for these genes were analysed to determine the combination of the
in silico programs with the best performance, and how the combinations compared to the use of the metaservers.
Methods
All LQTS gene SNVs (deleterious SNVs, polymorphisms and rare SNVs) were collated from the Inherited Arrhythmia Database (
http://www.fsm.it/cardmoc/ last accessed October 2014), Kapplinger et al. [
33], Giudicessi et al. [
5] and the LQTS gene LOVD database [
34]. Only SNVs that caused missense amino acid changes were considered for analysis. For this study the SNVs were divided into two groups: pathogenic and benign. To be considered for the pathogenic group, SNVs must either be functionally characterised by
in vitro studies and/or have undergone co-segregation studies to prove they are pathogenic. In the case of the benign group, SNVs must either be functionally characterised by
in vitro studies and/or must have an allele frequency of greater than 1%. SNVs in the
SCN5A gene reported by Kapplinger et al. [
33] were found in either LQTS and/or Brugada syndrome patients. All the mutations that were analysed and their respective results are shown in Additional file
1: Data tables, and their locations in respect of the different protein regions (transmembrane domains, pore regions, etc.) are shown in Additional file
2: Figures S1–S3.
Five
in silico missense mutation prediction tools and two metaservers, listed in Table
1, were used to analyse all the LQTS genes SNVs and the exact methodology and algorithms used by each of these have been described previously according to the references given here: PolyPhen-2 [
17], SIFT [
14], PROVEAN [
16], SNPs&GO [
18,
35], SNAP [
19], Meta-SNP [
30] and PredictSNP [
31].
Table 1 In silico prediction tools and metaservers used in the current study
Tool | Method category | Integrated tool | Method category of integrated tool
PolyPhen-2, version 2.2.2 [17] | Protein sequence and structure | |
SNPs&GO [18, 35] | Supervised learning (support vector machine) | |
SIFT, version 5.2.0 [14, 15] | Sequence and evolutionary conservation | |
PROVEAN, version 1.1 [16] | Sequence and evolutionary conservation | |
SNAP [19] | Supervised learning (neural networks) | |
Meta-SNP [30] | Metaserver | PANTHER | Sequence and evolutionary conservation
 | | PhD-SNP | Supervised learning (support vector machines)
 | | SIFT | Sequence and evolutionary conservation
 | | SNAP | Supervised learning (neural networks)
PredictSNP [31] | Metaserver | MAPP | Sequence and evolutionary conservation
 | | PhD-SNP | Supervised learning (support vector machines)
 | | PolyPhen-1 | Protein sequence and structure
 | | PolyPhen-2 | Protein sequence and structure
 | | SIFT | Sequence and evolutionary conservation
 | | SNAP | Supervised learning (neural networks)
PolyPhen-2 uses annotated UniProt entries to predict if a missense mutation is situated in a structurally important/functional site in the protein using a naïve Bayesian approach [
17]. Therefore, PolyPhen-2 could belong to the supervised-learning method of
in silico analysis [
13]. LQTS SNVs were analysed by accessing the web-based method PolyPhen-2, using default settings [
17]. PolyPhen-2 classified each variant as “Probably damaging”, “Possibly damaging”, or “Benign”. For the purposes of this study, SNVs assigned as “Probably damaging” or “Possibly damaging” were classified as “damaging” for downstream analysis.
SNPs&GO incorporates sequence information, evolutionary information, and information from the GO (Gene Ontology) database [
18]. The default settings were used. SNPs&GO classified each mutation variant as either “Neutral” or “Disease” [
15,
18].
SIFT uses sequence homology from multiple sequence alignments to predict the pathogenicity of a mutation [
14,
15]. The SIFT non-synonymous single nucleotide variants (genome-scale) tool was used. The chromosomal location, genomic coordinate, transcript orientation and base-pair change of each SNV were required for its input format. The web-based Variant Effect Predictor – Ensembl (
http://asia.ensembl.org/info/docs/tools/vep/index.html) was used to generate the required information. The default settings were used for this study. SIFT classified each variant as either “Tolerated” or “Damaging”.
The web-based PROVEAN Protein Batch Human was used [
16]. The PROVEAN algorithm classified each SNV as either “Neutral” or “Deleterious”.
SNAP makes predictions based on protein secondary structure, solvent accessibility and the conservation of the amino acid of interest in a protein [
19]. The default settings were used. SNAP classified each mutation as either “Neutral” or “Non-neutral” [
19].
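Because each of the five tools reports its result with different labels, the outputs have to be mapped to a common binary call before they can be compared or combined. The following is a minimal sketch of such a mapping, using only the labels quoted above; the dictionary layout and the function name `normalise_call` are illustrative assumptions, not part of the study's workflow.

```python
# Labels are those quoted in the text; the data structure is an assumption.
DAMAGING_LABELS = {
    "PolyPhen-2": {"Probably damaging", "Possibly damaging"},
    "SNPs&GO": {"Disease"},
    "SIFT": {"Damaging"},
    "PROVEAN": {"Deleterious"},
    "SNAP": {"Non-neutral"},
}
BENIGN_LABELS = {
    "PolyPhen-2": {"Benign"},
    "SNPs&GO": {"Neutral"},
    "SIFT": {"Tolerated"},
    "PROVEAN": {"Neutral"},
    "SNAP": {"Neutral"},
}


def normalise_call(tool: str, label: str) -> str:
    """Map a tool's native label to "damaging" or "benign"."""
    if label in DAMAGING_LABELS[tool]:
        return "damaging"
    if label in BENIGN_LABELS[tool]:
        return "benign"
    # Fail loudly rather than silently misclassify an unknown label.
    raise ValueError(f"Unexpected label {label!r} for {tool}")
```

For example, `normalise_call("PolyPhen-2", "Possibly damaging")` returns `"damaging"`, matching the rule that both PolyPhen-2 damaging grades were pooled for downstream analysis.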
Meta-SNP [
30] and PredictSNP [
31] are metaservers that combine the predicted outcomes from several
in silico tools to form a consensus prediction for a given SNV. Meta-SNP uses a random forest approach to integrate the predictions from
in silico tools [
30] (these are listed in Table
1). Each mutation is classified as either “Disease” or “Neutral” [
30]. PredictSNP is a consensus classifier that integrates the results from six
in silico prediction tools (listed in Table
1) as well as experimental annotations from Protein Mutant Database and UniProt [
31].
The raw data output of all LQTS gene SNVs can be found in the Additional file
3: Raw Data. The functional predictions of all LQTS gene SNVs from all five
in silico tools and two metaservers were collated in Excel spreadsheets. The location of each LQTS gene mutation in the context of protein structure, as well as information about whether the SNVs were functionally characterised are shown in Additional file
1: Data tables and Additional file
2: Figures S1–S3.
The data output were initially compared with the functional studies’ results to determine the accuracy of the programs. SNVs that were not functionally characterised but were proven to be pathogenic through co-segregation studies, or had an allelic frequency of greater than 1%, were assumed to be pathogenic or benign, respectively. These results were subcategorised into four groups: true positives (TP, correct predictions for deleterious mutations), true negatives (TN, correct predictions for neutral mutations), false positives (FP, incorrect predictions for neutral mutations) and false negatives (FN, incorrect predictions for deleterious mutations). The sensitivity (true positive rate) and specificity (true negative rate) for each
in silico tool were determined using the four different categories. Sensitivity was defined as the probability of identifying true deleterious mutations, and this was calculated by [TP/(TP + FN)] x 100 [
36]. Specificity was defined as the probability of identifying true neutral mutations, and this was calculated by [TN/(TN + FP)] x 100 [
36]. The Matthews Correlation Coefficient (MCC) was also calculated for each category using the following equation: MCC = [(TP x TN) – (FP x FN)]/sqrt[(TP + FP)(TP + FN)(TN + FP)(TN + FN)] [
37]. The MCC measures how the predictions correlate with the real target values, and the scores range from +1 (always correct) to −1 (always false), and 0 represents a completely random prediction [
36]. An MCC score of more than 0.5 was considered acceptable as this corresponds to more than 75% accuracy in balanced data [
38].
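The three measures defined above can be computed directly from the four counts. The sketch below is an illustration only; the function names are assumptions, and returning 0 when the MCC denominator is zero is a common convention rather than something stated in the study.

```python
from math import sqrt


def sensitivity(tp: int, fn: int) -> float:
    """Percentage of truly deleterious SNVs that were predicted deleterious."""
    return 100 * tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Percentage of truly neutral SNVs that were predicted neutral."""
    return 100 * tn / (tn + fp)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient, ranging from -1 to +1."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: undefined MCC is reported as 0.
        return 0.0
    return (tp * tn - fp * fn) / denominator
```

With purely invented counts of TP = 90, FN = 10, TN = 6 and FP = 2, sensitivity is 90% and specificity is 75%, yet the MCC is below 0.5. This illustrates why the MCC is informative when benign SNVs are far fewer than pathogenic ones.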
Receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC) [
39] for each of the single
in silico tools and the two metaservers were calculated using the pROC package in R [
40]. A ROC curve plots sensitivity against 1-specificity, which depicts the relative trade-off between true positives and false positives [
39]. The probability scores of each
in silico prediction tool were used to calculate the curves and AUC. An AUC of 1 represents a perfect prediction, an AUC of 0.5 relates to predictions that are made by “pure chance”, and an AUC less than 0.5 shows the predictions are wrong.
The sensitivity percentage represents how well the
in silico tools and metaservers correctly predict pathogenicity, and the specificity percentage represents how well they correctly predict non-pathogenic outcomes [
36]. The MCC, ROC curve and AUC provide a more balanced overall evaluation of the ability of the prediction tools to correctly classify the SNVs than accuracy alone. ROC curves do not directly indicate the performance of a method, but only show the method’s ranking potential for its overall performance [
38].
The data output for each
in silico tool were then analysed in combinations of two, three, four or all five
in silico tools, and the accuracy of the predictions, sensitivity, specificity and MCC were determined for each combination. The conditions that an SNV must meet in order to be categorised as “Tolerated” or “Damaging” in the case of being assessed by two, three, four or all five
in silico missense prediction tools are shown in Table
2. The AUC scores were used as an indicator as to which
in silico tools performed best for a particular LQTS gene.
Table 2 Conditions for SNV data output from two, three, four and all in silico missense prediction tools in order to be considered tolerated or damaging
Number of tools | Tolerated | Damaging
Two tools | Unanimous neutral/tolerated/benign | Unanimous damaging/disease/deleterious/non-neutral
 | | One output is damaging/disease/deleterious/non-neutral
Three tools | Unanimous neutral/tolerated/benign | Unanimous damaging/disease/deleterious/non-neutral
 | Two outputs are neutral/tolerated/benign | Two outputs are damaging/disease/deleterious/non-neutral
Four tools | Unanimous neutral/tolerated/benign | Unanimous damaging/disease/deleterious/non-neutral
 | Three outputs are neutral/tolerated/benign | Two or more outputs are damaging/disease/deleterious/non-neutral
All tools | Unanimous neutral/tolerated/benign | Unanimous damaging/disease/deleterious/non-neutral
 | Three outputs are neutral/tolerated/benign | Three or more outputs are damaging/disease/deleterious/non-neutral
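The conditions in Table 2 amount to a single rule: an SNV is called damaging when at least half of the tools in the combination call it damaging, so a tie goes to damaging. The sketch below expresses that rule and enumerates every combination of two to five tools. It assumes the calls have already been reduced to the strings "damaging" and "benign"; the function and variable names are illustrative.

```python
from itertools import combinations

TOOLS = ["PolyPhen-2", "SIFT", "PROVEAN", "SNPs&GO", "SNAP"]


def consensus_call(calls):
    """Combine binary calls from two or more tools for a single SNV."""
    if len(calls) < 2:
        raise ValueError("A combination needs at least two tools")
    n_damaging = sum(1 for call in calls if call == "damaging")
    # At least half damaging -> damaging (ties are called damaging).
    return "damaging" if 2 * n_damaging >= len(calls) else "benign"


def tool_combinations(tools=TOOLS):
    """Yield every combination of two, three, four or all five tools."""
    for size in range(2, len(tools) + 1):
        yield from combinations(tools, size)
```

With five tools this yields 26 combinations (10 pairs, 10 triples, 5 quadruples and the full set), each of which can be scored for sensitivity, specificity and MCC.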
Differences in accuracy between single tools and differing combinations of tools were tested using the Kruskal-Wallis test [
41]. The best performing
in silico tool/combination of tools were chosen based on their MCC scores, which is a more balanced approach to investigate performance because it is less sensitive to different numbers of pathogenic and benign SNVs for each gene [
36].
Discussion
The current study investigated the combination of
in silico prediction tools with the best performance for LQT 1–3 genes. The analysis was restricted to mutations in these genes as they account for approximately 70%-75% of congenital LQTS cases, and the remaining genes only make up ~5% of cases [
42]. For the minor genes, very few SNVs (both pathogenic and benign) fit the strict criteria set for the current study and therefore these genes were not analysed. In the case of
SCN5A, SNVs associated with LQTS and with Brugada syndrome (BrS) were analysed together. This was done because the distinction between LQT3 and BrS is not clear cut, as some mutations in the
SCN5A gene are associated with both diseases [
43-
46]; however, the combination of
in silico tools with the best performance remained the same regardless of whether the LQTS and BrS mutations were separated or not.
PolyPhen-2, SIFT, PROVEAN and SNPs&GO were chosen as they are routinely used in the authors’ diagnostic laboratory. PolyPhen-2 and SIFT are the most commonly used prediction tools in diagnostic laboratories. SNAP was chosen as another supervised-learning based tool that incorporates a wider range of features (evolutionary information, structural features and protein annotation information) [
47], and it uses a different learning algorithm than SNPs&GO. Both metaservers (Meta-SNP and PredictSNP) were chosen as they incorporate the results from a good selection of
in silico prediction tools that span different types of methods.
Variants were categorised as “pathogenic” or “benign” based on combined results from the five
in silico prediction tools as shown in Table
2. This approach was taken to ensure that no likely pathogenic SNVs would be missed and that benign SNVs were correctly called. “Over-calling” of pathogenic SNVs may occur when an even number of
in silico tools are used, as the conditions set out for this study “call” an SNV with equal numbers of “pathogenic” and “benign” results as “pathogenic”. For
KCNQ1, 4%-15% of SNVs;
KCNH2, 5%-21% of SNVs; and
SCN5A, 8%-24% of SNVs fall into this category. However, these conditions ensured that the classification of “benign” SNVs is more stringent, and so would minimise the chance of a possibly pathogenic SNV being classified as “benign” and therefore dismissed.
In silico tools that correctly predict pathogenic variants do not necessarily perform well for benign predictions. The combinations that were chosen for all three genes were based on how well the prediction tools were able to identify both pathogenic and benign SNVs, thereby reducing the number of false positive and false negative calls. The tools chosen for each gene do not necessarily yield the highest accuracy. In the clinical setting it may in fact be preferable to use an in silico tool which is likely to under-call pathogenicity in favour of improved specificity. This will mean that in a family cascade, fewer people will be erroneously labelled as having the disease on the basis of a genetic test result. Where a truly pathogenic variant is called benign, the clinician must rely on clinical evaluation of family members in a segregation study to reveal the error. In the current study, the optimal number of in silico tools for KCNQ1, KCNH2, and when all genes are considered together, is three. Only SCN5A requires two in silico tools to make the best predictions. Because a combination of three tools cannot produce a tied result, the “over-calling” issue should be minimised for these combinations.
The current study also highlights the need to systematically test which combinations of
in silico tools perform the best for a given gene, and not assume that a large number of programs will provide the best prediction outcomes. The results from these
in silico methods disagree frequently due to the different algorithms they are based on [
21,
23]. When considering the predictions for pathogenic
KCNQ1 SNVs by all five
in silico tools, 82 SNVs (81%) had concordant results; however, when only considering
KCNQ1’s combination of tools with the best performance (PROVEAN, SNPs&GO and SIFT), an additional six SNVs were agreed upon (88 SNVs, 88%; Additional file
2). In the case of benign
KCNQ1 SNVs, there were no improvements when considering only the results of the tools with the best performance compared to considering all five tools (two SNVs, 25%; Additional file
2). All five tools agreed for 58
KCNH2 pathogenic SNVs (71%), and this increased to 68 SNVs (83%) when only considering the results from the combination of tools with the best performance. There were three benign SNVs (37.5%) that had the same predictions for all five tools and this increased by one SNV (4, 50%; Additional file
2) when considering the tools with the best performance. In the case of pathogenic
SCN5A gene mutations, 61 SNVs (62%) had the same predictions for all five
in silico tools and this increased to 79 SNVs (80%) when only considering PROVEAN and SNAP (Additional file
2). Five benign
SCN5A SNVs (36%) had the same predictions for all five tools and this increased by two SNVs (7, 50%) when only considering PROVEAN and SNAP (Additional file
2). When considering the predictions for all pathogenic SNVs, 201 SNVs (71%) had the same predictions for all five
in silico tools; however, when considering the results from PROVEAN, SIFT and SNAP, this increased by 26 SNVs (227, 81%; Additional file
2). For all benign mutations, only 10 SNVs (33%) had the same predictions from all five tools, and this increased by four SNVs (14, 47%) when only PROVEAN, SIFT and SNAP were considered (Additional file
2).
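The concordance figures above count the SNVs for which every tool in a given set returns the same call. A minimal sketch of that count follows; the nested-dictionary input and the variant identifiers in the usage note are hypothetical and are not taken from the study's data files.

```python
def concordance(predictions, tools):
    """Count SNVs on which all listed tools agree.

    predictions maps each SNV identifier to a dict of {tool: call}.
    Returns (number of concordant SNVs, percentage of all SNVs).
    """
    if not predictions:
        raise ValueError("No SNVs supplied")
    concordant = sum(
        1
        for calls in predictions.values()
        if len({calls[tool] for tool in tools}) == 1
    )
    return concordant, 100 * concordant / len(predictions)
```

For instance, given two hypothetical variants where `variant_A` is called damaging by SIFT, PROVEAN and SNAP, and `variant_B` is called damaging by SIFT and PROVEAN but benign by SNAP, `concordance(..., ["SIFT", "PROVEAN", "SNAP"])` counts one concordant SNV, whereas restricting the set to `["SIFT", "PROVEAN"]` counts two.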
Both
KCNQ1 and
KCNH2 encode cardiac potassium channels and the structures of these two proteins are very similar. This could be the reason why the combination of
in silico prediction tools with the best performance is the same for these two genes (PROVEAN, SNPs&GO and SIFT), with
KCNH2 having SIFT and PROVEAN as an additional combination. SIFT and PROVEAN may work well together because they both belong to the sequence and evolutionary conservation-based evaluation method, which relies solely on evolutionary sequence conservation information and does not take into account protein structural information (unlike PolyPhen-2) [
14-
16]. Studies conducted by
Chan et al. [
20] showed that methods based on evolutionary sequence conservation had high predictive values regardless of whether protein information is used. The benefit of adding SNPs&GO to SIFT and PROVEAN could arise because SNPs&GO uses evolutionarily derived information [
18], which is similar to SIFT and PROVEAN, and the inclusion of the information from the Gene Ontology database makes this combination best suited for the
KCNQ1 and
KCNH2 genes.
On the basis of this study, none of the prediction tools is considered suitable for analysing variants in the
SCN5A gene. The tools that incorporate protein structure and function into their algorithms (PolyPhen-2, SNPs&GO and SNAP) had low specificity, suggesting that functional and structural information hampered predictions for variants in the
SCN5A gene. This may be due to the two different
SCN5A isoforms present in the normal human heart. The isoforms differ by only one amino acid (NP_000326 has glutamine-1077 deleted compared to NP_932173) [
48]. The transcript encoding NP_000326 represents 65% of the
SCN5A transcripts in the normal heart [
48], and depending on which isoform the SNV is present in, the effect of the mutation may differ [
49]. A study investigating the functional characteristics of eight common
SCN5A gene polymorphisms found five of the eight polymorphisms were similar to the unaffected SCN5A protein in the NP_000326 isoform, and only three of the eight were similar to the unaffected SCN5A protein in the NP_932173 isoform [
49]. The polymorphisms that affected the function of the SCN5A protein, regardless of which isoform they were present in, affected the protein in different ways [
49]. These results could account for the large number of polymorphisms characterised as damaging in this study (14 of 18 polymorphisms were classified as damaging; Additional file
1: Data Tables); of relevance here is the need to specify the protein isoform in some of the
in silico programs.
Another confounding factor in the
in silico analysis may lie in the fact that some
SCN5A SNVs can cause both LQTS and BrS. Gain-of-function mutations are associated with LQTS and loss-of-function mutations are associated with BrS [
50]. Flanagan et al. found that both SIFT and PolyPhen-2 had more success predicting loss-of-function than gain-of-function mutations [
22] and this could also be the case with the other three
in silico tools used here, hence the low MCC and AUC scores.
The protein context (isoform) of SNVs in SCN5A highlights an important issue when reporting results for diagnostic tests using in silico prediction tools. These tools make predictions based mainly on amino acid sequence and protein structure information, and use very little information about protein function. Caution should be used when making a clinical diagnosis based solely on predictive results. As demonstrated by the two isoforms of SCN5A, a single amino acid difference can have a significant effect in terms of protein function and interactions between the ion channels and their accessory proteins. The ideal prediction tool should incorporate information on not just the amino acid sequence and protein structure, but also the protein’s function and interaction with other proteins. While some of the latter aspects are included in the SNPs&GO and SNAP programs, more research is required to resolve protein function and interactions in order for these to be incorporated into prediction tools. Therefore, in silico predictions should act only as an indicator of whether a variant of unknown significance is pathogenic or benign; where available, functional studies and clinical information should be used to characterise the variant.
Compared to the different combinations of
in silico tools, the metaservers were not significantly better despite their claims of improved performance over individual integrated tools [
30,
31]. The performance of both Meta-SNP and PredictSNP was comparable to that of many of the different combinations of
in silico prediction tools, with Meta-SNP performing slightly better. The metaservers performed better than individual
in silico tools; however, compared to the combination of tools with the best performance the metaservers’ accuracy, sensitivity and MCC scores were not as good. Despite this the metaservers had high specificity compared to some of the
in silico tools.
A major limitation of this study is the low number of benign SNVs for all three LQTS genes. Attempts to address this deficiency by including polymorphisms that have not been functionally characterised only led to a marginal increase in benign SNVs. Therefore, the analysis of the ability of the in silico tools to correctly predict an SNV to be benign may not be as reliable as the analysis of the tools’ ability to correctly predict an SNV to be pathogenic.
Another limitation is that in the current study only SIFT, PolyPhen-2, PROVEAN, SNPs&GO, SNAP, Meta-SNP and PredictSNP were investigated in the prediction of SNVs for LQT 1–3 genes; this is not to say that these are the only programs that are effective for these genes. For both
KCNQ1 and
KCNH2, sequence and evolutionary conservation-based
in silico prediction tools appear to work best as demonstrated by the success of PROVEAN, SIFT and SNPs&GO. The results for
SCN5A and an overall combination of
in silico tools for all three genes did not appear promising. However, for
SCN5A a combination of a sequence and evolutionary conservation-based method with a supervised-learning method that draws on a wide range of features may be the best choice. In this study, SNAP appeared to work better than the other four
in silico tools in analysing
SCN5A gene variants. SNAP is a supervised-learning method using neural networks to make predictions, and it incorporates evolutionary constraints, structural features and protein annotation information [
19].
Authors’ contributions
IUSL and DL were responsible for the acquisition of data, and the original analysis. AS was responsible for the statistical analysis and the data analysis of the revised manuscript. All authors have made substantial contributions to conception and design and interpretation of data. All authors have been involved in drafting the manuscript or revising it critically for important intellectual content. All authors have given final approval of the version to be published. All authors agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.