Background
Familial hypercholesterolemia (FH) is an autosomal-dominant genetic disorder characterized by elevated plasma low-density lipoprotein cholesterol (LDL-C) levels, with a prevalence between 1:250 and 1:500 across different world populations [1–4]. When left untreated, FH increases the risk of premature coronary artery disease (CAD), with an estimated 20% of myocardial infarctions (MIs) in patients aged under 45 years attributable to FH [2]. FH should be suspected in adults with LDL-C > 4.9 mmol/L (190 mg/dL) and children with levels > 4 mmol/L (160 mg/dL) combined with a family history of premature CAD [5–7]. Three formal sets of diagnostic criteria are widely used to diagnose FH: the Dutch Lipid Clinic Network (DLCN) [8, 9], Simon Broome [10], and Make Early Diagnosis to Prevent Early Death (MEDPED) criteria [11]. Of these three sets, the DLCN and Simon Broome criteria combine genetic variants in FH-causing genes with other clinical features.
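As a rough illustration of how a DLCN total score is turned into a diagnostic category, the sketch below applies the commonly published DLCN cut-offs (more than 8 points: definite; 6 to 8: probable; 3 to 5: possible; fewer than 3: unlikely). The function name is hypothetical, and the per-item point assignments (family history, clinical history, physical signs, LDL-C level, genetic findings) are deliberately not reproduced here.

```python
def dlcn_category(total_score: int) -> str:
    """Map a total DLCN score to a diagnostic category.

    Cut-offs follow the commonly published DLCN scheme; the scoring of
    individual criteria is outside the scope of this sketch.
    """
    if total_score < 0:
        raise ValueError("DLCN score cannot be negative")
    if total_score > 8:
        return "definite"
    if total_score >= 6:
        return "probable"
    if total_score >= 3:
        return "possible"
    return "unlikely"


# Illustrative scores only, not taken from any study participant.
example_scores = [1, 4, 7, 10]
example_categories = [dlcn_category(s) for s in example_scores]
```

Keeping the category mapping separate from the item scoring makes it easy to see how missing clinical items can only lower a participant's total and therefore their category.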
To date, pathogenic variants causing FH are predominantly reported in three genes: low-density lipoprotein receptor (LDLR), 95%; apolipoprotein B (APOB), 2–11%; and proprotein convertase subtilisin/kexin type 9 (PCSK9), 1% [1, 8, 9, 12]. Also, some recessive genes have been associated with FH, including Low-Density Lipoprotein Receptor Adaptor Protein 1 (LDLRAP1), ATP Binding Cassette Subfamily G Member 5 (ABCG5), ATP Binding Cassette Subfamily G Member 8 (ABCG8), and Lipase A, Lysosomal Acid Type (LIPA).
Polygenic inheritance is the most likely cause of disease in patients with a clinical diagnosis of FH without detectable variants in the LDLR, APOB, and PCSK9 genes (variants in the novel genes were observed in only a few cases) [13]. In 2013, Talmud et al. developed a 12-SNP LDL-C “SNP-Score” based on common variants that were associated with increased LDL-C levels in genome-wide association studies [13, 14]. Validation of this score in a European-Caucasian population showed that 80% of clinically diagnosed FH patients with no detectable mutations in LDLR, APOB, and PCSK9 have a polygenic inheritance [13].
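To make the idea of a weighted SNP score concrete, the sketch below sums per-SNP weights multiplied by the number of LDL-C-raising alleles an individual carries (0, 1, or 2). It is a minimal sketch, not the published score: the SNP identifiers and weights are placeholders invented for illustration, and only three SNPs are shown rather than twelve.

```python
from typing import Mapping

# Placeholder weights (effect per LDL-C-raising allele). These are NOT the
# published weights; they exist only to make the example runnable.
EXAMPLE_WEIGHTS: Mapping[str, float] = {
    "snp_a": 0.10,
    "snp_b": 0.05,
    "snp_c": 0.20,
}


def ldl_snp_score(risk_allele_counts: Mapping[str, int],
                  weights: Mapping[str, float]) -> float:
    """Weighted sum of LDL-C-raising allele counts across the scored SNPs."""
    score = 0.0
    for snp, weight in weights.items():
        count = risk_allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: allele count must be 0, 1, or 2")
        score += weight * count
    return score


# Hypothetical genotype: two risk alleles at snp_a, one at snp_b, none at snp_c.
example_genotype = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
example_score = ldl_snp_score(example_genotype, EXAMPLE_WEIGHTS)
```

A higher score means more common LDL-C-raising alleles, which is the basis for attributing a raised LDL-C to polygenic rather than monogenic causes.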
Although FH is primarily caused by dominant variants, rare cases have been found to harbor homozygous variants (prevalence 1:160,000–1:300,000) [9]. The incidence of homozygous FH (HoFH) is increased in Middle Eastern countries owing to the high degree of consanguinity. For example, the homozygous LDLR allele (p.C681X) is responsible for 60% of FH cases in Lebanon [15]. Another form of HoFH, termed autosomal recessive hypercholesterolemia (ARH), is caused by biallelic variants in the LDLRAP1 gene. ARH was first described by Khachadurian and Uthman in Lebanese families in 1973 [16] and has a global prevalence of less than 1 in 1 million [17]. However, ARH is more common on the island of Sardinia, Italy, owing to a founder effect and inbreeding. About 100 ARH patients have been reported so far, most of them from Sardinia [18]. The prevalence of ARH in Sardinia was estimated to be 1 in 40,000, and the frequency of heterozygous carriers is 1:143 [17]. ARH is also characterized by severely elevated LDL-C levels, tendon xanthomas, and premature CAD [19]. Half of the reported ARH patients have evidence of CAD [18]; however, no ARH patient has been reported with premature CAD before 20 years of age [20].
A recent census of FH cases in the Arabian Gulf (Kuwait, Oman, Qatar, Saudi Arabia, and the United Arab Emirates) showed 130,693 heterozygous carriers and 87 HoFH cases [21]. Notably, the EAS Familial Hypercholesterolaemia Studies Collaboration (FHSC) reported 57 FH genetic variants in 17 Middle Eastern and North African countries, while none were identified in Qatar [21]. Similarly, Alhababi and Zayed (2018) reported that no FH-related genetic variants had been found in 14 Arab countries, including Qatar [22]. Thus, the identity and prevalence of FH variants in the Qatari population have not been well established.
In the present study, we used a large Arab population biobank to assess the genetic burden of FH systematically and at scale, which may serve as a reference dataset for future studies of FH in the region. We conducted a large-scale characterization of FH alleles in an Arab population, using a whole-genome sequencing (WGS) dataset of 6,140 adult participants from the Qatar Genome Program (QGP). We used the extensive phenotypic data from Qatar BioBank (QBB) to diagnose FH in these 6,140 participants using the DLCN criteria. We assessed the presence of known pathogenic variants in LDLR, APOB, PCSK9, LDLRAP1, ABCG5, ABCG8, and LIPA in these individuals and evaluated novel variants in these genes for pathogenicity. Furthermore, we tested the utility of the globally established 12-SNP LDL-C SNP score for predicting polygenic FH risk in Arab populations.
Discussion
Familial hypercholesterolemia is the most common genetic cause of premature CAD [47] and is caused mainly by genetic variants in the LDLR, APOB, and PCSK9 genes. Although the global prevalence is estimated to be between 1:250 and 1:500, the identity and prevalence of FH variants in the Middle East have not been well established because of the lack of local or national registries [48]. Using the DLCN criteria, we identified 0.1% (8) definite, 0.7% (41) probable, and 5% (334) possible FH individuals. This suggests an estimated prevalence of ‘definite or probable’ FH in the QBB cohort of 1:125 (0.8%). These findings are comparable with those of the Gulf FH registry study (Saudi Arabia, Oman, United Arab Emirates, Kuwait, and Bahrain) of 34,366 patients, which estimated a prevalence of definite or probable FH of 1:232 (0.43%) in the region [47].
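The prevalence figures follow directly from the category counts. The short sketch below reproduces the arithmetic from the counts reported in this paragraph (8 definite, 41 probable, and 334 possible FH individuals among 6,140 participants); the helper names are illustrative only.

```python
COHORT_SIZE = 6140
DLCN_COUNTS = {"definite": 8, "probable": 41, "possible": 334}


def prevalence(n_cases: int, n_total: int) -> float:
    """Proportion of the cohort in a given category."""
    return n_cases / n_total


def one_in_n(n_cases: int, n_total: int) -> int:
    """Express a prevalence as '1 in N', rounded to the nearest integer."""
    return round(n_total / n_cases)


definite_or_probable = DLCN_COUNTS["definite"] + DLCN_COUNTS["probable"]
dp_prevalence_percent = 100 * prevalence(definite_or_probable, COHORT_SIZE)
dp_one_in = one_in_n(definite_or_probable, COHORT_SIZE)

print(f"definite or probable: {definite_or_probable} of {COHORT_SIZE}, "
      f"{dp_prevalence_percent:.1f}% (1:{dp_one_in})")
```

The same two helpers can be applied to each category separately to recover the 0.1%, 0.7%, and 5% figures.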
Studies indicate that 60–80% of those with a clinical diagnosis of ‘definite’ FH and 30% of ‘possible’ FH individuals have pathogenic variants in at least one of the three FH-causing genes [49]. In our cohort, however, we observed FH variants in 12% of ‘definite or probable’ FH individuals and 0.6% of ‘possible’ FH individuals. These mutation rates are low compared with those observed in lipid clinic patients, such as 63–80% for definite FH individuals (DLCN criteria) [50–52]. A possible explanation is referral bias: patients with severe phenotypes are preferentially referred to lipid clinics, unlike individuals with FH in the general population. Indeed, a community-based study, the Copenhagen General Population Study, reported mutation rates of 7.3% among those with ‘definite or probable’ FH and 1.2% among those with ‘possible’ FH, compared with 12% and 0.6%, respectively, in our study.
Leveraging the WGS data from QGP, we identified ten ClinVar P/LP variants, 14 novel predicted pathogenic SNVs, and a novel CNV in PCSK9 among the 6,140 participants. The genetic architecture of the QGP participants relative to the world population reveals five major ancestries: general Arabs (QGP_GAR), peninsular Arabs (QGP_PAR), Arabs of Western Eurasia and Persia (QGP_WEP), South Asians (QGP_SAS), and Africans (QGP_AFR) [30]. The LOF variant (c.313+3A>C) in the LDLR gene was the most common FH-causing variant in Qatar and was found in six heterozygous individuals, all of whom belong to the QGP_PAR subcluster (peninsular Arabs). Given that this variant is confined to this relatively ancient and isolated genetic subgroup, it has likely arisen as a result of a founder effect. This also suggests that the variant may be unique to the Arab population, which is further supported by its absence from population databases. Despite the high degree of consanguinity [53], no homozygous individuals carrying known P/LP variants in the three candidate genes (LDLR, APOB, and PCSK9) were identified in the QGP cohort. This might be due to the severity of homozygous FH, such that, without treatment, affected individuals do not survive past the second decade of life because of the very early risk of CAD. In addition, HoFH is rare, with an estimated global prevalence between 1:160,000 and 1:300,000 [9].
The cataloging of FH pathogenic variants and the diagnostic classification of QBB participants allowed us to estimate the clinical penetrance of pathogenic variants previously reported in clinical databases. For the 28 variants annotated as disease-causing (DM) in the HGMD, for example, we observed complete penetrance for only four variants and incomplete penetrance (range: 6–67%) for six variants, while the remaining 18 DM variants had zero penetrance. Conversely, three out of four ClinVar P/LP variants had high penetrance (≥50%). The zero penetrance of some DM variants might be attributable to (i) too few carriers to estimate the actual clinical penetrance or (ii) false positives in the HGMD database [54].
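A minimal sketch of the penetrance calculation is given below, assuming that clinical penetrance is estimated per variant as the fraction of carriers who meet the clinical definition of FH. The exact case definition, the function names, and the carrier counts are assumptions made for illustration; they are not taken from the study data.

```python
from typing import Optional


def penetrance(n_affected_carriers: int, n_carriers: int) -> Optional[float]:
    """Fraction of carriers who are clinically affected.

    Returns None when there are no carriers, because the estimate is
    undefined in that case.
    """
    if not 0 <= n_affected_carriers <= n_carriers:
        raise ValueError("affected carriers must be between 0 and total carriers")
    if n_carriers == 0:
        return None
    return n_affected_carriers / n_carriers


def penetrance_class(value: Optional[float]) -> str:
    """Label an estimate as zero, incomplete, or complete penetrance."""
    if value is None:
        return "not estimable"
    if value == 0.0:
        return "zero"
    if value == 1.0:
        return "complete"
    return "incomplete"


# Hypothetical (affected carriers, total carriers) pairs for three variants.
example_variants = {"variant_A": (3, 3), "variant_B": (2, 3), "variant_C": (0, 1)}
example_classes = {
    name: penetrance_class(penetrance(affected, carriers))
    for name, (affected, carriers) in example_variants.items()
}
```

The hypothetical `variant_C` shows why small carrier numbers matter: a single unaffected carrier yields an estimate of zero, which says little about the variant's true penetrance.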
A novel whole-gene duplication of PCSK9 was observed in an individual with a high LDL-C level (6.03 mmol/L). This is consistent with a previous report of two cases in which an entire PCSK9 duplication caused severe FH [55]. Structural mapping of the predicted pathogenic novel SNVs in PCSK9 (p.Arg303His, p.Ala68Asp, p.Gly59Arg) and LDLR (p.Asp472Asn) suggests that they are positioned in functionally critical regions of the PCSK9 and LDLR proteins, respectively.
In the QGP cohort, homozygous individuals carrying recessive FH variants were observed in the LDLRAP1 and ABCG8 genes. Of the two LDLRAP1 variants, p.Ser202Tyr was among the first six mutations identified in the LDLRAP1 gene by Garcia et al. (2001), in a Lebanese family, and was described as the ARH4 allele [56]. Two sisters from Lebanon, aged 7 and 17, carried this mutation, with LDL-C levels of 10.1 mmol/L and 13.4 mmol/L, respectively. These siblings also had a family history of CAD, and the father died of myocardial infarction at the age of 28 [56]. Three homozygous individuals carrying this variant were identified in the QGP cohort. All three self-reported hypercholesterolemia; two were being treated with cholesterol-lowering medications and one with diet management. Although the presence of homozygous individuals carrying this variant in population databases (GME, gnomAD) might suggest that it has low or incomplete penetrance, all three homozygous individuals in our cohort had been diagnosed with hypercholesterolemia, and two of them had undergone heart revascularization surgery.
The homozygous individual carrying the second LDLRAP1 variant (NP_056442.2: c.200C>T; p.Ser67Leu) was a 36-year-old male who had been diagnosed with hypercholesterolemia at the age of 31 and had undergone heart revascularization surgery. His parents were reportedly first cousins. He has been treated with cholesterol-lowering medications and reported no other comorbidities, such as obesity, hypertension, or diabetes mellitus. Furthermore, no homozygous individuals carrying this variant have been reported in gnomAD or GME. Pathogenicity prediction tools indicate that this variant may be deleterious, and it lies in a mutational hotspot of the protein, more specifically, in the region of the PTB/PID domain encoded by exon 2; this domain is necessary for the LDLRAP1 protein to bind the NPXY motif in the cytoplasmic tail of the LDL receptor.
We found one homozygous individual and four heterozygous carriers of the known pathogenic ABCG8 variant (p.Gly574Arg). The homozygous individual was a 47-year-old male who self-reported hypercholesterolemia and was treated with cholesterol-lowering medications and diet management. His parents were reported to be first cousins. This participant had an LDL-C level of 6 mmol/L, a total cholesterol level of 8 mmol/L, a triglyceride level of 2.1 mmol/L, and an HDL-C level of 1.03 mmol/L; however, his plant sterol level could not be determined because QBB does not hold these data. Although he has not had premature CAD, he has a family history of CAD, and his father died of a heart attack. Other comorbid conditions included obesity (BMI of 25.9), but no diabetes or hypertension was noted. This mutation was previously identified in a large Amish family in which a 13-year-old boy died of coronary atherosclerosis [57, 58]. Five of his twelve siblings developed tendon and tuberous xanthomas, as well as increased plasma plant sterols, particularly β-sitosterol.
Although there are no published data on the prevalence of Sitosterolemia 2 [59], it appears to be more common in Caucasians [59, 60]. In contrast, Sitosterolemia 1, caused by ABCG5 variants, is more prevalent in Indians, Chinese, and Japanese [59]. Based on LOF variants identified in the ExAC database, the global prevalence is estimated to be at least 1 in 360,000 for Sitosterolemia 2 and 1 in 2.6 million for Sitosterolemia 1 [59]. The prevalence of Sitosterolemia 2 in QGP was 1:6,140, which is high in comparison with the estimated global prevalence of 1:360,000.
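To put the comparison in numbers, the sketch below computes the fold difference between the observed cohort prevalence (one homozygous individual among 6,140 participants) and the estimated global prevalence of 1 in 360,000. It also computes, under a simple binomial assumption of independent sampling that is ours rather than the study's, how likely it would be to see at least one case in a cohort of this size if the global estimate applied. Because the cohort estimate rests on a single case, both numbers should be read as rough indications only.

```python
OBSERVED_CASES = 1
COHORT_SIZE = 6140
GLOBAL_PREVALENCE = 1 / 360_000  # lower-bound estimate cited in the text

cohort_prevalence = OBSERVED_CASES / COHORT_SIZE

# How many times higher the cohort prevalence is than the global estimate.
fold_difference = cohort_prevalence / GLOBAL_PREVALENCE

# Probability of observing at least one case among COHORT_SIZE independent
# individuals if the true prevalence equalled the global estimate.
p_at_least_one_case = 1 - (1 - GLOBAL_PREVALENCE) ** COHORT_SIZE

print(f"fold difference: {fold_difference:.1f}")
print(f"P(at least one case at global prevalence): {p_at_least_one_case:.3f}")
```

Under these assumptions the cohort prevalence is roughly 59 times the global estimate, and a case would be expected in fewer than 2 in 100 cohorts of this size.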
The LIPA variant (p.Thr288Ile), found in one heterozygous carrier, has been associated with childhood-onset lysosomal acid lipase deficiency (LAL-D), previously known as cholesteryl ester storage disease (CESD). This variant was previously reported in the homozygous state in an Italian child with onset at the age of 2 years, who presented with hepatosplenomegaly, dyslipidemia, and elevated transaminases [61].
Predicting whether clinical FH has a monogenic or polygenic cause can help clinicians select the most effective and inexpensive lipid-lowering medications, and is a clear example of the use of genetic information in precision medicine [13]. We investigated the 12-SNP LDL-C-raising score, which the Bristol Genetics laboratories currently use in the UK for genetic screening of patients with a clinical diagnosis of FH [62]. We observed that 90% of mutation-negative ‘definite or probable’ FH individuals had SNP scores within the top three quartiles of the SNP score distribution of ‘unlikely’ FH individuals, suggesting a polygenic cause. This finding is consistent with a previous study in a European-Caucasian population, which concluded that 80% of mutation-negative clinically diagnosed FH patients have a polygenic inheritance as an explanation for their high cholesterol [13, 34]. Further, we observed that ‘definite or probable’ and ‘possible’ FH individuals had significantly higher LDL-C SNP scores than ‘unlikely’ FH individuals. Our results support the hypothesis that individuals at risk of hypercholesterolemia are likely to carry common LDL-C-raising alleles and might have polygenic inheritance. Further, we show that the 12-SNP LDL-C SNP score can be used to assess polygenic risk in Arab populations, although these SNPs were derived from Caucasian populations.
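The quartile comparison can be sketched as follows: take the SNP scores of the ‘unlikely’ FH group as the reference distribution, find its first quartile, and count the fraction of mutation-negative cases whose score lies above that cut-off (that is, within the top three quartiles). This is a minimal sketch using the Python standard library; the scores are invented for illustration, and the quantile method (the default "exclusive" method of `statistics.quantiles`) is an assumption, since the text does not state how quartiles were computed.

```python
import statistics
from typing import Sequence


def first_quartile(reference_scores: Sequence[float]) -> float:
    """First quartile (25th percentile) of the reference score distribution."""
    return statistics.quantiles(reference_scores, n=4)[0]


def fraction_in_top_three_quartiles(case_scores: Sequence[float],
                                    reference_scores: Sequence[float]) -> float:
    """Fraction of case scores lying above the reference first quartile."""
    cutoff = first_quartile(reference_scores)
    n_above = sum(score > cutoff for score in case_scores)
    return n_above / len(case_scores)


# Invented scores: a reference ('unlikely' FH) group and a case group.
reference_scores = [0.70, 0.80, 0.85, 0.90, 0.95, 1.00, 1.05, 1.10]
case_scores = [0.75, 0.98, 1.02, 1.08, 1.12]

example_fraction = fraction_in_top_three_quartiles(case_scores, reference_scores)
```

By construction, about 75% of the reference group itself lies above this cut-off, so the fraction observed in cases is best interpreted against that baseline.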
Despite the important findings of our study, there are some limitations. The QBB phenotypic data lack clinical features such as tendon xanthomas or arcus cornealis for the participants and their first-degree relatives, features that are assigned higher scores in the DLCN criteria. However, the same limitation applied to other general population studies, such as the Copenhagen study, which used the DLCN criteria for FH diagnosis in 98,098 participants [1].