INTRODUCTION

Type 2 diabetes mellitus (T2DM) has become a serious public health threat as its prevalence increases. Vigorous efforts have been made to identify genetic factors underlying susceptibility to T2DM using genetic association analyses. In particular, several genome-wide association studies were recently conducted.1, 2, 3, 4, 5 Nevertheless, their results revealed inconsistencies in the associations of many nucleotide sequence variants.6 A serious publication bias was suspected because negative results from candidate gene association analyses were less likely to be published. Another concern was that retrospective designs (eg, case–control studies) might have confounded some genetic effects with other effects. More prospective studies based on genome-wide association analysis are needed to determine whether such inconsistencies were caused by genetically different underlying population structures and various environmental exposures or by false findings. This study aimed to examine the genetic associations of previously identified sequence variants with T2DM using large-scale Korean cohort data.

Incidence rates of T2DM are known to differ between males and females, but the size of the difference has varied among studies.7 As a second aim of the current study, we examined whether genetic effects differed by gender, which might explain the heterogeneous incidence of T2DM.

METHODS

Subjects and data

The Korea Association REsource (KARE) Analysis Consortium was established to understand the human genetic basis by conducting a large-scale genome-wide association study. It has cohort data of 10 038 unrelated Korean individuals collected by the Korean Genome Epidemiology Study (KoGES). Data collection was initiated in 2001, and follow-up examinations of each participant have since been conducted every 2 years. Genotypic data of the KARE were obtained using the Affymetrix Genome-Wide Human SNP Array 5.0 (Affymetrix, Inc., Santa Clara, CA, USA). For details, see Cho et al.8 An underlying set of unphased genotypes for each individual in the cohort was imputed with the Japanese and Chinese HapMap phase 2 haplotype panel using the IMPUTE software program (version 2, http://mathgen.stats.ox.ac.uk/impute). A total of 8842 individuals were from the Ansung (2374 men and 2263 women) and Ansan (1809 men and 2396 women) population-based cohorts in Gyeonggi Province. These data were obtained after screening by genotype calling and quality control.8 Eight individuals without phenotype or covariate information were excluded from the current analysis.

The mean age of the remaining 8834 subjects was 52.2±8.9 years and their mean BMI was 24.6±3.1 kg/m². Of these 8834 subjects, 613 self-reported as patients with T2DM and were considered to have T2DM diagnosed according to the American Diabetes Association (ADA) criteria (fasting plasma glucose ≥126 mg/dl or 2-h plasma glucose ≥200 mg/dl). All other subjects in the cohort were used as controls. Table 1 compares the characteristics of the patients with those of the controls by gender.
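The glucose thresholds quoted above can be written as a simple rule. The following is a minimal sketch only: the function name, the constant names, and the handling of missing measurements (passed as None) are assumptions made for illustration, and the study itself relied on self-reported diagnosis rather than on code of this kind.

```python
# Thresholds quoted in the text (mg/dl).
FASTING_THRESHOLD_MG_DL = 126.0
TWO_HOUR_THRESHOLD_MG_DL = 200.0


def meets_ada_glucose_criteria(fasting_mg_dl, two_hour_mg_dl):
    """Return True if either glucose value reaches its threshold.

    A value of None means the measurement is missing (an assumption of
    this sketch); a missing value never satisfies a criterion.
    """
    fasting_hit = (
        fasting_mg_dl is not None
        and fasting_mg_dl >= FASTING_THRESHOLD_MG_DL
    )
    two_hour_hit = (
        two_hour_mg_dl is not None
        and two_hour_mg_dl >= TWO_HOUR_THRESHOLD_MG_DL
    )
    return fasting_hit or two_hour_hit
```

Either criterion alone is sufficient, so a subject with a normal fasting value but a 2-h value of 200 mg/dl or more would still be classified as meeting the criteria.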

Table 1 Characteristics of subjects studied in the current study

Marker selection

We analyzed the genetic association of T2DM with SNP markers identified by previous studies. The SNPs were selected based on peer-reviewed scientific publications using SNPedia (http://www.snpedia.com), a wiki-based database of SNPs associated with human diseases. An SNP was selected if its association had been reported by multiple studies, or by a single study that involved a meta-analysis, a large consortium (at least 500 patients), or a multi-laboratory consortium. All associations of the SNPs with susceptibility to T2DM were confirmed against the original scientific articles that reported them.

Statistical analysis

The genotypic association of each SNP with susceptibility to T2DM was tested by a χ² statistic with two degrees of freedom. The analytical model included age, gender, and BMI as covariates. The significance threshold was 0.05, and the Bonferroni correction was applied to account for multiple testing. The association analysis was further conducted with the data partitioned by gender, with adjustment for age and BMI. All association analyses were conducted using the PLINK (version 1.06, http://pngu.mgh.harvard.edu/purcell/plink) and SPSS (version 12.0, SPSS Inc., Chicago, IL, USA) software programs.
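To illustrate what a genotypic test with two degrees of freedom measures, the sketch below computes the unadjusted Pearson χ² statistic on a 2 × 3 table of genotype counts (patients and controls by the three genotypes). It is not the covariate-adjusted model that was run in PLINK; the function name is hypothetical and the counts in the usage note are invented for illustration. For two degrees of freedom the upper-tail probability has the closed form exp(−χ²/2), so only the standard library is needed.

```python
import math


def genotypic_chi2_test(case_counts, control_counts):
    """Unadjusted Pearson chi-square test on a 2 x 3 genotype table.

    case_counts and control_counts each hold the numbers of subjects
    with genotypes (AA, AB, BB). Returns (chi2, p_value) with 2 df.
    """
    observed = [list(case_counts), list(control_counts)]
    if any(len(row) != 3 for row in observed):
        raise ValueError("each group needs exactly three genotype counts")

    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("empty group or genotype class; test undefined")

    chi2 = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 2 df.
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value
```

With invented counts, `genotypic_chi2_test((30, 20, 10), (10, 20, 30))` gives a statistic of 20, whereas two groups with identical genotype proportions give a statistic of 0 and a P-value of 1.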

RESULTS

We obtained 41 previously identified SNPs using SNPedia. Twenty-four of the 41 SNPs had been genotyped in this genome-wide association study of the Korean cohorts, and 10 SNPs with imputed genotypes were additionally used in the current association analysis. As rs12255372 in the TCF7L2 gene was monomorphic, it was excluded from the current association analysis. A total of 33 SNPs were analyzed, and none of them except rs12304921 deviated from Hardy–Weinberg equilibrium (P>0.05). The association analysis revealed nine SNPs associated with susceptibility to T2DM (P<0.05), five of which remained significant after Bonferroni correction (P<0.0015, Pcorr<0.05, Table 2). One was an intergenic sequence variant on chromosome 10, located ∼10 kb from the 3′ end of the hematopoietically expressed homeobox (HHEX) gene. The other four SNPs were all located within intron 5 of the cyclin-dependent kinase 5 (CDK5) regulatory subunit-associated protein 1-like 1 (CDKAL1) gene.
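Two quantities in this paragraph can be checked with a short calculation. The sketch below shows the textbook one-degree-of-freedom χ² test for Hardy–Weinberg equilibrium and the Bonferroni per-test threshold. Both function names are hypothetical, and the text does not state which equilibrium test was actually used (PLINK may apply a different procedure, such as an exact test), so this is an illustration under those assumptions rather than a reproduction of the analysis.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes and returns
    (chi2, p_value). Raises ValueError for a monomorphic marker,
    for which the test is undefined.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic marker; test undefined")

    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests
```

With 33 SNPs tested at an overall level of 0.05, `bonferroni_threshold(0.05, 33)` is about 0.0015, which agrees with the corrected threshold reported above.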

Table 2 Associations of previously identified SNPs with T2DM susceptibility in Koreans

Further analysis of the significant SNPs with the data partitioned by gender showed heterogeneous results (Table 3). Three sequence variants, rs7756992 (CDKAL1), rs9465871 (CDKAL1), and rs5015480 (HHEX), were significantly associated with susceptibility to T2DM in females (P<0.005, Pcorr<0.05). In contrast, no variants were significantly associated in males (Pcorr>0.05).

Table 3 Associations of SNPs with T2DM susceptibility by gender

DISCUSSION

The current replication study revealed associations of five previously identified sequence variants with susceptibility to T2DM. Notably, all were located within the CDKAL1 gene except for the intergenic variant rs5015480 near the HHEX gene. The association of the CDKAL1 gene concurred with the results of previous studies in American,1 British,5 French,3 Danish,4 Finnish,2 Swedish,1 Icelandic,4 Korean,9 Japanese,10 and Chinese11, 12, 13 populations. Specifically, the significantly associated variants of the gene in the current study corresponded to those reported by Horikawa et al14 and Wu et al11 for rs7756992 and by Zeggini et al5 and Wu et al11 for rs9465871 and rs10946398, respectively. The replicated results strengthened the finding that the gene and its sequence variants confer risk of T2DM. This could be explained by the role of CDKAL1 in insulin secretion. CDKAL1 has a domain similar to CDK5 regulatory subunit-associated protein 1 (CDK5RAP1), a neuronal protein that specifically inhibits activation of CDK5.15 Reduced expression of CDKAL1 enhances the activity of CDK5 in β cells and thus decreases insulin secretion.16 The role of CDKAL1 in insulin secretion may be influenced by alternative splicing. Allelic substitutions of its sequence variants were predicted to modify a variety of functional sites involved in splicing (Supplementary Data), and this in silico prediction supported the putative roles of the sequence variants in insulin secretion.

Because rs5015480 lies outside known protein-coding sequences, its location reveals little about its function. Nevertheless, a potential role in regulating the HHEX gene was suspected from previous genome-wide association studies in which several variants in and around the gene, including rs5015480, were associated with T2DM in Europeans.2, 3, 5 The associations were also replicated in Japanese17 and Chinese12 populations. Furthermore, other variants of the gene were associated with impaired pancreatic β-cell function and thus with decreased insulin secretion.18, 19

Further analysis showed associations of three (rs7756992, rs9465871, and rs5015480) of the five significant SNPs with susceptibility to T2DM in females. No associations were observed in males. The two most significant SNPs were sequence polymorphisms located in CDKAL1. The heterogeneity by gender was a novel finding for T2DM. The larger effect of CDKAL1 in females concurred with the study of Steinthorsdottir et al,4 which showed a stronger association of rs7756992 with insulin secretion in females (P<0.00001) than in males (P<0.001). Such a differential effect would deflate the association when males are included in the association analysis. For example, such deflation was suspected in a study that found only a nominal effect with a large ratio of males to females (1.5 for both patients and control subjects).10 The genetic heterogeneity by gender might be attributed to sex hormones, which considerably influence insulin sensitivity and secretion.20, 21 In particular, as estrogen could affect susceptibility to T2DM22 and regulate the activity of cdk5 in adult rat uterus,23 it is an important candidate for explaining the gender-specific effect of CDKAL1 on T2DM susceptibility. In addition, estrogen could enhance the serum level of thyroid hormone,24 whose production is regulated by HHEX, a crucial transcription factor.25 A potential interaction of HHEX with the sex hormone might produce the heterogeneous genetic effect of HHEX on T2DM susceptibility by gender.

The current study provided the first evidence of a heterogeneous association by gender between susceptibility to T2DM and two genes, CDKAL1 and HHEX. The identified genetic variants conferring risk for the disease were common SNPs, which may be of considerable importance for medical application. Functional studies of the heterogeneous association are warranted to elucidate its underlying mechanisms.