GWAS and PheWAS of red blood cell components in a Northern Nevadan cohort

  • Robert W. Read,

    Roles Conceptualization, Formal analysis, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Applied Innovation Center, Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, United States of America

  • Karen A. Schlauch,

    Roles Conceptualization, Formal analysis, Project administration, Visualization, Writing – original draft, Writing – review & editing

    Affiliation Applied Innovation Center, Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, United States of America

  • Gai Elhanan,

    Roles Data curation, Writing – review & editing

    Affiliation Applied Innovation Center, Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, United States of America

  • William J. Metcalf,

    Roles Data curation, Resources

    Affiliation Applied Innovation Center, Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, United States of America

  • Anthony D. Slonim,

    Roles Conceptualization, Funding acquisition, Writing – original draft, Writing – review & editing

    Affiliation Renown Health, Reno, NV, United States of America

  • Ramsey Aweti,

    Roles Methodology

    Affiliation 23andMe, Inc., Mountain View, CA, United States of America

  • Robert Borkowski,

    Roles Methodology

    Affiliation 23andMe, Inc., Mountain View, CA, United States of America

  • Joseph J. Grzymski

    Roles Conceptualization, Funding acquisition, Methodology, Project administration, Writing – review & editing

    joeg@dri.edu

    Affiliations Applied Innovation Center, Renown Institute for Health Innovation, Desert Research Institute, Reno, NV, United States of America, Renown Health, Reno, NV, United States of America

Abstract

In this study, we perform a full genome-wide association study (GWAS) to identify single nucleotide polymorphisms (SNPs) that are statistically significantly associated with three red blood cell (RBC) components, and follow it with two independent PheWASs to examine associations between phenotypic data (case-control status of diagnoses or disease), significant SNPs, and RBC component levels. We first identified associations between the three RBC components, mean platelet volume (MPV), mean corpuscular volume (MCV), and platelet counts (PC), and the genotypes of approximately 500,000 SNPs on the Illumina Infinium DNA Human OmniExpress-24 BeadChip using a single cohort of 4,673 Northern Nevadans. Twenty-one SNPs in five major genomic regions were found to be statistically significantly associated with MPV, two regions with MCV, and one region with PC, with p < 5x10^-8. Twenty-nine SNPs and nine chromosomal regions were identified in 30 previous GWASs, with effect sizes of similar magnitude and direction as found in our cohort. The two strongest associations were SNP rs1354034 with MPV (p = 2.4x10^-13) and rs855791 with MCV (p = 5.2x10^-12). We then examined possible associations between these significant SNPs and incidence of 1,488 phenotype groups mapped from International Classification of Disease version 9 and 10 (ICD9 and ICD10) codes collected in the extensive electronic health record (EHR) database associated with Healthy Nevada Project consented participants. Further leveraging data collected in the EHR, we performed an additional PheWAS to identify associations between continuous RBC component measures and incidence of specific diagnoses. The first PheWAS illuminated whether SNPs associated with RBC components in our cohort were linked with other hematologic phenotypic diagnoses or diagnoses of other nature. Although no SNPs from our GWAS were identified as strongly associated with other phenotypic components, a number of associations were identified with p-values ranging between 1x10^-3 and 1x10^-4 with traits such as respiratory failure, sleep disorders, hypoglycemia, hyperglyceridemia, GERD and IBS. The second PheWAS examined possible phenotypic predictors of abnormal RBC component measures: a number of hematologic phenotypes such as thrombocytopenia, anemias, hemoglobinopathies and pancytopenia were found to be strongly associated with RBC component measures; additional phenotypes such as (morbid) obesity, malaise and fatigue, alcoholism, and cirrhosis were also identified as possible predictors of RBC component measures.

Introduction

The complete blood count (CBC) is a widely used medical diagnostic test that is a compilation of the number, size, and composition of various components of the hematopoietic system. Abnormal CBC measures may indicate illness or disease. Mean corpuscular volume (MCV), platelet count (PC), and mean platelet volume (MPV) are specific CBC characteristics (hereafter called RBC components) and are linked to complex disorders such as anemia, alpha thalassemia and cardiovascular disease [1–5]. Platelets are involved in vascular integrity, wound healing, immune and inflammatory responses, and tumor metastasis; the role of platelets is also paramount in hemostasis and in the pathophysiology of atherothrombosis and cancer [6–12]. Additionally, abnormally high mean platelet volumes (MPV) are considered a predictor of post-event outcome in coronary disease and myocardial infarction [13].

Furthermore, studies have shown that red blood cell components differ between individuals living at higher altitudes and those living at sea level. At approximately 4,400 feet above sea level, Northern Nevada, where this study is conducted, is considered a high desert in the Sierra Nevada foothills. Alper showed that mean platelet volume (MPV) is 7.5% higher at altitudes greater than 4,000 feet than at sea level [14]. Similarly, Hudson showed a notable and statistically significant positive correlation between platelet counts (PC) and altitude [15], while mean corpuscular volume (MCV) was recorded as lower at higher altitudes than at sea level [16]. As RBCs help transport oxygen throughout the entire body, the identification of RBC-related genotypic mutations, especially in an RBC high-turnover environment, is valuable. Lastly, the identification of genomic regions with roles in megakaryopoiesis and platelet formation, as well as neoplastic conditions like polycythemia vera and essential thrombocytosis (ET) [17,18], may help identify those that have a higher risk of certain complex RBC diseases.

Given the importance of these three RBC components, we conducted a study to identify both genetic and phenotypic associations with all three characteristics via GWASs and PheWASs. Our study begins with the Healthy Nevada Project, a single cohort formed in 2016 to investigate factors that may contribute to health outcomes in Northern Nevada. Its first phase provided 10,000 individuals in Northern Nevada with genotyping on the 23andMe 2016 Illumina Human OmniExpress-24 BeadChip platform at no cost. Renown Hospital is the largest hospital in the area, and 75% of these 10,000 individuals are cross-referenced in its extensive EHR database.

As noted above, previous GWASs have identified significant genetic links with all three RBC components we examine in this study, MPV, MCV and PC [13,17–45]. Lin et al. 2007 identified a strong genetic link with MCV in region 11p15 using the Framingham cohort [19]; Kullo et al. 2010 leveraged EHR data from the Mayo Clinic to detect four genes strongly associated with at least one of the three RBC components [27]. Similarly, a number of regions were linked with PC and MPV in an African American cohort [35]; Shameer detected five regions associated with PC and eight with MPV [18].

Our study first performed a genome-wide association study (GWAS) of 4,673 genotyped Northern Nevadans who have at least one recorded value for one of the three RBC components MPV, MCV and PC to examine the genetic component of these components. We found 38 SNPs to be statistically significantly associated (p < 5x10^-8) with one of the three RBC components. Many of these associations were previously reported, yet our study did identify nine novel SNPs in six different regions. While there were few new associations discovered in our cohort, we identified several SNPs that fall within genes influencing megakaryocyte maturation, platelet volume, platelet signaling and diseases such as anemia. Further, with extensive linked electronic medical record (EMR) data, we had the ability to perform a PheWAS of 1,488 standard lab results (phenotypes) against each SNP found to be associated with RBC components in the Northern Nevadan cohort to examine pleiotropy. Additionally, we examined the RBC components phenotypically, using the linked EMR data to determine the relationship between measures of each component and a variety of clinical conditions recorded in patients. Many relevant and strongly statistically significant associations were identified, especially with hematologic components; other traits not currently shown to be linked to RBC components, such as obesity, alcoholism and cirrhosis, were also detected.

Results

Characteristics of cohort

We examined 4,673 genotyped individuals with at least one recorded RBC measure; 4,563 individuals in the cohort had measures for all three components. Table 1 describes the cohort with respect to gender, age, ethnic origin, and standardized value of each RBC component. Note that all values for each component were standardized to the most current lab test administered for that component via linear transformation. Normalization of test values was necessary as lab tests were updated across the 13 years of data collection. The normal (healthy) reference values to which all individual records were standardized are also presented in Table 1. The mean standardized RBC component values for each individual are available in S1 Table.

GWAS of RBC components

After SNP quality control, there were 498,916 high-quality SNPs and 4,627 participants in the MCV cohort utilized for association studies, with mean autosomal heterozygosity of 0.321. The same quality control process yielded 4,564 participants for MPV with the same mean autosomal heterozygosity of 0.321. Similarly, the PC cohort consisted of 4,673 participants with the same mean autosomal heterozygosity. Using the average measures of each individual’s MPV, PC and MCV lab records, a standard GWAS under the additive model with adjustments for gender, age and the first four principal components was performed using PLINK 1.9. Genomic inflation coefficients (lambda) were computed for each cohort: 1.031 for MPV, 1.027 for PC, and 1.045 for MCV.

Any SNP with an association p-value of p < 5x10^-8 was considered a statistically significant association, following current standards [28,32,46,47]. The percentage of phenotypic variance attributed to genetic variation was computed with a combination of PLINK and GCTA [48]: genetic variance was 35.3% for MCV; 32.2% for MPV; 20.7% for PC. The three individual GWASs identified a total of 38 SNPs that associated with an RBC component with statistical significance. Manhattan plots of the three GWAS results are presented in S1A–S1C Fig. As an example for the reader, we include a Manhattan plot for MCV in the manuscript (Fig 1).

Fig 1. MCV GWAS Manhattan plot.

Genome-wide association study results for MCV. The x-axis represents the genomic position of 498,916 SNPs. The y-axis represents -log10-transformed raw p-values of each genotypic association. The red horizontal line indicates the threshold of significance p = 5x10^-8.

https://doi.org/10.1371/journal.pone.0218078.g001

MPV

A GWAS was performed on a cohort of 4,564 genotyped participants with MPV laboratory measures. We identified 21 SNPs across five different chromosomal regions that reached genome-wide significance (p < 5x10^-8; Table 2). Of these, 13 demonstrated previous associations in at least one other study, with six associated with RBC components (S2 Table) [13,17,18,25,28,30,33,35,49–58]. All five significant chromosomal regions were previously associated with MPV [17,18]. The fifth region, 18q22.2, contains three SNPs associated in our cohort with average p-value p = 3.86x10^-9; however, none of the individual SNPs have been previously associated with MPV. Results are presented in Table 2.

MCV

A GWAS was performed on a cohort of 4,627 genotyped participants with MCV laboratory measures. There were 14 SNPs that were significantly associated with MCV (Table 2). These SNPs lie in three chromosomal regions, predominantly in 6q23.3 and 22q12.3. These two regions have detailed annotation and were linked previously with MCV (S2 Table) [20,27,32]. All but four of the SNPs are in non-coding regions. These four SNPs lie in TMPRSS6. The gene TMPRSS6 codes for the protein matriptase-2, which is part of a signaling pathway that regulates blood iron levels [31]. The two SNPs rs855791 and rs4820268 showed the strongest association with MCV (p < 1x10^-11). These two SNPs also lie in TMPRSS6 and cause a missense and synonymous mutation, respectively. Results are presented in Table 2.

PC

A GWAS was performed on a cohort of 4,673 genotyped participants with PC laboratory measures. Three SNPs were identified with statistically significant (p < 5x10^-8) links to PC values in our cohort, two of which were previously identified in other studies (S2 Table) [17,25,26,34]. The SNP rs10974808 is in the same cytoband region (9p24.1) as the others but has not been linked to PC. The three SNPs have different effects on PC: rs385893 and rs423955 have negative effect sizes (β = -7.744 and -7.387, respectively), while rs10974808 has a positive effect (β = 11.490). The minor allele of rs10974808 is much rarer (MAF = 11.48%) than those of rs385893 (49%) and rs423955 (31.17%). Results are presented in Table 2.

Comparison to other GWAS studies

The Northern Nevada cohort had mean standardized MPV values of 10.58 ± 0.98 fL, comparable to levels reported in the Health ABC cohort described in Qayyum (10.9 ± 1.6 fL), and two European cohorts investigated in Geiger (10.53 ± 1.08, 10.83 ± 0.87) [28,35]. The Nevadan cohort had MCV values of 91.53 ± 4.5 fL, also comparable to those described in Kullo (90.5 ± 4.2 fL) and Ding in the Mayo and Johns Hopkins Group Health Cooperative cohorts (90.53 ± 4.17 and 91.56 ± 4.49, respectively), as well as several European cohorts in Geiger (e.g., 91.5 ± 4.2, 91.4 ± 4.41, 91.1 ± 4.44, 92.0 ± 4.3) [27,28,32]. Mean standardized PC values in the Nevadan cohort (251 ± 62.23 K/uL) were very similar to many of the cohorts examined in Geiger (e.g., 258.6 ± 63.1, 252 ±71.7, 250.9 ±64.8, 247 ± 64.7) [28].

Our three GWAS results were in close correlation with many of the other studies. For example, the locus rs7961894 in the WDR66 gene on 12q24.31 was found to be associated with MPV in our cohort and in Meisinger as a top hit [24]. Effect sizes in Meisinger were larger than ours (1.03 vs. 0.22), but the number of minor alleles predicted an increase in MPV for both studies. Another SNP, rs342240, was one of our cohort’s top associations with MPV, and was also identified by Shameer and Soranzo as a significant link to MPV [17,18]. Similarly, locus rs385893 was identified as a possible predictor of PC by Soranzo and our cohort, with very similar, notably large negative effect sizes (-6.24 and -7.74, respectively). Kullo also found SNP rs7775698 to be significantly associated with MCV, with similar positive effect sizes as our study (0.92 vs 0.56) [27]. Soranzo et al. identified rs9402686 as a top link with MCV, and again, effect sizes were similar to ours (0.82 vs 0.65) [17].

ANOVA

The mean component values across genotypes presented in S2 Table correlate with negative and positive effect sizes: SNPs showing a negative effect size have a decrease in component values across the genotypes from left to right (homozygous in major allele, heterozygous, homozygous in minor allele). All ANOVA p-values of the significant SNPs identified in this study are significant, even after a simple Bonferroni correction (0.05/38 ≈ 0.0013). A box and whisker figure of ANOVA results for the top hit SNP rs7961894 is shown in S2 Fig.

PheWAS of RBC components

The first PheWAS examined possible associations between significant SNPs identified in each RBC trait GWAS and 1,488 phenotypic groups. At significance levels 1x10^-4 < p < 1x10^-3, putative associations of MCV-specific SNPs included respiratory failure; those with PC included GERD and other diseases of the esophagus. Our study also showed links between MPV-associated SNPs and skin cancer, hypoglycemia, hyperglyceridemia, and IBS, among others. These associations are outlined in S3A–S3C Fig.

The second PheWAS investigated whether the 1,488 phenotype groups were associated with the levels of each RBC component; more specifically, the analysis identified whether the number of cases in a phenotype group was a predictor of the level of the component (Table 3). For example, the PheWAS examining associations of MPV levels presented significant links with thrombocytopenia and purpura (p < 1x10^-8). Interestingly, Vitamin D deficiency was also shown to be a predictor of MPV levels, although at a lower significance level (p < 1x10^-6). Incidence of malaise and fatigue was also found to be a potential predictor of MPV in our cohort.

Associations with MCV included hemoglobinopathies and hemolytic anemias (p < 1x10^-35), as well as iron deficiency anemias (p < 1x10^-20). Again, association with (morbid) obesity was evident (p < 1x10^-20). Alcoholism and related liver diseases were associated with MCV at a significance level of p < 1x10^-8; abnormal glucose and diabetes were also linked to MCV at p < 1x10^-5. We identified a strong association in our cohort between platelet counts and thrombocytopenia and purpura (p < 1x10^-30). Associations with other hematologic phenotypes such as various anemias and pancytopenia also reached significance (p < 1x10^-8). Additionally, (morbid) obesity and cirrhosis were statistically significantly associated with PC at the p < 1x10^-8 significance level. These three PheWAS results are shown in S4A–S4C Fig. As an example for the reader, we include the PheWAS results for MCV in Fig 2.

Fig 2. MCV PheWAS plot.

This figure illustrates the results of individual linear regression between incidence of phenotype groups (phecodes) and continuous MCV component measures. The model includes age, gender and ethnicity as covariates. Each point represents the p-value of the association between one of 1,488 phecodes with at least 20 cases assigned to it, and the MCV component measure. The horizontal red line represents the significance level p = 3.4x10^-5.

https://doi.org/10.1371/journal.pone.0218078.g002

Discussion

In this study, we first performed three independent GWASs of 4,673 Healthy Nevada Project participants with 500,000 genotypes against the RBC components: platelet count, mean platelet volume and mean corpuscular volume. We followed these with two independent PheWASs for each component to identify additional phenotypic associations with each blood component-significant SNP, and phenotypic associations with measures of each blood component.

Our genome-wide association analysis identified ten different chromosomal cytoband regions associated with at least one RBC component. Nine of those regions were previously associated to RBC components in other studies; the region 22q13.33 represents a novel region in our study [17,18,20,25,27,28,30,32,49,59,60]. Nine genes lie in the cytoband regions: their functions are outlined in Table 4.

Our GWAS results were very similar to previous MPV GWAS associations. The most significant genetic association with MPV (rs1354034, p = 2.39x10^-13) is found in an intronic region within ARHGEF3 on chromosome 3p14.3. The gene ARHGEF3 codes for a Rho guanine nucleotide exchange factor 3 protein and was associated with MPV in previous studies [13,17,18,28,33,61], further demonstrating that our study was able to replicate associations with RBC components in prior single-cohort studies. The mechanism by which rs1354034 affects MPV values is still ambiguous. As it lies in a DNase I hypersensitive region within open chromatin, it could directly affect ARHGEF3 expression in human megakaryocyte maturation [61]. Our second most significant association (rs7961894, p = 2.68x10^-11) was also previously linked with MPV [13,18,24,28]. This SNP lies in intron 3 of WDR66 on chromosome 12q24.31. Expression levels of WDR66 have been directly tied to MPV, possibly indicating that WDR66 is involved in the establishment of platelet volumes. SNP rs7961894 is not directly correlated with WDR66 expression levels, implying an indirect role possibly through other regulatory mechanisms [24].

We also identified several SNPs on chromosome 10q21.3 to be associated with MPV in our cohort that were linked to sex hormone levels in previous studies [53]. This may imply a possible relationship between sex hormone levels and MPV. These SNPs almost exclusively lie in JMJD1C, a gene that encodes a probable histone demethylase and may have a function in hormone-dependent transcriptional activation [17]. This could indicate that the transcription of certain hematopoietic target genes may be enhanced or repressed when specific sex hormones are present; however, the exact targets and mechanisms have yet to be studied and clinical evidence for such an association is scant.

Further, the chromosomal region 18q22.2 was shown to be associated with MPV [13], although the significant SNPs in this region have not been linked to MPV in previous studies. Three out of the four SNPs in this region are intronic to CD226, while one is in an untranslated region of DOK6. CD226 codes for a protein that mediates the binding of activated platelets to endothelial cells and may participate in platelet signal transduction [63]. Soranzo et al. also identified this gene as having a possible role in megakaryocyte (MK) development; thus, these SNPs in CD226 may influence platelet development [17]. DOK6 encodes a docking protein, necessary for protein scaffolding, but to our knowledge has no known relation to platelet function; therefore, the functional relevance of a SNP in this gene is ambiguous. The mechanism by which these SNPs within 18q22.2 affect CD226, DOK6 or MPV is also currently unknown.

The majority of SNPs associated with MCV and PC are in non-coding regions, and most were associated with these components in previous studies [17,27,32,45]. Our two strongest associations with MCV, rs855791 (p = 5.23x10^-12) and rs4820268 (p = 2.65x10^-11), are in the gene TMPRSS6 and could cause altered function or loss of function of the matriptase-2 protein. Altered function of the protein will likely influence iron status within the body, demonstrating why these SNPs are highly associated with anemia caused by iron deficiency [31,38]. PC was associated with only a single gene in our GWAS. This gene, RCL1, which encodes an RNA terminal phosphate cyclase-like 1 protein, was previously associated with PC [28]. The SNP associated with PC in this gene (rs10974808, p = 3.53x10^-9) in our cohort has not been linked to PC by other studies to the best of our knowledge. Our strongest association (rs385893, p = 8.04x10^-10) was previously found to affect JAK2, a gene 400 kb downstream of the locus and a key regulator of megakaryocyte maturation, illustrating that these SNPs may influence changes over large genetic regions [17]. This also highlights the difficulty of determining the exact mechanisms by which these SNPs alter components, such as RBC, given their large theoretical range of influence.

We present here two comprehensive PheWAS analyses of RBC components. The first examines whether additional phenotypic associations exist between SNPs associated to an RBC component in our cohort. The second groups extensive EHR phenotypic data from the Healthy Nevada Project clinical database into 1,488 different phenotype groups and examines the association (predictive value) between their incidence rate with continuous RBC component values. This second analysis resulted in a number of hematologic phenotypes that associated with RBC component levels (Table 3). To the best of our knowledge, this is the first PheWAS targeted at RBC components. Not surprisingly, many of our strongest associations were with hematopoietic phenotypes, indicating that the incidence of having one (or more) abnormal hematopoietic characteristics is a potential predictor of RBC component levels. Interestingly, the incidence of having vitamin D deficiency may be linked to MPV levels and requires further study, as incident solar radiation in the Northern Nevadan location of the study is high. Also of interest is that MCV and PC levels could be associated to the occurrence of (morbid) obesity, alcoholism and cirrhosis which are linked to poor vitamin D synthesis [65].

The identified associations between the RBC indices and hematopoietic findings and pathologies are mostly expected due to their known physiologic association and reconfirm previously reported findings. Iron deficiency anemia is often microcytic and characterized by reduced MCV [66]. Iron deficiency also affects megakaryocytes and may induce changes in megakaryocyte differentiation as well as increased platelet counts and volume [67]. As noted earlier, one of the strongest associations reported here is in the vicinity of JAK2, a known regulator of megakaryocyte maturation [68].

While thrombocytopenias are clearly synonymous with reduced PC, associated platelet volume and size changes can be used to differentiate between inherited macrothrombocytopenias and idiopathic thrombocytopenic purpura (ITP) [69], thus establishing an association with MPV that may be positive or inverse. While this study demonstrated a strong negative association between PC and purpura, and a positive association with MPV, it is important to note that not all purpuras are necessarily caused by platelet deficiency. However, phenotypic groupings were not specific enough to identify associations with respect to specific etiologies (See S3 Table).

Vitamin D, independently, and in association with platelet activity and increased platelet indices, has been associated with cardiovascular disease [70]. The positive association between vitamin D deficiency and MPV levels is intriguing and follows other findings. Cumhur et al. [71] observed an inverse correlation between vitamin D levels and MPV and hypothesized that this may be due to increased release of proinflammatory cytokines present with vitamin D deficiency. Park et al. also reported an inverse association between PC and MPV and vitamin D levels in adults [72].

Platelet activation, as evidenced by platelet indices, is a recognized phenomenon in metabolic syndrome [73,74]. This study resulted in a positive association between PC and morbid obesity, and a negative association between MCV and obesity and morbid obesity. While previous evidence [75] does not necessarily support all-gender association between obesity and increased platelet counts, our finding may reflect an association between the central obesity of metabolic syndrome and the associated platelet activation of metabolic syndrome. However, the phenotype groups were not specific enough to allow for specific differentiation between obesity types (See S3 Table).

Thrombocytopenia is often observed in chronic liver disease and cirrhosis and platelet activation may play a role in liver regeneration [76,77]. Alcoholism is also associated with thrombocytopenia [78]. However, evidence of an association between liver disease or alcoholism and platelet activation indices is lacking. Moreover, evidence points to platelet function defects in chronic alcoholism [79]. Thus, the negative effect of PC on cirrhosis and positive effect of MCV on cirrhosis, alcoholism, and alcohol-related disorders found in this study is intriguing and merits further confirmation and research.

Materials and methods

The Renown EHR database

The Renown Health EHR system was instated in 2007 on the EPIC system (EPIC System Corporation, Verona, Wisconsin, USA), and currently contains lab results, diagnosis codes (ICD9 and ICD10) and demographics of more than 1 million patients seen in the hospital system since 2005.

Sample collection

Saliva as a source of DNA was collected from 10,000 adults in Northern Nevada as the first phase of the Healthy Nevada Project to contribute to comprehensive population health studies in Nevada. The personal genetics company 23andMe was used to genotype these individuals using the Oragene DX OGD-500.001 saliva kit [DNA Genotek, Ontario, Canada]. Genotypes are based on the Illumina Human OmniExpress-24 BeadChip platform [San Diego, CA, USA] including approximately 570,000 SNPs.

IRB and ethics statement

The study was reviewed and approved by the University of Nevada, Reno Institutional Review Board (IRB, project 956068–12). Participants in the Healthy Nevada Project provide written informed consent to having genetic information associated with electronic health information in a de-identified manner. Participants were eighteen years of age or older. Neither researchers nor participants have access to the complete EHR data and cannot map participants to patient identifiers. These data are not incorporated into the EHR; rather, EHR and genetic data are linked in a separate environment via a unique identifier as approved by the IRB.

Processing of EHR data

Most cohort participants had multiple RBC recordings across thirteen years; in these cases, the mean age of each participant across those records was computed and later used as a covariate for each component in GWAS and PheWAS analyses. Many of the participants had lab results (for the same RBC component) recorded across different tests with different healthy reference ranges. For example, the 4,627 participants had measurements for MCV with respect to one or more of ten different MCV lab tests and corresponding healthy reference ranges. Many participants had records across several of these ten different tests. Only those tests/reference ranges having records for more than one individual were used in analyses. To standardize the RBC values across different normal reference ranges, a simple linear transform was computed using each test’s reference range and the most recent test’s range. All component measures within each separate test were then transformed into ranges of the most recent via each range’s specific linear transform. The most recent healthy normal reference range for each component is listed in Table 1. Distributions of raw and transformed laboratory test values can be found in S5A–S5C Fig.
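The standardization step described in this subsection can be sketched as follows. This is a minimal illustration, assuming the linear transform maps the lower and upper limits of an older test's healthy reference range onto the limits of the most recent test's range; the function name and the example ranges are hypothetical and are not taken from the study data.

```python
def rescale_to_reference(value, old_range, new_range):
    """Linearly map a lab value from one reference range onto another.

    Assumption: the endpoints of old_range map to the endpoints of new_range.
    """
    old_lo, old_hi = old_range
    new_lo, new_hi = new_range
    if old_hi == old_lo:
        raise ValueError("old reference range has zero width")
    slope = (new_hi - new_lo) / (old_hi - old_lo)
    return new_lo + slope * (value - old_lo)


# Hypothetical MCV reference ranges in fL (illustrative only).
older_test_range = (80.0, 100.0)
latest_test_range = (79.0, 97.0)

# A value recorded under the older test, expressed on the latest test's scale.
standardized = rescale_to_reference(90.0, older_test_range, latest_test_range)
```

Under this assumption a value at the midpoint of the older range lands at the midpoint of the most recent range, so a measurement keeps its position relative to the healthy limits.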

Genotyping and quality control

Genotyping was performed by 23andMe using the Illumina Infinium DNA Human OmniExpress-24 BeadChip V4. This genotyping platform (Illumina, San Diego, CA) consists of approximately 570,000 SNPs. DNA extraction and genotyping were performed on saliva samples by the National Genetics Institute (NGI), a CLIA licensed clinical laboratory and a subsidiary of the Laboratory Corporation of America.

Raw genotype data were processed through a standard quality control process [46,47,80–82]. SNPs with a minor allele frequency (MAF) less than 0.01 were removed. SNPs that were out of Hardy-Weinberg equilibrium (HWE; p-value < 1x10^-6) were also excluded. Any SNP with call rate less than 95% was removed; any individual with a call rate less than 95% was also excluded from further study. Two pairs of participants were excluded due to high IBS (Identical by State) in all three cohorts. Additionally, twelve people were excluded due to high autosomal heterozygosity (FDR < 1%). A number of patients (27) were excluded due to diagnoses related to significant blood loss that could possibly lead to anemia, although this would likely not be related to genetics.
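Two of the SNP-level filters in this paragraph, the MAF and call-rate thresholds, can be illustrated with a short sketch. It assumes genotypes are coded as counts of one allele (0, 1 or 2) with None for a missing call; the function names and example data are hypothetical. The HWE test and the sample-level exclusions are not reproduced here, and the study itself ran its quality control through its own pipeline rather than this code.

```python
def snp_call_rate(genotypes):
    """Fraction of samples with a non-missing genotype call."""
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from allele counts (0/1/2), ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    # The counted allele may be the major one, so take the smaller frequency.
    return min(freq, 1.0 - freq)


def passes_qc(genotypes, min_maf=0.01, min_call_rate=0.95):
    """Apply the MAF >= 0.01 and call rate >= 95% thresholds from the text."""
    return (snp_call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical SNPs (illustrative only).
common_snp = [0] * 90 + [1] * 8 + [2] * 2          # MAF 0.06, fully called
rare_snp = [0] * 199 + [1]                          # MAF 0.0025
poorly_called_snp = [0] * 60 + [1] * 30 + [None] * 10  # call rate 0.90

kept = [passes_qc(s) for s in (common_snp, rare_snp, poorly_called_snp)]
```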

For further data quality control, using the raw genotype data, a principal component analysis (PCA) was performed to identify and account for population-specific variations in allelic distributions of the SNPs. Genotype data were pruned to exclude SNPs with high linkage disequilibrium using PLINK and standard pruning parameters of 50 SNPs per sliding window, a step size of five SNPs, and r² = 0.5 [80]. Regression models were adjusted by the first four principal components, decreasing the genomic inflation factor of all RBC components to λ ≤ 1.04, well within standard ranges [17,27,83].
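The text does not state how λ was computed. A common definition, assumed here, is the median of the observed 1-degree-of-freedom chi-square association statistics divided by the expected median under the null (about 0.455). The sketch below recovers the statistics from two-sided p-values using only the Python standard library; the function name is hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Genomic inflation factor from two-sided association p-values.

    Each p-value (0 < p <= 1) is converted to a 1-df chi-square statistic
    as the squared standard-normal quantile at p/2.
    """
    std_normal = NormalDist()
    chi2_stats = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    # Median of a 1-df chi-square distribution, approximately 0.4549.
    expected_median = std_normal.inv_cdf(0.25) ** 2
    return median(chi2_stats) / expected_median


# Evenly spaced p-values mimic a null distribution, so lambda is close to 1.
null_like = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(null_like), 3))
```

Values of λ well above 1 suggest residual population stratification or other systematic inflation, which is why the covariate adjustment is checked against this statistic.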

GWAS

Using PLINK v1.9 [84], we performed a simple linear regression analysis with an assumed additive model (number of copies of the minor allele) including age, gender and the first four principal components as covariates to correct for any bias generated by these variables. Standardized values of all three components followed approximate normal distributions (S5A–S5C Fig (row 2)). Total phenotypic variance explained by the SNPs was calculated by first producing a genetic relationship matrix of all SNPs on autosomal chromosomes in PLINK. Subsequently, a restricted maximum likelihood analysis was conducted using GCTA on the relationship matrix to estimate the variance explained by the SNPs.

A simple one-way ANOVA was performed on the mean RBC component values across the three genotypes. The raw p-values associated with the F-test statistic are included in S2 Table. QUANTO [85] was used to calculate power in our study. While our study was understandably underpowered (power < 80%) to detect small effect sizes with very rare variants (MAF between 0.01 and 0.03), the MPV cohort had greater than 80% power to detect effect sizes of 0.25 or greater with MAF of 0.02; the MCV cohort was able to detect effect sizes of 0.8 with MAF greater than 0.03, and the PC cohort was well-powered to detect large effect sizes of 11 or greater with MAF as low as 0.01. For MAFs greater than 0.05, we found that the MPV cohort was able to detect effect sizes of 0.60 with MAF of 0.05 at 80% power, and effect sizes of 0.70 at 90% power. The MPV cohort was large enough to detect effect sizes as small as 0.15 with MAFs at 0.05 with 80% power. The PC cohort was well-powered to detect effect sizes of 8.2 at 80% power with MAFs above 0.05. The power of specific combinations of MAF, sample sizes, and effect sizes (n = 4673) can be seen in S2 Table.
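For readers unfamiliar with the one-way ANOVA mentioned above, the following sketch computes the F-test statistic for values grouped by genotype. The function name and the toy data are hypothetical; the p-value, which requires the F distribution, is left to a statistics package.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of groups of values."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for v in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy values for three genotype groups, for illustration only.
by_genotype = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(one_way_anova_f(by_genotype))
```

A large F value indicates that the mean component value differs between genotype groups by more than the within-group spread would suggest.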

PheWAS

The R package PheWAS [86] was used to perform two independent PheWAS analyses. The first examined associations between statistically significant SNPs identified in an RBC GWAS and EHR phenotypes based on ICD9 codes. The second identified associations between RBC levels in our cohort and ICD9-based diagnoses only. ICD9 and ICD10 codes for each individual in the cohort recorded in the Renown EHR were aggregated via a mapping from the Centers for Medicare & Medicaid Services (https://www.cms.gov/Medicare/Coding/ICD10/2018-ICD-10-CM-and-GEMs.html). A total of 34,555 individual diagnoses mapped to 6,632 documented ICD9 codes. ICD9 codes were aggregated and converted into 1,814 individual phenotype groups (“phecodes”) using the PheWAS package as described in Carroll and Denny [86,87]. Of these, only the phecodes that included at least 20 cases were used for downstream analyses, following Carroll’s protocol [86]: there were 1,488 phecodes with more than 20 cases in each PheWAS. Age, gender, and ethnicity were included in all PheWAS models. The first PheWAS detected associations between statistically significant SNPs (p < 5×10⁻⁸) identified in each of the three GWASs above and case/control status of EHR phenotypes represented by ICD9 codes. Specifically, a logistic regression between the incidence (number of cases) of each phenotype group (phecode) and the additive genotypes of each statistically significant SNP was performed, using age and gender as covariates. Possible associations of 1,488 phecodes with each previously detected SNP were assessed. The level of statistical significance was computed as a Bonferroni correction for all possible associations per component: p = 0.05/(Np × Ns), where Np is the number of phecodes tested and Ns is the number of SNPs examined in the specific blood component. This significance level is represented by a red line in S3A–S3C Fig.
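The aggregation of diagnosis codes into phecodes and the minimum-case filter can be sketched as below. This is not the PheWAS package's implementation: the mapping, the code labels, and the function name are placeholders, and a case is assumed to be any individual with at least one code mapping to the phecode.

```python
from collections import defaultdict


def phecodes_with_enough_cases(patient_codes, icd9_to_phecode, min_cases=20):
    """Count distinct individuals per phecode and keep well-populated groups.

    patient_codes: dict of patient id -> iterable of ICD9 codes.
    icd9_to_phecode: dict of ICD9 code -> phecode (placeholder mapping here).
    """
    cases = defaultdict(set)
    for patient_id, codes in patient_codes.items():
        for code in codes:
            phecode = icd9_to_phecode.get(code)
            if phecode is not None:
                # A set ensures each individual is counted once per phecode.
                cases[phecode].add(patient_id)
    return {phecode: len(ids) for phecode, ids in cases.items()
            if len(ids) >= min_cases}


# Placeholder codes and mapping, for illustration only.
toy_map = {"icd_a1": "phe_A", "icd_a2": "phe_A", "icd_b1": "phe_B"}
toy_patients = {f"p{i}": ["icd_a1", "icd_a2"] for i in range(25)}
toy_patients.update({f"q{i}": ["icd_b1"] for i in range(5)})
print(phecodes_with_enough_cases(toy_patients, toy_map))
```

In the toy data, the group with 25 individuals is retained and the group with 5 individuals is dropped, mirroring the restriction to phecodes with at least 20 cases.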

A second PheWAS, as outlined in Carroll et al. (2014) [86], was performed to examine associations between each of the three quantitative RBC components and the phecode categories. Specifically, a linear regression between the RBC measure and the case/control status of a phecode was performed (with age and gender as covariates) for each of 1,488 phecodes. A Bonferroni correction of 0.05/Np = 3.4×10⁻⁵ (with Np = 1,488) was used as the level of statistical significance. Phecodes with association levels p < 3.4×10⁻⁵ are highlighted in S4A–S4C Fig.
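The Bonferroni thresholds for both PheWAS analyses follow from simple arithmetic, sketched below. The per-component SNP counts (21, 14, and 3) are an assumption: they are not stated in this section but are back-calculated from the thresholds given in the S3 Fig caption, and they sum to the 38 SNPs listed in S2 Table. The function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_phecodes, n_snps=1):
    """Bonferroni-corrected significance level for n_phecodes * n_snps tests."""
    return alpha / (n_phecodes * n_snps)


N_PHECODES = 1488
# Assumed SNP counts per component, inferred from the S3 Fig thresholds.
SNPS_PER_COMPONENT = {"MPV": 21, "MCV": 14, "PC": 3}

# First PheWAS: one test per (phecode, significant SNP) pair.
for component, n_snps in SNPS_PER_COMPONENT.items():
    threshold = bonferroni_threshold(0.05, N_PHECODES, n_snps)
    print(component, f"{threshold:.2e}")

# Second PheWAS: one test per phecode for each RBC component.
print("RBC component PheWAS", f"{bonferroni_threshold(0.05, N_PHECODES):.1e}")
```

With these assumed counts the sketch reproduces the reported levels of 1.60×10⁻⁶, 2.40×10⁻⁶, and 1.12×10⁻⁵ for the first PheWAS and 3.4×10⁻⁵ for the second.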

Data availability statement

EHR data

EHR data for the Healthy Nevada cohort are subject to HIPAA and other privacy and compliance restrictions. Mean standardized RBC component values for each individual are available in S1 Table.

GWAS results

To reduce the possibility of a privacy breach, 23andMe requires that the statistics for only 10,000 SNPs be made publicly available. This is the amount of data considered by 23andMe to be insufficient to enable a re-identification attack. The statistical summary results of the top 10,000 SNPs for the 23andMe data are available here: www.dri.edu/HealthyNVProjectGenetics. All column definitions are listed in Table 5.

PheWAS results

Summarized counts of each ICD9 classification and phenotype group (phecode) are presented in S3 Table.

Researchers interested in obtaining underlying de-identified datasets specifically related to this study should contact our Data Availability Team at Craig.Kugler@dri.edu for specific procedures to gain access to these data.

Supporting information

S1 Table. Mean standardized RBC component values.

This table includes mean standardized RBC component values for each individual along with age and gender. Due to the length of this table it can be found online at www.dri.edu/HealthyNVProjectGenetics.

https://doi.org/10.1371/journal.pone.0218078.s001

(PDF)

S2 Table. General SNP table for MPV, MCV and PC.

This table lists the 38 statistically significant SNPs associated with MPV, MCV and PC in our cohort. General information about each SNP, such as chromosome location, GWAS p-value, power, genotype, cytoband, ANOVA, and references of associations identified in previous studies, is listed.

https://doi.org/10.1371/journal.pone.0218078.s002

(PDF)

S3 Table. Counts of each phecode group.

This table presents the mapping between the ICD9 codes and phecodes tested in our study, as presented in Carroll et al. and the R package PheWAS [86], and the number of cases from the RBC cohort in each phecode group.

https://doi.org/10.1371/journal.pone.0218078.s003

(PDF)

S1 Fig.

(A, B, C): GWAS results for RBC components MPV, MCV and PC. Genome-wide association study results for the three RBC components. The x-axis represents the genomic position of 498,916 SNPs. The y-axis represents −log₁₀-transformed raw p-values of each genotypic association. The red horizontal line indicates the threshold of significance p = 5×10⁻⁸.

https://doi.org/10.1371/journal.pone.0218078.s004

(TIFF)

S2 Fig. ANOVA results of SNP rs7961894.

This figure shows the box and whisker diagram for standardized values of MPV of all members in the cohort based on genotype. Mean and standard deviation values for each genotype are CC: 10.54 ± 0.97; CT: 10.74 ± 1.0; TT: 11.21 ± 0.87. The p-value for this ANOVA analysis is p = 8.7×10⁻¹².

https://doi.org/10.1371/journal.pone.0218078.s005

(TIFF)

S3 Fig.

(A, B, C): PheWAS results between RBC component-significant SNPs and phecodes. These three figures show the results of individual logistic regressions between incidence of phenotype groups (phecodes) and SNP genotypes, based on the additive model. Models include age, gender and ethnicity as covariates. Each point represents the p-value of one SNP and one of 1,488 phecodes with at least 20 cases assigned to it. The horizontal red line in each represents the significance level p = 1.60×10⁻⁶ for MPV, p = 2.40×10⁻⁶ for MCV, and p = 1.12×10⁻⁵ for PC.

https://doi.org/10.1371/journal.pone.0218078.s006

(TIFF)

S4 Fig.

(A, B, C): PheWAS results between RBC component and phecodes. These three figures show the results of individual linear regressions between incidence of phenotype groups (phecodes) and continuous RBC component measures. Models include age, gender and ethnicity as covariates. Each point represents the p-value of the association between one of 1,488 phecodes with at least 20 cases assigned to it, and the RBC component measure. The horizontal red line in each represents the significance level p = 1.60×10⁻⁶ for MPV, p = 2.40×10⁻⁶ for MCV, and p = 1.12×10⁻⁵ for PC.

https://doi.org/10.1371/journal.pone.0218078.s007

(TIFF)

S5 Fig.

(A, B, C): Raw and standardized RBC component lab measures. Distributions of raw RBC component values are presented in the first row; distributions of component values upon standardization to the most recent lab test are shown in the second row; the QQ-plot of the standardized values is pictured in the third row.

https://doi.org/10.1371/journal.pone.0218078.s008

(TIFF)

Acknowledgments

We thank Michele Henderson, Toni Curreri and all the ambassadors of the Healthy Nevada Project. We also thank Iva Neveux for her helpful discussions with phenotypic data. We thank Renown Health and DRI marketing and all the folks at 23andMe who helped launch the project.

References

  1. Letcher RL, Chien S, Pickering TG, Laragh JH. Elevated blood viscosity in patients with borderline essential hypertension. Hypertension. 1983;5: 757–762. pmid:6352482
  2. Sharp DS, Curb JD, Schatz IJ, Meiselman HJ, Fisher TC, Burchfiel CM, et al. Mean red cell volume as a correlate of blood pressure. Circulation. 1996;93: 1677–1684. pmid:8653873
  3. Sarnak MJ, Tighiouart H, Manjunath G, MacLeod B, Griffith J, Salem D, et al. Anemia as a risk factor for cardiovascular disease in the atherosclerosis risk in communities (aric) study. J Am Coll Cardiol. 2002;40: 27–33. pmid:12103252
  4. Simone G de, Devereux RB, Chinali M, Best LG, Lee ET, Welty TK. Association of Blood Pressure With Blood Viscosity in American Indians The Strong Heart Study. Hypertension. 2005;45: 625–630. pmid:15699438
  5. Chen Z, Tang H, Qayyum R, Schick UM, Nalls MA, Handsaker R, et al. Genome-wide association analysis of red blood cell traits in African Americans: the COGENT Network. Hum Mol Genet. 2013;22: 2529–2538. pmid:23446634
  6. Honn KV, Tang DG, Crissman JD. Platelets and cancer metastasis: a causal relationship? Cancer Metastasis Rev. 1992;11: 325–351. pmid:1423821
  7. Zoppo GJD. The role of platelets in ischemic stroke. Neurology. 1998;51: S9–S14. pmid:9744824
  8. Pain A, Ferguson DJP, Kai O, Urban BC, Lowe B, Marsh K, et al. Platelet-mediated clumping of Plasmodium falciparum-infected erythrocytes is a common adhesive phenotype and is associated with severe malaria. Proc Natl Acad Sci U S A. 2001;98: 1805–1810. pmid:11172032
  9. Willoughby S, Holmes A, Loscalzo J. Platelets and cardiovascular disease. Eur J Cardiovasc Nurs. 3rd ed. 2002;1: 273–288. pmid:14622657
  10. McBane RD, Karnicki K, Miller RS, Owen WG. The impact of peripheral arterial disease on circulating platelets. Thromb Res. 2004;113: 137–145. pmid:15115669
  11. Weber C. Platelets and chemokines in atherosclerosis: partners in crime. Circ Res. 2005;96: 612–616. pmid:15802619
  12. Jain S, Harris J, Ware J. Platelets: linking hemostasis and cancer. Arterioscler Thromb Vasc Biol. 2010;30: 2362–2367. pmid:21071699
  13. Soranzo N, Rendon A, Gieger C, Jones CI, Watkins NA, Menzel S, et al. A novel variant on chromosome 7q22.3 associated with mean platelet volume, counts, and function. Blood. 2009;113: 3831–3837. pmid:19221038
  14. Alper AT, Sevimli S, Hasdemir H, Nurkalem Z, Güvenç TS, Akyol A, et al. Effects of high altitude and sea level on mean platelet volume and platelet count in patients with acute coronary syndrome. J Thromb Thrombolysis. 3rd ed. Springer US; 2009;27: 130–134. pmid:17978877
  15. Hudson JG, Bowen AL, Navia P, Rios-Dalenz J, Pollard AJ, Williams D, et al. The effect of high altitude on platelet counts, thrombopoietin and erythropoietin levels in young Bolivian airmen visiting the Andes. Int J Biometeorol. Springer-Verlag; 1999;43: 85–90. pmid:10552312
  16. Shrivastava A, Goyal A, (null) KN. Effect of high altitude on haematological parameters. Indian J Prev Soc Med. 2010;41: 2.
  17. Soranzo N, Spector TD, Mangino M, Kühnel B, Rendon A, Teumer A, et al. A genome-wide meta-analysis identifies 22 loci associated with eight hematological parameters in the HaemGen consortium. Nat Genet. 2009;41: 1182–1190. pmid:19820697
  18. Shameer K, Denny JC, Ding K, Jouni H, Crosslin DR, Andrade M de, et al. A genome- and phenome-wide association study to identify genetic variants influencing platelet count and volume and their pleiotropic effects. Hum Genet. 2014;133: 95–109. pmid:24026423
  19. Lin J-P, O'Donnell CJ, Jin L, Fox C, Yang Q, Cupples LA. Evidence for linkage of red blood cell size and count: genome-wide scans in the Framingham Heart Study. Am J Hematol. 2007;82: 605–610. pmid:17211848
  20. Thein SL, Menzel S, Peng X, Best S, Jiang J, Close J, et al. Intergenic variants of HBS1L-MYB are responsible for a major quantitative trait locus on chromosome 6q23 influencing fetal hemoglobin levels in adults. Proc Natl Acad Sci U S A. 2007;104: 11346–11351. pmid:17592125
  21. Lettre G, Sankaran VG, Bezerra MAC, Araújo AS, Uda M, Sanna S, et al. DNA polymorphisms at the BCL11A, HBS1L-MYB, and beta-globin loci associate with fetal hemoglobin levels and pain crises in sickle cell disease. Proc Natl Acad Sci U S A. 2008;105: 11869–11874. pmid:18667698
  22. Ferreira MAR, Hottenga J-J, Warrington NM, Medland SE, Willemsen G, Lawrence RW, et al. Sequence variants in three loci influence monocyte counts and erythrocyte volume. Am J Hum Genet. 2009;85: 745–749. pmid:19853236
  23. Ganesh SK, Zakai NA, van Rooij FJA, Soranzo N, Smith AV, Nalls MA, et al. Multiple loci influence erythrocyte phenotypes in the CHARGE Consortium. Nat Genet. 2009;41: 1191–1198. pmid:19862010
  24. Meisinger C, Prokisch H, Gieger C, Soranzo N, Mehta D, Rosskopf D, et al. A Genome-wide Association Study Identifies Three Loci Associated with Mean Platelet Volume. Am J Hum Genet. 2009;84: 66–71. pmid:19110211
  25. Daly ME. Determinants of platelet count in humans. Haematologica. 2010;96: 10–13. pmid:21193429
  26. Kamatani Y, Matsuda K, Okada Y, Kubo M, Hosono N, Daigo Y, et al. Genome-wide association study of hematological and biochemical traits in a Japanese population. Nat Genet. Nature Publishing Group; 2010;42: 210–215. pmid:20139978
  27. Kullo IJ, Ding K, Jouni H, Smith CY, Chute CG. A Genome-Wide Association Study of Red Blood Cell Traits Using the Electronic Medical Record. PLoS ONE. 2010;5: e13011. pmid:20927387
  28. Gieger C, Radhakrishnan A, Cvejic A, Tang W, Porcu E, Pistis G, et al. New gene functions in megakaryopoiesis and platelet formation. Nature. 2011;480: 201–208. pmid:22139419
  29. Okada Y, Hirota T, Kamatani Y, Takahashi A, Ohmiya H, Kumasaka N, et al. Identification of nine novel loci associated with white blood cell subtypes in a Japanese population. PLoS Genet. 2011;7: e1002067. pmid:21738478
  30. Paul DS, Nisbet JP, Yang T-P, Meacham S, Rendon A, Hautaviita K, et al. Maps of Open Chromatin Guide the Functional Follow-Up of Genome-Wide Association Signals: Application to Hematological Traits. PLoS Genet. 2011;7: e1002139. pmid:21738486
  31. An P, Wu Q, Wang H, Guan Y, Mu M, Liao Y, et al. TMPRSS6, but not TF, TFR2 or BMP2 variants are associated with increased risk of iron-deficiency anemia. Hum Mol Genet. 2012;21: 2124–2131. pmid:22323359
  32. Ding K, Shameer K, Jouni H, Masys DR, Jarvik GP, Kho AN, et al. Genetic Loci implicated in erythroid differentiation and cell cycle regulation are associated with red blood cell traits. Mayo Clin Proc. 2012;87: 461–474. pmid:22560525
  33. Li J, Glessner JT, Zhang H, Hou C, Wei Z, Bradfield JP, et al. GWAS of blood cell traits identifies novel associated loci and epistatic interactions in Caucasian and African-American children. Hum Mol Genet. 2013;22: 1457–1464. pmid:23263863
  34. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, et al. Systematic Localization of Common Disease-Associated Variation in Regulatory DNA. Science. 2012;337: 1190–1195. pmid:22955828
  35. Qayyum R, Snively BM, Ziv E, Nalls MA, Liu Y, Tang W, et al. A meta-analysis and genome-wide association study of platelet count and mean platelet volume in african americans. PLoS Genet. 2012;8: e1002491. pmid:22423221
  36. van der Harst P, Zhang W, Mateo Leach I, Rendon A, Verweij N, Sehmi J, et al. Seventy-five genetic loci influencing the human red blood cell. Nature. 2012;492: 369–375. pmid:23222517
  37. Cardoso GL, Diniz IG, Silva ANLMD, Cunha DA, Silva Junior JSD, Uchôa CTC, et al. DNA polymorphisms at BCL11A, HBS1L-MYB and Xmn1-HBG2 site loci associated with fetal hemoglobin levels in sickle cell anemia patients from Northern Brazil. Blood Cells Mol Dis. 2014;53: 176–179. pmid:25084696
  38. Pei S-N, Ma M-C, You H-L, Fu H-C, Kuo C-Y, Rau K-M, et al. TMPRSS6 rs855791 Polymorphism Influences the Susceptibility to Iron Deficiency Anemia in Women at Reproductive Age. Int J Med Sci. 2014;11: 614–619. pmid:24782651
  39. Grote Beverborg N, Verweij N, Klip IT, van der Wal HH, Voors AA, van Veldhuisen DJ, et al. Erythropoietin in the general population: reference ranges and clinical, biochemical and genetic correlates. PLoS ONE. 2015;10: e0125215. pmid:25915923
  40. Mtatiro SN, Mgaya J, Singh T, Mariki H, Rooks H, Soka D, et al. Genetic association of fetal-hemoglobin levels in individuals with sickle cell disease in Tanzania maps to conserved regulatory elements within the MYB core enhancer. BMC Med Genet. 2nd ed. 2015;16: 4. pmid:25928412
  41. Tapper W, Jones AV, Kralovics R, Harutyunyan AS, Zoi K, Leung W, et al. Genetic variation at MECOM, TERT, JAK2 and HBS1L-MYB predisposes to myeloproliferative neoplasms. Nat Commun. Nature Publishing Group; 2015;6: 1–11. pmid:25849990
  42. Lai Y, Chen Y, Chen B, Zheng H, Yi S, Li G, et al. Genetic Variants at BCL11A and HBS1L-MYB loci Influence Hb F Levels in Chinese Zhuang β-Thalassemia Intermedia Patients. Hemoglobin. 2016;40: 405–410. pmid:28361591
  43. Maharry SE, Walker CJ, Liyanarachchi S, Mehta S, Patel M, Bainazar MA, et al. Dissection of the Major Hematopoietic Quantitative Trait Locus in Chromosome 6q23.3 Identifies miR-3662 as a Player in Hematopoiesis and Acute Myeloid Leukemia. Cancer Discovery. 2016;6: 1036–1051. pmid:27354268
  44. Mikobi TM, Tshilobo Lukusa P, Aloni MN, Lumaka AZ, Kaba DK, Devriendt K, et al. Protective BCL11A and HBS1L-MYB polymorphisms in a cohort of 102 Congolese patients suffering from sickle cell anemia. J Clin Lab Anal. 2018;32. pmid:28332727
  45. Seiki T, Naito M, Hishida A, Takagi S, Matsunaga T, Sasakabe T, et al. Association of genetic polymorphisms with erythrocyte traits: Verification of SNPs reported in a previous GWAS in a Japanese population. Gene. 2018;642: 172–177. pmid:29133146
  46. Verma A, Basile AO, Bradford Y, Kuivaniemi H, Tromp G, Carey D, et al. Phenome-Wide Association Study to Explore Relationships between Immune System Related Genetic Loci and Complex Traits and Diseases. PLoS ONE. 2016;11: e0160573. pmid:27508393
  47. Verma A, Lucas A, Verma SS, Zhang Y, Josyula N, Khan A, et al. PheWAS and Beyond: The Landscape of Associations with Medical Diagnoses and Clinical Measures across 38,662 Individuals from Geisinger. Am J Hum Genet. 2018;102: 592–608. pmid:29606303
  48. Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88: 76–82. pmid:21167468
  49. Yuan X, Waterworth D, Perry JRB, Lim N, Song K, Chambers JC, et al. Population-Based Genome-wide Association Studies Reveal Six Loci Influencing Plasma Levels of Liver Enzymes. Am J Hum Genet. 2008;83: 520–528. pmid:18940312
  50. Panova-Noeva M, Schulz A, Hermanns MI, Grossmann V, Pefani E, Spronk HMH, et al. Sex-specific differences in genetic and nongenetic determinants of mean platelet volume: results from the Gutenberg Health Study. Blood. 2016;127: 251–259. pmid:26518434
  51. Chasman DI, Paré G, Mora S, Hopewell JC, Peloso G, Clarke R, et al. Forty-three loci associated with plasma lipoprotein size, concentration, and cholesterol content in genome-wide analysis. PLoS Genet. 2009;5: e1000730. pmid:19936222
  52. Chambers JC, Zhang W, Sehmi J, Li X, Wass MN, van der Harst P, et al. Genome-wide association study identifies loci influencing concentrations of liver enzymes in plasma. Nat Genet. 2011;43: 1131–1138. pmid:22001757
  53. Jin G, Sun J, Kim S-T, Feng J, Wang Z, Tao S, et al. Genome-wide association study identifies a new locus JMJD1C at 10q21 that may influence serum androgen levels in men. Hum Mol Genet. 2012;21: 5222–5228. pmid:22936694
  54. Coviello AD, Haring R, Wellons M, Vaidya D, Lehtimäki T, Keildson S, et al. A genome-wide association meta-analysis of circulating sex hormone-binding globulin reveals multiple Loci implicated in sex steroid hormone regulation. PLoS Genet. 2012;8: e1002805. pmid:22829776
  55. Grigorova M, Punab M, Poolamets O, Adler M, Vihljajev V, Laan M. Genetics of Sex Hormone-Binding Globulin and Testosterone Levels in Fertile and Infertile Men of Reproductive Age. J Endocr Soc. 2017;1: 560–576. pmid:29264510
  56. Tajuddin SM, Schick UM, Eicher JD, Chami N, Giri A, Brody JA, et al. Large-Scale Exome-wide Association Analysis Identifies Loci for White Blood Cell Traits and Pleiotropy with Immune-Mediated Diseases. Am J Hum Genet. 2016;99: 22–39. pmid:27346689
  57. Smyth DJ, Plagnol V, Walker NM, Cooper JD, Downes K, Yang JHM, et al. Shared and distinct genetic variants in type 1 diabetes and celiac disease. N Engl J Med. 2008;359: 2767–2777. pmid:19073967
  58. de Boer YS, van Gerven NMF, Zwiers A, Verwer BJ, van Hoek B, van Erpecum KJ, et al. Genome-wide association study identifies variants associated with autoimmune hepatitis type 1. Gastroenterology. 2014;147: 443–52.e5. pmid:24768677
  59. Giusti B, Marcucci R, Saracini C, Gori AM, Valenti R, Parodi G, et al. Mean platelet volume and platelet count in acute coronary syndrome patients: role of a genetic variants on chr7q22.3 and chr3p13-p21. Eur Heart J. 2013;34: P4879–P4879.
  60. Johnson AD. The genetics of common variation affecting platelet development, function and pharmaceutical targeting. J Thromb Haemost. 2011;9 Suppl 1: 246–257. pmid:21781261
  61. Zou S, Teixeira AM, Kostadima M, Astle WJ, Radhakrishnan A, Simon LM, et al. SNP in human ARHGEF3 promoter is associated with DNase hypersensitivity, transcript level and platelet function, and Arhgef3 KO mice have increased mean platelet volume. PLoS ONE. 2017;12: e0178095. pmid:28542600
  62. Goggs R, Williams CM, Mellor H, Poole AW. Platelet Rho GTPases-a focus on novel players, roles and relationships. Biochem J. 2015;466: 431–442. pmid:25748676
  63. Kojima H, Kanada H, Shimizu S, Kasama E, Shibuya K, Nakauchi H, et al. CD226 Mediates Platelet and Megakaryocytic Cell Adhesion to Vascular Endothelial Cells. J Biol Chem. 2003;278: 36748–36753. pmid:12847109
  64. Crowder RJ, Enomoto H, Yang M, Johnson EM, Milbrandt J. Dok-6, a Novel p62 Dok Family Member, Promotes Ret-mediated Neurite Outgrowth. J Biol Chem. 2004;279: 42072–42081. pmid:15286081
  65. Konstantakis C, Tselekouni P, Kalafateli M, Triantos C. Vitamin D deficiency in patients with liver cirrhosis. Ann Gastroenterol. 2016;29: 297–306. pmid:27366029
  66. Massey AC. Microcytic anemia. Differential diagnosis and management of iron deficiency anemia. Med Clin North Am. 1992;76: 549–566. pmid:1578956
  67. Evstatiev R, Bukaty A, Jimenez K, Kulnigg Dabsch S, Surman L, Schmid W, et al. Iron deficiency alters megakaryopoiesis and platelet phenotype independent of thrombopoietin. Am J Hematol. Wiley Online Library; 2014;89: 524–529. pmid:24464533
  68. Besancenot R, Roos-Weil D, Tonetti C, Abdelouahab H, Lacout C, Pasquier F, et al. JAK2 and MPL protein levels determine TPO-induced megakaryocyte proliferation vs differentiation. Blood. 2014;124: 2104–2115. pmid:25143485
  69. Noris P, Klersy C, Gresele P, Giona F, Giordano P, Minuz P, et al. Platelet size for distinguishing between inherited thrombocytopenias and immune thrombocytopenia: a multicentric, real life study. Br J Haematol. 2013;162: 112–119. pmid:23617394
  70. Mozos I, Marginean O. Links between Vitamin D Deficiency and Cardiovascular Diseases. Biomed Res Int. Hindawi; 2015;2015: 109275–12. pmid:26000280
  71. Cumhur Cure M, Cure E, Yuce S, Yazici T, Karakoyun I, Efe H. Mean platelet volume and vitamin D level. Ann Lab Med. 2014;34: 98–103. pmid:24624344
  72. Park YC, Kim J, Seo MS, Hong SW, Cho ES, Kim J-K. Inverse relationship between vitamin D levels and platelet indices in Korean adults. Hematology. 2017;22: 1–7.
  73. Gaspar RS, Trostchansky A, Paes AM de A. Potential Role of Protein Disulfide Isomerase in Metabolic Syndrome-Derived Platelet Hyperactivity. Oxid Med Cell Longev. Hindawi; 2016;2016: 2423547–10. pmid:28053690
  74. Vaidya D, Yanek LR, Faraday N, Moy TF, Becker LC, Becker DM. Native platelet aggregation and response to aspirin in persons with the metabolic syndrome and its components. Metab Syndr Relat Disord. 2009;7: 289–296. pmid:19351291
  75. Samocha-Bonet D, Justo D, Rogowski O, Saar N, Abu-Abeid S, Shenkerman G, et al. Platelet counts and platelet activation markers in obese subjects. Mediators Inflamm. 2008;2008: 834153. pmid:18385810
  76. Kurokawa T, Ohkohchi N. Platelets in liver disease, cancer and regeneration. World J Gastroenterol. 2017;23: 3228–3239. pmid:28566882
  77. Chauhan A, Adams DH, Watson SP, Lalor PF. Platelets: No longer bystanders in liver disease. Hepatology. Wiley-Blackwell; 2016;64: 1774–1784. pmid:26934463
  78. Míguez-Burbano MJ, Nair M, Lewis JE, Fishman J. The role of alcohol on platelets, thymus and cognitive performance among HIV-infected subjects: are they related? Platelets. 2009;20: 260–267. pmid:19459132
  79. Mikhailidis DP, Jenkins WJ, Barradas MA, Jeremy JY, Dandona P. Platelet function defects in chronic alcoholism. Br Med J (Clin Res Ed). British Medical Journal Publishing Group; 1986;293: 715–718. pmid:3094624
  80. Anderson CA, Pettersson FH, Clarke GM, Cardon LR, Morris AP, Zondervan KT. Data quality control in genetic case-control association studies. Nature Protocols. Nature Publishing Group; 2010;5: 1564–1573. pmid:21085122
  81. Schlauch KA, Khaiboullina SF, De Meirleir KL, Rawat S, Petereit J, Rizvanov AA, et al. Genome-wide association analysis identifies genetic variations in subjects with myalgic encephalomyelitis/chronic fatigue syndrome. Transl Psychiatry. 2016;6: e730–e730. pmid:26859813
  82. Schlauch KA, Kulick D, Subramanian K, De Meirleir KL, Palotás A, Lombardi VC. Single-nucleotide polymorphisms in a cohort of significantly obese women without cardiometabolic diseases. Int J Obes (Lond). Nature Publishing Group; 2018;106: 1656. pmid:30120429
  83. Winkler TW, Day FR, Croteau-Chonka DC, Wood AR, Locke AE, Mägi R, et al. Quality control and conduct of genome-wide association meta-analyses. Nature Protocols. 2014;9: 1192–1212. pmid:24762786
  84. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet. 2007;81: 559–575. pmid:17701901
  85. Gauderman WJ. Sample size requirements for matched case‐control studies of gene–environment interaction. Stat Med. Wiley Online Library; 2002;21: 35–50. pmid:11782049
  86. Carroll RJ, Bastarache L, Denny JC. R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics. 2014;30: 2375–2376. pmid:24733291
  87. Denny JC, Bastarache L, Ritchie MD, Carroll RJ, Zink R, Mosley JD, et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat Biotechnol. 2013;31: 1102–1110. pmid:24270849