Introduction

Lower cognitive ability, lower educational attainment and greater cognitive decline are all associated with poorer health outcomes1,2,3. Some of these associations possibly arise because of the effect of lower cognitive ability in childhood on later life health, others because illnesses may lower cognitive ability in later life. The causes of these associations are unclear, but some may reflect, in part, a shared genetic aetiology. Recent papers have reported genetic associations between cognitive ability and educational attainment, and a number of physical and mental health traits and diseases4,5,6. These4, 6, and other papers7,8,9, have successfully used educational attainment as a proxy for cognitive ability, showing phenotypic correlations between educational attainment and general cognitive ability of around 0.50 (ref. 9) and a genetic correlation of 0.72 (ref. 4).

Some of the reciprocal phenotypic associations between cognitive and physical health variables, and their genetic correlations, are as follows. Short stature has been consistently linked with lower cognitive ability10, 11. Molecular genetic studies have indicated positive genetic correlations between height and cognitive ability4, 12, as well as between height and educational attainment4, 5. Higher polygenic scores for height have been associated with better cognitive ability in adulthood4. A causal association was reported between taller stature and educational attainment (not including individuals with a degree) in UK Biobank using a Mendelian randomization analysis13.

Multiple studies have shown associations between cognitive ability and cardiovascular risk factors. For example, lower childhood cognitive ability is associated with subsequent high blood pressure14 and obesity15. However, higher BMI in mid-life16 and both hypertension and hypotension17 are associated with lower cognitive ability and greater cognitive decline in later life. A negative genetic correlation has been identified between BMI, but not blood pressure, and educational attainment and cognitive ability in mid to late life4, 5, and a polygenic score for higher BMI is associated with lower cognitive ability in mid to late life and lower educational attainment4; however, a polygenic score for higher systolic blood pressure is associated with lower educational attainment, but higher cognitive ability in mid to late life4.

Similarly, associations have been identified between cognitive ability and cardiometabolic diseases. Childhood cognitive ability has been associated with developing diabetes18 and coronary artery disease19 later in life. Diabetes20 and coronary artery disease21, 22 in midlife have been associated with greater cognitive decline later in life. A polygenic risk score for type 2 diabetes is associated with lower educational attainment, but not with cognitive ability in mid to late life4, although such a score has been associated with reduced cognitive decline23. To date, no genetic correlation between diabetes and cognitive ability has been identified4, 5. A polygenic risk score for coronary artery disease is associated with lower educational attainment and lower mid to late life cognitive ability4, and a negative genetic correlation was identified between coronary artery disease and educational attainment4, 5, but not cognitive ability in mid to late life4.

The question arises as to whether the genetic cognitive-health associations are caused by: (1) genes influencing health traits/diseases, and then those health traits/diseases subsequently influencing cognitive ability; (2) genes influencing cognitive ability, and then cognitive ability subsequently influencing health traits/diseases; or (3) genes influencing general bodily system integrity24, which in turn influences both cognitive ability and health traits/diseases.

To make progress in understanding the causality underlying the correlation between cognitive ability and a number of physical and mental health traits, in the present report we used a bidirectional, two-sample Mendelian randomization (MR) approach25. MR uses genetic variants as proxies for environmental exposures and is subject to the following assumptions: (1) the genetic variants are associated with the exposure; (2) the genetic variants are only associated with the outcome of interest via their effect on the exposure, also called the exclusion restriction [i.e., there is no biological pleiotropy, the phenomenon whereby one single nucleotide polymorphism (SNP) independently influences multiple traits]; and (3) the genetic variants are independent of confounders. Figure 1 shows the Mendelian randomization study model; the instrumental variable, here based on genome-wide significant SNPs from independent studies for the exposure, is used to estimate if the exposure (e.g. BMI) causally influences the outcome (e.g. cognitive ability). Individual SNPs are often found to be weak instruments for investigating causality because they often have small effect sizes. Using multiple SNPs can increase the strength of the instrument. However, this increases the chance of violating the MR assumptions, specifically the assumption that the genetic variants affect the outcome only via the exposure. We used multiple genetic variants for a number of health-related traits and diseases, previously identified in genome-wide association studies, as instrumental variables to see if they predicted cognitive ability (verbal-numerical reasoning) in mid to later life in the UK Biobank. We then used genome-wide significant educational attainment SNPs as an instrumental variable to test whether genetic differences associated with educational attainment (a proxy measure of cognitive ability in early life6, 8) predict later life health outcomes in the UK Biobank.

Figure 1 Model for Mendelian randomization study. The instrumental variable, based on genome-wide significant SNPs from independent studies for the exposure, is used to estimate if the exposure (e.g. BMI) causally influences the outcome (e.g. cognitive ability). The instrumental variable should be unrelated to potential confounders of the exposure-outcome association and should only affect the outcome via the exposure.

Methods

Sample

This study uses baseline data from the UK Biobank Study, a large resource for identifying determinants of human diseases in middle-aged and older individuals26. Around 500 000 community-dwelling participants aged between 37 and 73 years were recruited and underwent assessments between 2006 and 2010 in the United Kingdom. Participants completed cognitive and physical assessments, provided blood, urine and saliva samples for future analysis, gave detailed information about their backgrounds and lifestyles, and agreed to have their health followed longitudinally. For the present study, genome-wide genotyping data were available on 112 151 individuals (58 914 females) aged 40–70 years (mean age = 56.9 years, SD = 7.9) after the quality control process, which is described in more detail elsewhere4. The UK Biobank study was approved by the National Health Service (NHS) Research Ethics Service (approval letter dated 17th June 2011, reference: 11/NW/0382). The analyses in the present report were completed under UK Biobank application 10279. All experiments were performed in accordance with guidelines and regulations from these committees. Written informed consent was obtained from each subject.

Measures

Body mass index

Body mass index (BMI) was calculated as weight (kg)/height (m)², and was also estimated by impedance using a Tanita BC418MA body composition analyser. We used the average of the two methods when both measures were available (r = 0.99); if only one measure was available, that measure was used (N = 1629). A total of 291 individuals did not have information on BMI. One outlier was excluded based on visual inspection of the BMI distribution (BMI > 50). In total, 111 712 individuals had valid BMI and genetic data.
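The BMI derivation described above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for illustration; this is not the code used in the study, and the fallback rule simply mirrors the text (average both measures when available, otherwise use the single available measure).

```python
from typing import Optional


def bmi_from_weight_height(weight_kg: float, height_m: float) -> float:
    """Return BMI as weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def combined_bmi(calculated: Optional[float],
                 impedance: Optional[float]) -> Optional[float]:
    """Average the two BMI measures when both exist; otherwise
    return whichever one is available (None if neither is)."""
    available = [v for v in (calculated, impedance) if v is not None]
    if not available:
        return None
    return sum(available) / len(available)
```

For example, `combined_bmi(24.0, 26.0)` averages the two measures, whereas `combined_bmi(None, 26.0)` falls back to the impedance value alone.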

Height

Standing and sitting height (cm) were measured using a Seca 202 device. We used standing height and, based on visual inspection of the height distribution, excluded one individual with a standing height <125 cm and a sitting/standing height ratio <0.75. A total of 111 959 individuals had valid height and genetic data.

Systolic blood pressure

Systolic blood pressure was measured twice, a few moments apart, using the Omron Digital blood pressure monitor. A manual sphygmomanometer was used if the digital blood pressure monitor could not be employed (N = 6652). Systolic blood pressure was calculated as the average of measures at the two time points (for either automated or manual readings). Individuals with a history of coronary artery disease were excluded from the analysis (N = 2513). Following the recommendation by Tobin, et al.27, 15 mmHg was added to the average systolic blood pressure of individuals taking antihypertensive medication (N = 10 988). Individuals with a systolic blood pressure (after correcting for medication) more than 4 SD from the mean were excluded from further analyses (N = 75). After all exclusions, 106 759 individuals remained with valid blood pressure and genetic data.
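The blood pressure processing steps above (averaging two readings, adding 15 mmHg for treated individuals, and excluding values more than 4 SD from the mean) can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation is an assumption; the text does not state which estimator was used.

```python
import statistics
from typing import List

MEDICATION_OFFSET_MMHG = 15.0  # constant added for treated individuals (from the text)
SD_CUTOFF = 4.0                # exclusion threshold in SD units (from the text)


def adjusted_sbp(reading_1: float, reading_2: float, on_medication: bool) -> float:
    """Average two systolic readings and add the medication offset if applicable."""
    mean_sbp = (reading_1 + reading_2) / 2
    return mean_sbp + MEDICATION_OFFSET_MMHG if on_medication else mean_sbp


def exclude_outliers(values: List[float], cutoff: float = SD_CUTOFF) -> List[float]:
    """Keep values within `cutoff` sample SDs of the mean (assumed estimator)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mean) <= cutoff * sd]
```

Note that the outlier rule is applied after the medication correction, as in the text, so `exclude_outliers` would be called on the output of `adjusted_sbp`.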

Coronary artery disease

UK Biobank participants completed a touch screen questionnaire on past and current health, which included the question “Has a doctor ever told you that you have had any of the following conditions? heart attack/angina/stroke/high blood pressure/none of the above/prefer not to answer”. This was followed by a verbal interview with a trained nurse who was made aware if the participant had a history of certain illnesses and confirmed these diagnoses with the participant. For the present study, coronary artery disease was defined as a diagnosis of myocardial infarct or angina, reported during both the touchscreen and the verbal interview in individuals with genetic data (N = 5288). The control group (N = 104 784) consisted of participants who reported none of the following diseases (based on the non-cancer illness code provided by UK Biobank): myocardial infarction, angina, heart failure, cerebrovascular disease, stroke, transient ischaemic attack, subdural haemorrhage, cerebral aneurysm, peripheral vascular disease, leg claudication/intermittent claudication, arterial embolism.

Type 2 diabetes

Type 2 diabetes case-control status was created using the same method as described by Wood, et al.28, for all individuals with genetic data based on the interim release of UK Biobank. Cases included participants who reported type 2 diabetes or generic diabetes during the nurse interview, started insulin treatment at least one year after diagnosis, were older than 35 years at the time of diagnosis, and did not receive a diagnosis one year prior to baseline testing (N = 3764). The control group consisted of participants who did not fulfil these criteria, and did not report a diagnosis of type 1 diabetes, diabetes insipidus or gestational diabetes (N = 108 015).

Years of education

As part of the sociodemographic questionnaire in the study, participants were asked, “Which of the following qualifications do you have? (You can select more than one)”. Possible answers were: “College or University Degree/A levels or AS levels or equivalent/O levels or GCSE or equivalent/CSEs or equivalent/NVQ or HND or HNC or equivalent/Other professional qualifications e.g. nursing, teaching/None of the above/Prefer not to answer”. For the present study, a new continuous variable was created measuring ‘years of education completed’. This was based on the 1997 International Standard Classification of Education (ISCED) of the United Nations Educational, Scientific and Cultural Organization29. See Table 1 for further details. Individuals who reported that they had a NVQ or HND or HNC degree, individuals who reported other qualifications, and individuals who preferred not to answer were excluded from analyses. The reason for excluding the first two categories was as follows: they would correspond to 15 and 19 years of education according to the ISCED coding, but their mean scores on cognitive ability tests suggest that this might not be the right place for these two degree levels in the ordered hierarchy of educational attainments (Supplementary Figure 1). For the current study, years of education was used as a proxy phenotype for cognitive ability4, 6, 8. A total of 97,550 individuals had valid data for the years of education variable.

Table 1 Coding for years of education in UK Biobank based on the ISCED coding29.

Cognitive ability

Cognitive ability was measured using a 13-item touchscreen computerized verbal-numerical reasoning test. The test included six verbal and seven numerical questions, all with multiple-choice answers, with a two-minute time limit. An example verbal item is: ‘If some flinks are plinks and some plinks are stinks then some flinks are definitely stinks?’ (possible answers: ‘True/False/Neither-true-nor-false/do not know/prefer not to answer’). An example numerical item is: ‘If sixty is more than half of seventy-five, multiply twenty-three by three. If not subtract 15 from eighty-five. Is the answer?’ (possible answers: ‘68/69/70/71/72/do not know/prefer not to answer’). The cognitive ability score was the total score out of 13 (further detail can be found in Hagenaars, et al.4). This test was introduced at a later stage during baseline assessment and therefore only a subset of individuals completed it. A total of 36 035 individuals had valid cognitive ability and genetic data.

Covariates

All analyses were adjusted for the following covariates: age when attending assessment centre, sex, genetic batch and array, and the first ten genetic principal components for population stratification.

Instrumental variables

SNPs associated with each of the five health outcomes and educational attainment were retrieved from the largest available GWAS in European samples for the variables of interest (BMI30, height28, systolic blood pressure31, coronary artery disease32, type 2 diabetes33, and educational attainment34). For educational attainment, we downloaded the summary statistics based on the discovery GWAS only, which did not include the UK Biobank sample. Corresponding SNPs used in the instrumental variables were then extracted from the imputed genotypes of the UK Biobank interim release, which amounted to 112 151 individuals of self-reported White British ancestry after quality control. Details on the quality control process have been published previously4. The following were excluded from the instrumental variables: SNPs out of Hardy-Weinberg equilibrium (HWE, p < 1 × 10⁻⁶), SNPs with an imputation quality below 0.9, individual genotypes with a genotype probability below 0.9, and strand ambiguous SNPs. The individual variants were recoded as 0, 1 or 2 according to the number of trait increasing alleles. Table 2 includes information on the number of SNPs included, and the reference paper. Supplementary Table 3a–f provides details of the included SNPs.
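The SNP-level exclusion criteria and allele recoding described above can be illustrated with a minimal Python sketch. The `Snp` record, its field names, and the two-letter genotype string are hypothetical representations chosen for illustration, not the data format or pipeline used in the study. Strand ambiguous SNPs are taken here to be A/T and C/G variants, which is the usual definition, and the per-genotype probability filter is omitted for brevity.

```python
from dataclasses import dataclass

HWE_P_MIN = 1e-6   # SNPs with HWE p below this are excluded (from the text)
INFO_MIN = 0.9     # minimum imputation quality (from the text)
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}  # assumed definition


@dataclass
class Snp:
    rsid: str        # hypothetical identifier field
    allele_1: str
    allele_2: str
    hwe_p: float     # Hardy-Weinberg equilibrium p-value
    info: float      # imputation quality score


def is_strand_ambiguous(allele_1: str, allele_2: str) -> bool:
    """A/T and C/G SNPs read the same on both strands, so they are ambiguous."""
    return frozenset((allele_1.upper(), allele_2.upper())) in AMBIGUOUS_PAIRS


def passes_qc(snp: Snp) -> bool:
    """Apply the HWE, imputation quality, and strand ambiguity filters."""
    return (snp.hwe_p >= HWE_P_MIN
            and snp.info >= INFO_MIN
            and not is_strand_ambiguous(snp.allele_1, snp.allele_2))


def trait_increasing_dosage(genotype: str, increasing_allele: str) -> int:
    """Count copies (0, 1 or 2) of the trait increasing allele in a
    two-letter genotype such as 'AG' (single-nucleotide alleles only)."""
    return genotype.upper().count(increasing_allele.upper())
```

For instance, `trait_increasing_dosage("AG", "G")` counts one copy of the trait increasing allele, corresponding to a recoded value of 1.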

Table 2 Information about instrumental variables.

Statistical analysis

Phenotypic associations

We performed linear regression analysis using BMI, height, systolic blood pressure, coronary artery disease, and type 2 diabetes to predict cognitive ability. We regressed BMI, height, and systolic blood pressure against educational attainment in a linear regression model; coronary artery disease and type 2 diabetes were regressed against educational attainment in logistic regression models.

Mendelian randomization analysis

The Mendelian randomization analysis was performed using inverse variance weighted regression analysis based on SNP level data, with each instrumental variable (IV) consisting of multiple SNPs25. The inverse variance weighted method is based on a regression of two vectors, i.e. the genetic variant-outcome associations and the genetic variant-exposure associations, with the intercept constrained to zero (Fig. 1). By constraining the intercept to zero, this method assumes that all variants are valid instrumental variables based on the Mendelian randomization assumptions. We performed an association analysis between each SNP in the instrumental variable for the exposure and the exposure itself (IV-exposure), as well as between each SNP in the instrumental variable for the exposure and the outcome (IV-outcome). We then regressed the vector of IV-outcome associations on the vector of IV-exposure associations. This regression (vector IV-outcome ~ vector IV-exposure) was weighted by the standard error of the original IV-outcome association, to correct for minor allele frequency, as described by Bowden, et al.25. Power calculations for the MR analyses can be found in Supplementary Table 1. No sensitivity analyses were performed due to the lack of causal associations.
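The inverse variance weighted regression described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used in the study: the function name is hypothetical, the weights are taken as the inverse of the squared standard error of each SNP-outcome association (the standard fixed-effect formulation), and the standard error returned is the fixed-effect one.

```python
from math import sqrt
from typing import Sequence, Tuple


def ivw_estimate(beta_exposure: Sequence[float],
                 beta_outcome: Sequence[float],
                 se_outcome: Sequence[float]) -> Tuple[float, float]:
    """Weighted regression of SNP-outcome on SNP-exposure associations
    with the intercept constrained to zero.

    Weights are 1 / se_outcome**2 (assumed inverse variance weights).
    Returns the causal estimate and its fixed-effect standard error.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("all inputs must have the same length")
    weights = [1.0 / se ** 2 for se in se_outcome]
    numerator = sum(w * bx * by
                    for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    denominator = sum(w * bx * bx for w, bx in zip(weights, beta_exposure))
    estimate = numerator / denominator
    standard_error = sqrt(1.0 / denominator)
    return estimate, standard_error
```

Because the intercept is fixed at zero, any SNP whose effect on the outcome does not pass through the exposure shifts the slope directly, which is why the method relies on all variants being valid instruments.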

Results

Health outcomes predicting cognitive ability

BMI, height, systolic blood pressure, and coronary artery disease predicted performance on the verbal-numerical reasoning test of cognitive ability (Table 3). A 1 SD higher BMI was associated with a 0.05 SD lower score for cognitive ability (β = −0.05, 95% CI = −0.06, −0.04). A 1 SD greater height was associated with a 0.18 SD higher score for cognitive ability (β = 0.18, 95% CI = 0.17, 0.20). A 1 SD higher systolic blood pressure was associated with a 0.05 SD lower score for cognitive ability (β = −0.05, 95% CI = −0.06, −0.04). Individuals with coronary artery disease had, on average, a 0.27 SD lower score for cognitive ability (β = −0.27, 95% CI = −0.32, −0.21). Individuals with type 2 diabetes had, on average, a 0.06 SD lower score for cognitive ability (β = −0.06, 95% CI = −0.12, 0.01). The Mendelian randomization inverse variance weighted analyses, with the five health outcomes as the exposures, and cognitive ability as the outcome, did not provide any causal evidence for any of these associations.

Table 3 Phenotypic and genetic associations, using Mendelian randomization analysis, between five health instrumental variables and cognitive ability, using the verbal-numerical reasoning test.

Education predicting health outcomes

Educational attainment, as measured by years of education, predicted BMI, height, systolic blood pressure, type 2 diabetes and coronary artery disease (Table 4). The difference between 7 and 20 years of education was associated with a 0.37 SD lower BMI (β = −0.37, 95% CI = −0.39, −0.35), 0.31 SD taller stature (β = 0.31, 95% CI = 0.30, 0.32), 0.20 SD lower systolic blood pressure (β = −0.20, 95% CI = −0.22, −0.19), 42% lower odds of type 2 diabetes (OR = 0.58, 95% CI = 0.52, 0.64), and 60% lower odds of coronary artery disease (OR = 0.40, 95% CI = 0.37, 0.43). The differences between the other groups (7 versus 10 and 13 years of education) can be found in Supplementary Table 2. In every case, the Mendelian randomization inverse variance weighted method did not show a causal effect of educational attainment on the health outcomes. The full results can be found in Table 4.

Table 4 Phenotypic and genetic associations, using Mendelian randomization analysis, between the educational attainment instrumental variable and five health outcomes.

Discussion

This study was designed to investigate causes of the well-replicated finding that lower cognitive ability is associated with poorer health outcomes1,2,3, using a bidirectional two-sample MR approach. We found no evidence for a causal association between several health outcomes and cognitive ability in middle and older age, or between educational attainment and physical health.

Tyrrell, et al.13 showed a causal association between taller stature and time spent in full time education in UK Biobank. They did not find a causal association between taller stature and degree level. The measure of time spent in full time education in UK Biobank excluded individuals who reported having a college degree, which could explain the discrepancy in results. The current study did include individuals who reported having a college degree, but used a categorical measure with four categories, whereas Tyrrell, et al.13 used a continuous measure of time spent in full time education. In a study that was not peer-reviewed at the time of writing, Tillmann, et al.35 did report a causal association from educational attainment to coronary artery disease and BMI using a two-sample MR approach. They used data from two independent GWAS consortia, including 349,306 individuals for educational attainment, 194,427 individuals (63,746 cases) for coronary artery disease, and 339,224 individuals for BMI. The current study used the same data for educational attainment on a subset of individuals (N = 293,723), and 111,712 individuals with BMI data; however, coronary artery disease was based on self-report diagnosis in UK Biobank, which included 110,072 individuals (5288 cases). The summary level data for coronary artery disease in the Tillmann, et al.35 report included both European and East-Asian individuals, whereas the current study only includes individuals of White British ancestry. Tillmann, et al.35 excluded overlapping cohorts between educational attainment and coronary artery disease data; however, it is unclear if overlapping cohorts were excluded for BMI.

Another explanation for the lack of causal associations in the present study could be the highly polygenic aetiology of the traits analysed. Instrumental variables for cardiovascular disease, type 2 diabetes, blood pressure, and educational attainment explain only a small amount of the variance in the exposure. A better instrumental variable would be expected to explain a substantial amount of the variance of the exposure. As shown by the power calculations (Supplementary Table 1), all instrumental variables (except BMI and systolic blood pressure) had sufficient power to detect the same magnitude of association as the observational estimates. The low power for BMI and systolic blood pressure potentially explains the lack of association with cognitive ability. A previous study by the current authors indicated a degree of genetic overlap between cognitive ability and health across the genome4. The idea of genetic overlap between health and cognitive ability is consistent with the theoretical construct of bodily system integrity24, whereby a latent trait is manifest as individual differences in how effectively people meet cognitive and health challenges from the environment, and which has some genetic aetiology.

Strengths of this study include the large sample size of UK Biobank, the participants of which all took the same cognitive tests, completed the same questionnaires and answered the same interview questions, in contrast to most genetic studies, where assessments across different cohorts often vary. A further strength is the fact that all of the UK Biobank genetic data were processed in a consistent manner, on the same platform and at the same location. The genetic variants on which the instrumental variables were based came from the largest GWAS available at the time of testing.

Limitations of this study include the fact that cognitive ability was only measured on a subset of the UK Biobank participants and that it was a bespoke test. A second major limitation was that there is no published large genome-wide association study of cognitive ability in early life from which we could obtain genetic variants to use as an instrumental variable. Therefore, we used genome-wide significant SNPs associated with educational attainment as our early life cognitive ability instrument. A further limitation is the case-control ascertainment in UK Biobank, as the current study based case-control status on self-report measures. This may have led to misclassification of disease status, causing a likely bias towards the null hypothesis36.

Overall, this study found phenotypic cognitive-physical health associations, but did not find evidence for causal associations between cognitive ability and physical health. This may be due to weak instrumental variables, poorly measured outcomes, or the small numbers of disease cases. Future work should therefore focus on stronger instrumental variables, as well as better measurement of the outcome variables.