Introduction

Alcohol consumption is the world’s third largest risk factor for disease burden and is associated with diseases, including neuropsychiatric disorders,1, 2, 3, 4 cardiovascular diseases, cirrhosis of the liver, various cancers and fetal alcohol syndrome. Each year, an estimated 2.5 million people die from alcohol-related disease worldwide.5 Biomarkers for alcohol intake include direct blood alcohol concentration, γ-glutamyltransferase activity, carbohydrate deficient transferrin6 or mean corpuscular volume of erythrocytes.7 Nevertheless, further research is needed to understand alcohol-specific metabolic responses and the underlying pathophysiology. For example, identification of potential biomarkers for monitoring of alcohol consumption or determination of pharmacotherapy targets could facilitate early intervention for patients with specific alcohol-related disorders.

Targeted metabolomics is a promising method that can elucidate the effect of alcohol consumption on human metabolism. Metabolites are products of cellular processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes.8, 9, 10, 11 Recent advances in metabolomic technologies have enabled high-throughput measurement of not only one but several compound classes simultaneously (for example, amino acids, sugars, glycerophospholipids)12, 13 resulting in a fast and more comprehensive identification of candidate biomarkers. As far as we are aware, no large-scale metabolic profiling analyses of humans with alcohol consumption have yet been conducted.

The aims of the present study were to (1) investigate the relation between alcohol intake and serum metabolite concentrations in German and UK studies and (2) identify potential biomarkers that could predict high levels of intake.

Materials and methods

KORA F4 study population

Cooperative Health Research in the Region of Augsburg (KORA) is a population-based research platform with subsequent follow-up studies in the fields of epidemiology and health-care research.14, 15, 16 The KORA F4 study is the follow-up of KORA-Survey 4 (S4, 1999/2001) conducted in 2006/2008. In all, 3080 individuals participated in the follow-up study, and metabolic data were available for 3061 of them.17, 18 Of these 3061 individuals, 1144 males and 946 females aged 32–81 years were selected for further analysis after application of the following exclusion criteria: non-fasting at examination, diabetes, alcohol abstention, missing data or outliers (that is, extremely low or high values) in metabolite concentration data (see Statistical analysis section for outlier detection calculation). Study participants were categorized according to daily alcohol intake as light drinkers (LD; females <20 g day−1 and males <40 g day−1) and moderate-to-heavy drinkers (MHD; females ≥20 g day−1 and males ≥40 g day−1).
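The sex-specific categorization can be written as a short function. The following Python sketch is illustrative only: the function name and the string labels are hypothetical, and it assumes that every drinker who is not a light drinker counts as MHD (that is, intake at or above the sex-specific limit).

```python
def classify_drinker(sex: str, grams_per_day: float) -> str:
    """Return 'ND', 'LD' or 'MHD' for a daily alcohol intake in g/day.

    Thresholds follow the text: LD is <20 g/day (females) or <40 g/day (males).
    Assumption: intake at or above the threshold is classified as MHD.
    """
    thresholds = {"female": 20.0, "male": 40.0}
    if sex not in thresholds:
        raise ValueError("sex must be 'female' or 'male'")
    if grams_per_day < 0:
        raise ValueError("intake cannot be negative")
    if grams_per_day == 0:
        return "ND"  # abstainers, excluded from the main analysis
    return "MHD" if grams_per_day >= thresholds[sex] else "LD"
```

For example, a woman reporting 25 g day−1 would be classified as MHD, whereas a man with the same intake would be classified as LD.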

KORA F3 replication data set

The KORA F3 study is the 10-year follow-up of KORA-Survey 3 (S3, examined in 1994/95) and was conducted in 2004/05. A total of 2974 individuals participated in the follow-up, of whom 377 had metabolic data available. In all, 154 males and 107 females aged 55–84 years were selected for further analysis after the application of the KORA F4 exclusion criteria. KORA F4 and KORA F3 are two independent cohorts that share no participants and were conducted at different time points.19, 20

TwinsUK replication data set

The UK Adult Twin Registry (TwinsUK) is a UK-wide twin registry sample of 11 000 adults founded in 1993 with the aim of exploring the genetic epidemiology of common adult diseases.21 A total of 629 individuals aged 23–73 years were selected for analysis after the application of the KORA F4 exclusion criteria. High-density lipoprotein (HDL) data were available for 277 probands.

Ethics statement

Written informed consent was given by each KORA and TwinsUK participant. The KORA studies, including the protocols for subject recruitment and assessment and the informed consent for participants, were reviewed and approved by the local ethical committee (Bayerische Landesärztekammer). For the TwinsUK study, ethics approval was received from the St Thomas’ Hospital Ethics Committee.

Blood sampling

KORA F4 and F3 blood samples for metabolic analysis were collected using similar collection procedures, together with the medical examinations described previously.22, 23, 24 KORA F4 blood samples were drawn into serum tubes in the morning between 0800 and 1030 hours after overnight fasting. Tubes were gently inverted twice, followed by 30-min resting at room temperature to obtain complete coagulation. For serum collection, centrifugation of blood was performed for 10 min (2750 g, 15 °C). Serum was frozen at −80 °C until execution of metabolic analyses.

In the TwinsUK study, a collection procedure similar to that in the KORA study was used. TwinsUK blood samples were taken after at least 6 h of overnight fasting. The samples were immediately inverted three times, followed by 40-min resting at 4 °C to obtain complete coagulation. The samples were then centrifuged for 10 min at 2000 g. Serum was removed from the centrifuged brown-topped tubes as the top, yellow, translucent layer of liquid. Four aliquots of 1.5 ml were placed into skirted micro centrifuge tubes and then stored in a −45 °C freezer until sampling.25

Metabolite measurements

Metabolomic analysis was performed on 3061 subjects from the KORA F4 study, 377 subjects from the KORA F3 study and 629 subjects from the TwinsUK study. Comparison of metabolite concentrations (that is, comparison between LD and MHD) was conducted within the same cohort and within the same site of collection. The targeted metabolomic approach was based on flow injection analysis coupled with electrospray ionization tandem mass spectrometry measurements by AbsoluteIDQ p150 assay (BIOCRATES Life Sciences AG, Innsbruck, Austria). The method of the AbsoluteIDQ p150 assay has been proven to be in conformance with FDA-Guideline ‘Guidance for Industry—Bioanalytical Method Validation (May 2001)’,26 which implies proof of reproducibility within a given error range. The assay procedures of the AbsoluteIDQ p150 kit as well as the metabolite nomenclature have been described in detail previously.2, 27 Data evaluation for quantification of metabolite concentrations and quality assessment was performed with the MetIQ software package, which is an integral part of the AbsoluteIDQ kit. Internal standards served as reference for the calculation of metabolite concentrations. To ensure data quality, each metabolite had to meet the three criteria described previously:17, 19 (1) the average value of the coefficient of variation for the metabolite in the three quality controls should be smaller than 25%; (2) 90% of all the measured sample concentrations for the metabolite should be above the limit of detection; and (3) the correlation coefficient between two duplicate measurements of the metabolite in 144 re-measured samples should be above 0.5. In total, 131 metabolites passed the three quality controls, and the final metabolomics data set contained the sum of hexoses (H1), 14 amino acids, 24 acylcarnitines, 13 sphingomyelins, 34 diacylphosphatidylcholines (PCs), 37 acyl-alkyl-phosphatidylcholines and 8 lysophosphatidylcholines (lysoPCs). 
Supplementary Table S1 summarizes the characteristics of 163 metabolites measured in KORA F4.
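The three quality criteria can be expressed as a single per-metabolite check. The Python sketch below is not the MetIQ implementation; the function and argument names are hypothetical, the correlation is assumed to be Pearson's coefficient, and criterion (2) is read as "at least 90%".

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def passes_quality_control(qc_cvs, concentrations, lod, first_run, second_run):
    """Apply the three criteria to one metabolite.

    qc_cvs         -- coefficients of variation in the quality controls (fractions)
    concentrations -- measured sample concentrations
    lod            -- limit of detection, same unit as the concentrations
    first_run, second_run -- paired duplicate measurements of re-measured samples
    """
    mean_cv = sum(qc_cvs) / len(qc_cvs)
    share_above_lod = sum(c > lod for c in concentrations) / len(concentrations)
    duplicate_r = pearson_r(first_run, second_run)
    # Assumption: "90% above the limit of detection" means at least 90%.
    return mean_cv < 0.25 and share_above_lod >= 0.90 and duplicate_r > 0.5
```

Applied to every measured metabolite, a filter of this kind would retain only those that satisfy all three criteria.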

Statistical analysis

Statistical analysis was performed with the open source software R (version 2.14.1). To detect outliers, concentrations obtained for the 131 metabolites were first scaled to zero mean and unit s.d. and projected onto the unit sphere, and Mahalanobis distances for each individual were then calculated using the robust principal components algorithm.28 Calculations were done separately for males and females. For each group, the mean Mahalanobis distance plus three times the variance was defined as the cutoff. Missing values were imputed using the R package ‘mice’.29 Metabolite concentrations were log-transformed for all subsequent analysis steps. The Shapiro–Wilk test30 was applied to single metabolites to check for normal distribution in the study population in order to choose proper follow-up tests. The Mann–Whitney test31 was applied for the comparison of two variables not satisfying normal distribution. Fisher’s exact test32 was applied for comparing binomial proportions.
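The cutoff rule for outliers can be illustrated as follows. This Python sketch starts from distances that are assumed to be already computed (the robust principal components step is not reproduced), uses the sample variance, and uses hypothetical function names; it is not the R code of the study.

```python
from statistics import mean, variance


def outlier_cutoff(distances):
    """Mean distance plus three times the (sample) variance, as stated in the text."""
    return mean(distances) + 3 * variance(distances)


def flag_outliers(distances):
    """Return one boolean per individual: True if the distance exceeds the cutoff."""
    cutoff = outlier_cutoff(distances)
    return [d > cutoff for d in distances]
```

In line with the text, such a rule would be applied separately to the distances of males and of females.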

Logistic regression33 was applied to each of the 131 metabolites to investigate the association of metabolite concentrations with drinking group (MHD versus LD). Statistical significance was assessed with Bonferroni correction at a threshold of 3.8E−4 (the 5% level divided by 131 metabolites). To further select candidate biomarkers, two additional methods that assess the metabolites as a group were applied:2, 8 random forest selection34 and stepwise selection. The random forest was first used to select, from the 30 variables with the highest importance scores, the metabolites that best separated the individuals of the two groups. Age, body mass index (BMI), smoking, HDL and triglycerides were included in this method together with all the metabolites. We then further selected metabolites using stepwise selection on the logistic regression model. Metabolites that differed significantly in concentration between the compared groups in logistic regression and were also selected by the random forest were entered into this model along with all the covariates. Akaike’s Information Criterion was used to evaluate the performance of the metabolite subsets used in the models, and the model with the minimal Akaike’s Information Criterion was chosen. The area under the receiver-operating characteristic curve (AUC) was used to evaluate the models.
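The Bonferroni threshold quoted above follows directly from dividing the significance level by the number of tests. A minimal Python sketch, with hypothetical function names and a dictionary of P-values as assumed input:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_metabolites(p_values, alpha=0.05):
    """Names whose P-value falls below the Bonferroni threshold.

    p_values maps metabolite name -> P-value; the number of tests is
    taken to be the number of entries in the mapping.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(name for name, p in p_values.items() if p < threshold)
```

With alpha = 0.05 and 131 tests, bonferroni_threshold returns approximately 3.8E−4.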

Heat maps were used to illustrate the trends of metabolite concentrations with increasing alcohol consumption. Alcohol consumption data were split into categories increasing in steps of 5 g day−1. A matrix of mean metabolite concentrations was calculated for each alcohol consumption category for the significant male- and female-specific metabolites from logistic regression. In the same procedure step, hierarchical clustering with Euclidean distance was applied to the metabolite concentration matrix to generate a hierarchical dendrogram clustering metabolites with similar mean concentrations. For the meta-analysis of the combined KORA F4 and KORA F3 studies, a fixed effect model was used.
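The construction of the heat-map matrix can be sketched for a single metabolite as follows. The Python example is illustrative: the function names are hypothetical, and the category boundaries are assumed to be half-open intervals (0 to <5, 5 to <10 g day−1, and so on), which the text does not specify.

```python
from collections import defaultdict
from math import floor


def intake_category(grams_per_day, width=5.0):
    """Lower bound of the intake category that contains grams_per_day."""
    return floor(grams_per_day / width) * width


def mean_by_category(intakes, concentrations, width=5.0):
    """Mean concentration of one metabolite per alcohol intake category."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [sum, count]
    for grams, value in zip(intakes, concentrations):
        entry = totals[intake_category(grams, width)]
        entry[0] += value
        entry[1] += 1
    return {cat: s / n for cat, (s, n) in sorted(totals.items())}
```

Repeating this for every significant metabolite yields one matrix row per metabolite and one column per intake category, which can then be clustered.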

Results

Description of the study populations

Based on previous results from KORA F4, which showed strong metabolomic differences between men and women,19 we conducted strictly sex-separated analyses. For both sexes, we classified our probands into two groups according to daily alcohol consumption, LD and MHD, and compared MHD with LD (Table 1). Alcohol abstainers (ND; defined as alcohol intake of 0 g day−1) were excluded (view Supplementary Table S3 for description of the ND group, view Supplementary Table S4 for sensitivity analysis). In general, age and BMI were comparable between MHD and LD. A significantly lower age was observed in MHD of KORA F3 males and TwinsUK participants (P-value 1.3E−02 and 1.6E−02, respectively). BMI was significantly increased in MHD in male KORA F4 participants (P-value 3.3E−03). The proportion of smokers was significantly higher in MHD in KORA F4 male and TwinsUK female populations (P-values 1.0E−04 and 1.3E−02, respectively). In all three studies, there was a significant increase in HDL in MHD compared with LD (P-values 7.1E−12–1.3E−02), except in KORA F3, where the mean HDL was increased but the P-value was not significant. A significant increase of mean triglyceride concentration was observed in KORA F4 male MHD only (P-value 3.4E−02).

Table 1 Basic characteristics of the discovery and replication data sets

Analysis of global metabolite concentration differences between MHD and LD

We identified 40 metabolites in males and 18 metabolites in females using logistic regression analysis (adjusted for age, BMI, smoking, HDL and triglycerides) that significantly differed (P-value <3.8E−4) in concentration between MHD and LD in the KORA F4 study (view Supplementary Table S2 for detailed P-values and direction). To illustrate the trend of metabolite levels with increasing alcohol consumption in 5 g day−1 increments, heat maps were displayed based on normalized mean metabolite residuals for each of the 40/18 male/female metabolites. Hierarchical clustering with Euclidean distance was used in order to find similar metabolite groups. The final clusterogram (display of dendrogram and heat map) resulted in two main clusters, C1 and C2, in both males and females (Figure 1). C1 consists of metabolites that increase in concentration with increasing alcohol consumption (high in MHD and low in LD). In contrast, C2 consists of metabolites that decrease in concentration with increasing alcohol consumption (low in MHD and high in LD). PC aa Cx:ys, ether lipids (PC ae Cx:ys), lysoPC a Cx:ys and sphingomyelins (SMs) occurred in both males and females. The acylcarnitine C16:1 occurred only in males. All PC ae Cx:ys and SMs were decreased in MHD in males and females. PC aa Cx:ys were increased in MHD compared with LD in males and females (except PC aa C32:3, which was decreased in MHD in females). All lysoPC a Cx:ys were increased in MHD in males and females (except lysoPC a C17:0).

Figure 1

Alcohol-specific metabolomic profiles. Clusterograms show 40 and 18 metabolite concentrations in relation to alcohol consumption in light drinkers (LD) and moderate-to-heavy drinkers (MHD) in (a) males and (b) females, respectively. The additional two-column clusterogram shows the effect of lipid-lowering medication (that is, statins, fibrates, herbal-based lipid-lowering agents) on metabolite concentrations in non-drinkers (ND). Relative concentrations of metabolites are represented by x-fold s.d. from overall mean concentrations for groups of alcohol consumption of 5 g day−1. The horizontal axis displays the alcohol consumption in g day−1, while the vertical axis represents hierarchical clustering. The 10/5 most significant metabolites separating MHD from LD in males/females are highlighted in blue and pink. (c) Graphic shows receiver operating characteristic (ROC) curves for the set of most significant 10/5 metabolites in males (PC aa C32:1, PC aa C36:1, PC aa C36:5, PC aa C40:4, PC ae C40:6, lysoPC a C17:0, lysoPC a C18:1, SM (OH) C22:1, SM (OH) C22:2, SM (OH) C16:1) and females (PC aa C34:1, PC ae C30:2, PC ae C40:4, lysoPC a C16:1, lysoPC a C17:0). ROC curves displayed as dotted/crossed lines represent marker performance in males/females. The area under the ROC curve was calculated for the combined metabolite panel with adjustment for age, body mass index, smoking status, high-density lipoproteins and triglycerides.

The logistic regression analysis was based on each single metabolite, and some of these 40/18 male/female metabolites are expected to correlate with each other. To find more specific and independent metabolites that best separate MHD from LD as potential biomarkers for alcohol consumption, we further applied the random forest and stepwise selection methods. Ten metabolites in males (PC aa C32:1, PC aa C36:1, PC aa C36:5, PC aa C40:4, PC ae C40:6, lysoPC a C17:0, lysoPC a C18:1, SM (OH) C22:1, SM (OH) C22:2, SM (OH) C16:1) and five metabolites in females (PC aa C34:1, PC ae C30:2, PC ae C40:4, lysoPC a C16:1, lysoPC a C17:0) were further selected (Figure 1). To evaluate the model combining the 10/5 male/female-specific metabolites with covariates (that is, how well the logistic regression model adjusted for age, BMI, smoking, HDL and triglycerides distinguishes between MHD and LD), the AUC was calculated. The AUC value was 0.812 in males and 0.679 in females (Figure 1).
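The AUC can be read as the probability that a randomly chosen MHD individual receives a higher model score than a randomly chosen LD individual. The Python sketch below computes it by pairwise comparison, which is equivalent to the area under the empirical ROC curve; the function name and the score lists are hypothetical, and this is not the R routine used in the study.

```python
def auc(scores_mhd, scores_ld):
    """Area under the ROC curve from predicted scores of the two groups.

    Counts the share of (MHD, LD) pairs in which the MHD individual has
    the higher score; ties count as one half.
    """
    if not scores_mhd or not scores_ld:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for s_mhd in scores_mhd:
        for s_ld in scores_ld:
            if s_mhd > s_ld:
                wins += 1.0
            elif s_mhd == s_ld:
                wins += 0.5
    return wins / (len(scores_mhd) * len(scores_ld))
```

A value of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation.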

Replication analysis in two independent cohorts

Replication analysis of the most significant 10 alcohol-related metabolites in males and five metabolites in females found in the KORA F4 discovery sample was performed in the two independent cohorts KORA F3 and TwinsUK (Tables 2 and 3). In males, 3 out of 10 metabolites (that is, PC aa C32:1, PC aa C36:1, SM (OH) C16:1) could be replicated in KORA F3 (Table 2). In females, two out of five metabolites could be replicated (Table 3): one metabolite in KORA F3 (that is, PC ae C30:2) and one metabolite (that is, PC aa C34:1) in TwinsUK. In the TwinsUK population, only females were available for replication analysis. In all, 629 TwinsUK participants met the inclusion criteria and were eligible for the replication analysis; however, HDL and triglyceride data for the same time point were available for only 277 participants. In TwinsUK, we therefore performed the replication analysis using both 277 and 629 study participants. In the first replication analysis on 277 participants, logistic regression adjusted for age, BMI, smoking, HDL and triglycerides resulted in no significant P-values. When we increased the sample size to 629 and used the logistic regression model adjusted for age, BMI and smoking, the metabolite PC aa C34:1 could be replicated.

Table 2 Results of logistic regression analysis of alcohol-specific metabolites in males
Table 3 Results from logistic regression analysis of alcohol-specific metabolites in females

Additionally, we pooled data from the KORA F4 discovery and KORA F3 replication samples and conducted a meta-analysis with a fixed effect model in order to investigate the combined effect of alcohol on metabolite concentrations. In the meta-analysis, the replication succeeded for all 10 metabolites in men and all 5 metabolites in women. This suggests that the small sample sizes of the TwinsUK and KORA F3 cohorts are the reason why the previous replication could not be achieved for all metabolites. Nevertheless, the trends of metabolite concentrations (as shown by the comparison of mean metabolite concentrations between MHD and LD in Tables 2 and 3) for all 10 and 5 metabolites are consistent with the trends in the discovery sample across all studies. For example, the metabolite lysoPC a C18:1 was not replicated in KORA F3 and TwinsUK, yet the mean metabolite concentration is higher in MHD compared with LD throughout the KORA F4, KORA F3 and TwinsUK studies.
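A fixed effect model is commonly fitted by inverse-variance weighting of the per-study estimates. The Python sketch below assumes that this standard approach applies here and that each study contributes a regression coefficient and its standard error; the function name and inputs are hypothetical.

```python
from math import sqrt


def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error.

    estimates  -- per-study effect estimates (for example, log odds ratios)
    std_errors -- corresponding standard errors, all greater than zero
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return pooled, pooled_se
```

Because the weights are inversely proportional to the squared standard errors, the larger study dominates the pooled estimate.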

Discussion

In the current study, we used a targeted metabolomics approach and identified, as well as partly replicated, alcohol-related metabolites in German and UK human studies. Our results suggest that alcohol mostly affects sphingolipid, glycerophospholipid and ether lipid metabolism. A schematic overview of the observed alcohol-specific metabolic differences and the potential underlying mechanisms is depicted in Figure 2 and discussed below.

Figure 2

Schematic overview of metabolite concentration differences in moderate-to-heavy drinkers (MHD) compared with light drinkers (LD) in males and females. Ten/five metabolites that best discriminate MHD from LD in males/females are shown. Yellow and blue boxes represent male- and female-specific alcohol-related metabolites identified in this study. Combined yellow-blue boxes represent metabolites identified both in males and females. Bold black arrows represent the observed higher or lower metabolite concentrations in MHD compared with LD in the discovery sample. Replicated metabolites are marked by a star. Thin black arrows represent higher or lower levels of alcohol-related analytes in MHD reported in earlier publications. Red boxes represent alcohol-related enzymes, and red arrows represent the effect on the respective enzyme activity or concentration in MHD reported in previous publications. ASM, acid sphingomyelinase; LCAT, lecithin-cholesterol acyltransferase; PAF, platelet-activating factor; PLA2, phospholipase A2; PLD, phospholipase D.

The lower sphingomyelin concentrations (SM (OH) C16:1, SM (OH) C22:1, SM (OH) C22:2) in MHD compared with LD could be attributed to acid sphingomyelinase (ASM) activity. ASM catalyzes the hydrolysis of sphingomyelins by cleaving their phosphodiester bond, generating ceramide and phosphorylcholine,35, 36 which is in turn reassembled to phosphatidylcholine.3 Enzymatic dysfunction of ASM results in Niemann–Pick disease A (NPD-A, OMIM 257200) and B (NPD-B, OMIM 607616), a lipid storage disease characterized by accumulation of sphingomyelins within the endo-lysosomal compartment.37 Interestingly, the opposite occurs when alcohol is administered. Several studies investigating the cellular response to alcohol in vitro and in vivo have provided evidence that alcohol stimulates ASM activity, leading to accumulation of ceramide and a decrease of sphingomyelins.36, 38, 39, 40, 41 A recent in vivo study on patients with alcohol dependence reported alcohol-induced release of phosphorylcholine from sphingomyelins in peripheral blood cells, confirming alcohol-induced activation of ASM.42

There is a direct correlation between PC concentrations and phosphatidylethanol (PEth). PEth is a clinical biomarker of the past 1–2 weeks of moderate-to-heavy alcohol consumption.43 PEth is a unique phospholipid that is synthesized only in the presence of ethanol and is directly formed from PCs by the enzyme phospholipase D,44, 45, 46 which catalyzes the exchange of ethanol for choline in PCs.46 Different PEth molecular species have a common phosphoethanol head group onto which two fatty acid moieties derived from PCs are attached. A study by Helander and Zheng47 has shown that PEth 16:0/18:1 (34:1) was the most predominant molecular species, accounting for 37% of all PEth species. A recent study by Nalesso et al.48 compared the occurrence of different PEth species between heavy drinkers and social drinkers (defined as daily alcohol intake of 60–300 and 0–20 g day−1, respectively). Interestingly, PEth 16:0/18:1 (34:1), PEth 18:0/18:1 (36:1) and PEth 16:0/16:1 (32:1) were most abundant in heavy drinkers. This may be consistent with our findings, in which PC aa C34:1 in females and PC aa C36:1 and PC aa C32:1 in males had higher concentrations in MHD than in LD. We hypothesize that concentrations of specific PC species can be used as surrogate biomarkers for PEth to distinguish MHD from LD. However, PEth measurements are beyond the scope of this study. Dedicated and parallel measurements of PC aa C34:1 and PEth (34:1) would be required in order to investigate whether PC aa C34:1 can be a substitute for PEth (34:1).

LysoPCs are derived from PCs49 and have been reported to have cytotoxic effects.50 They accumulate in alcohol-related conditions such as atherosclerosis51 or ischaemia.52 LysoPCs originate from several metabolic pathways. Part of the production is attributed to the transesterification of PCs and free cholesterol catalysed by the enzyme lecithin-cholesterol acyltransferase (LCAT), where LCAT hydrolyses the sn-2 acyl group and subsequently transfers and esterifies the fatty acid to free cholesterol.53 A study by Goto et al.,54 investigating clinical alcoholics, reported an increase of LCAT concentration in individuals with alcohol intake of >30 g day−1. Another metabolic pathway generating lysoPC species is attributed to the enzyme phospholipase A2, which catalyzes the hydrolysis of an ester bond at the sn-2 position of 1,2-sn-diacylglycerols, yielding lysoPCs and free fatty acids.55 These fatty acids are esterified into fatty acid ethyl esters, which have been reported as alcohol markers to distinguish social from heavy drinkers or alcohol-dependent individuals.56, 57

Fatty acids with an uneven number of carbons (that is, C15:0 and C17:0) are produced by the bacterial flora of the human intestine.58 It is known that alcohol acts as a disinfectant that kills bacteria. Thus a possible explanation for the lower concentrations of lysoPC a C17:0 in MHD could be that alcohol consumption disrupts the respective bacterial microflora in the gut, which in turn influences lysoPC a C17:0 levels in human blood. On the other hand, the fatty acid C17:0 is also found in the bacterial flora of ruminants.59, 60 A study by Wolk et al.61 revealed that portions of the fatty acids C15:0 and C17:0 in adipose tissue reflected milk fat consumption in women. An earlier study62 investigating associations of reported alcohol intake with dietary habits in probands from the EPIC cohort found that alcohol consumers had a lower intake of dairy products than abstainers. This is consistent with a French cohort of the EPIC study,63 in which high alcohol intake was associated with lower consumption of dairy products in both genders compared with moderate alcohol consumption. Thus another plausible explanation for the lower concentrations of lysoPC a C17:0 in MHD in our study could be a lower intake of dairy products. Based on the above findings and explanations, lysoPC a C17:0 might be a dietary biomarker associated with the distinct dietary behavior of MHD compared with LD rather than a biomarker for alcohol-induced toxic or inflammatory mechanisms.

Ether lipids (for example, PC ae C30:2 and PC ae C40:6) have a role as precursor of platelet-activating factor.64, 65 Platelet-activating factor is an important mediator in hemostasis and has an important role in platelet aggregation (that is, thrombotic effects). A number of studies indicate that ethanol directly affects hemostasis via a number of mechanisms, including platelet aggregation and activation.66, 67, 68, 69 This mechanism is still not fully understood; however, based on our results, it can be hypothesized that reduced platelet-activating factor levels in response to moderate-to-heavy alcohol consumption might form a bottleneck in the process of platelet activation leading to poor platelet aggregation and to alcohol-related hemorrhagic events. This is supported by studies from the United States and Sweden showing that the baseline incidence of acute upper gastrointestinal bleeding increased by threefold as alcohol consumption increased from 1 drink to >20 drinks per week.70

Conclusion and outlook

Our study provides new insights into the impact of alcohol consumption on human metabolism. Our results suggest that metabolomic profiles based on PCs, lysoPCs, ether lipids and sphingolipids form a new class of biomarkers for alcohol consumption. This may be of great value for the clinical assessment of alcohol use, alcohol-specific disease detection and drug-therapy monitoring. Side effects of alcohol consumption on specific organs such as the liver could be investigated in future association studies analysing metabolite concentrations in relation to concentrations of liver biomarkers, for example, γ-glutamyltransferase.71 The current analysis is based on a targeted metabolomics approach that is limited to a subset of 131 currently known human metabolites (for example, from lipid metabolism and amino acid metabolism). A study using a broader metabolomics approach that quantifies a larger number of metabolites would be needed to investigate alcohol effects on other areas of metabolism. Further research is needed to elucidate the exact underlying mechanisms. A prospective study in a large sample would help validate the predictive potential of these results.