1 Introduction
Over the past 50 or so years, the key physiological determinants of endurance exercise performance have emerged. These include maximal oxygen uptake \((\dot{V}{\text{O}}_{2\hbox{max} } )\), the lactate threshold, and efficiency. In the case of distance running, efficiency is typically referred to as running economy because it is difficult to calculate efficiency in a strict engineering context in running humans [1]. By contrast, efficiency is much easier to calculate for cycling.
Data on these three variables can be modeled to predict performance, and there are field tests that incorporate several of these variables that are also highly predictive of performance. For example, in the early 1990s I took emerging evidence that humans run the marathon at a pace similar to their running speed at lactate threshold, and calculated a theoretical upper limit, at least at that time, for the ‘fastest’ potential marathon performance by men [2]. This model also reasonably predicted the performance of a given individual. Likewise, so-called velocity at \(\dot{V}{\text{O}}_{2\hbox{max} }\) was shown to be highly correlated with running performance [3]. This latter measure incorporates both \(\dot{V}{\text{O}}_{2\hbox{max} }\) and running economy into one metric.
The basic idea underpinning these factors is that they interact in a predictable way. \(\dot{V}{\text{O}}_{2\hbox{max} }\) can be seen as the upper limit of aerobic capacity, the lactate threshold is related to the fraction of \(\dot{V}{\text{O}}_{2\hbox{max} }\) that can be sustained for longer than a few minutes, and efficiency or economy is related to the actual power output or speed that can be generated at a given \(\dot{V}{\text{O}}_{2}\) during a race. Additionally, the physiological determinants of \(\dot{V}{\text{O}}_{2\hbox{max} }\) and the lactate threshold are well understood. Less is known about the physiological determinants of efficiency/economy. The question then is, if the physiological determinants of \(\dot{V}{\text{O}}_{2\hbox{max} }\) and the lactate threshold are well understood, what is known about the contribution of DNA variation to these factors?
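The interaction described above can be sketched numerically. The following Python example is a simplified illustration under stated assumptions, not the published model from [2]: it assumes that the sustainable oxygen uptake is the product of maximal oxygen uptake and the fraction sustainable at the lactate threshold, and that dividing this by the oxygen cost of running (economy, in ml O2 per kg per km) gives speed. The function names and all input values are hypothetical and chosen only for illustration.

```python
def sustainable_speed_km_per_h(vo2max, fraction_at_lt, o2_cost_per_km):
    """Estimate sustainable running speed from three inputs.

    vo2max          -- maximal oxygen uptake, ml O2 per kg per min
    fraction_at_lt  -- fraction of VO2max sustainable at lactate threshold (0-1]
    o2_cost_per_km  -- running economy, ml O2 per kg per km (assumed constant)
    """
    if vo2max <= 0 or o2_cost_per_km <= 0:
        raise ValueError("vo2max and o2_cost_per_km must be positive")
    if not 0 < fraction_at_lt <= 1:
        raise ValueError("fraction_at_lt must be in (0, 1]")
    # Oxygen uptake that can be held for a prolonged effort (ml/kg/min).
    sustainable_vo2 = vo2max * fraction_at_lt
    # Economy converts oxygen uptake into distance covered per minute.
    km_per_min = sustainable_vo2 / o2_cost_per_km
    return km_per_min * 60.0


def marathon_time_minutes(speed_km_per_h, distance_km=42.195):
    """Time to cover the marathon distance at a constant speed."""
    if speed_km_per_h <= 0:
        raise ValueError("speed_km_per_h must be positive")
    return distance_km / speed_km_per_h * 60.0


# Hypothetical runner; these values are illustrative, not measured data.
speed = sustainable_speed_km_per_h(vo2max=70.0, fraction_at_lt=0.8,
                                   o2_cost_per_km=200.0)
print(f"speed: {speed:.1f} km/h, "
      f"marathon: {marathon_time_minutes(speed):.1f} min")
```

With these illustrative inputs the sketch gives a speed of 16.8 km/h and a marathon time of about 150.7 min. It ignores factors such as air resistance, drift in oxygen cost over the race, and environmental conditions, so it should be read as a statement of how the three variables combine rather than as a predictive tool.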
Before I go on, I want to share two sets of assumptions related to the physiology behind \(\dot{V}{\text{O}}_{2\hbox{max} }\) and the lactate threshold. First, for \(\dot{V}{\text{O}}_{2\hbox{max} }\), the primary physiological determinants under most circumstances in most humans are related to maximum cardiac output and stroke volume, along with red cell mass or total body hemoglobin [4]. In other words, the ability of the heart to pump large quantities of oxygenated blood to the contracting skeletal muscles is absolutely critical. While this is not true in every case and in every circumstance, for example chronic obstructive pulmonary disease (COPD), where the lungs can become limiting, it is true for the vast majority of situations. Second, the lactate threshold reflects, in large part, some combination of skeletal muscle mitochondrial content and function in the contracting skeletal muscles and perhaps capillary density [5, 6]. Efficiency/economy is much more complex and likely sport-specific. It also has an element of the competitive medium that needs to be considered. Examples include wind resistance during high-speed cycling versus lower-speed running, or water resistance for sports such as swimming or rowing [1].
Therefore, with this general perspective as a background, I will next try to ask what is known about the genetic contributions to the major physiological determinants of endurance exercise performance. A key question then is what constitutes ‘genetic’. One approach is to focus on the heritability of key traits related to athletic performance. These are typically statistical arguments based on the correlation of a given trait between family members, most notably mono- or dizygotic twins. If the correlation between monozygotic twins is greater than the correlation between dizygotic twins then the interpretation is that this similarity is due primarily to greater similarities in the DNA of monozygotic twins than dizygotic twins [7]. For \(\dot{V}{\text{O}}_{2\hbox{max} }\), the heritability can be very high for monozygotic twins, consistent with the idea that there is a major genetic component to this variable. Twin (and other family) studies also indicate that there is a significant genetic component to the increase in \(\dot{V}{\text{O}}_{2\hbox{max} }\) seen with a few months of fitness-type training [8, 9].
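The twin-correlation argument can be made concrete with Falconer's formula, a standard textbook estimator that takes heritability as twice the difference between the monozygotic and dizygotic twin correlations. The Python sketch below is an illustration only and is not necessarily the method used in the cited studies; it assumes additive genetic effects and equally shared environments for both twin types, and the correlation values are hypothetical. Clipping the estimate to the range 0 to 1 is a choice made here for readability.

```python
def falconer_heritability(r_mz, r_dz):
    """Falconer's estimate: h^2 = 2 * (r_MZ - r_DZ).

    r_mz -- trait correlation between monozygotic twin pairs
    r_dz -- trait correlation between dizygotic twin pairs
    The result is clipped to [0, 1] (an assumption of this sketch).
    """
    for r in (r_mz, r_dz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    h2 = 2.0 * (r_mz - r_dz)
    return min(1.0, max(0.0, h2))


# Hypothetical correlations, for illustration only.
h2 = falconer_heritability(r_mz=0.7, r_dz=0.4)
print(f"estimated heritability: {h2:.2f}")
```

For these hypothetical correlations the estimate is 0.60. Note that this is a population-level statistic about variance in a trait; it does not identify any specific DNA variant.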
While the observations highlighted above suggest there is a strong genetic component to training, specific DNA variants associated with \(\dot{V}{\text{O}}_{2\hbox{max} }\) and how \(\dot{V}{\text{O}}_{2\hbox{max} }\) responds to training have been hard to find. While a number of small effect size variants considered in concert seem related to the rise in \(\dot{V}{\text{O}}_{2\hbox{max} }\) with training, no variants alone or in combination that are clearly linked to canonical biological pathways likely to underpin cardiac output and red cell mass have been identified [10–13].
The issue of limits of genetic ‘causation’ is also part of a general trend in genomic research for complex human traits that has accelerated in recent years following the completion of the Human Genome Project. In the late 1990s and early 2000s, it was generally assumed that a limited number of gene variants would explain much of the risk of developing common non-communicable diseases. The idea was that once these variants were identified, a host of new approaches to diagnosis, prevention, and therapy would emerge. Unfortunately, this vision has not been realized and hundreds of gene variants with small effect sizes have been associated with complex non-communicable diseases. Importantly, their role in the diagnosis, prevention, and therapy for these diseases remains obscure. These larger issues related to genomics and complex disease-related traits have been discussed in detail elsewhere [14].
3 Limitations and Potential Objections to This Perspective
There are a number of potential limitations to the perspectives outlined above. The most obvious is that very large cohorts of subjects (perhaps numbering in the hundreds of thousands) in conjunction with the phenotypes of interest and DNA sequence information are simply not available for the key steps in the oxygen transport cascade discussed in this review. For this sort of cohort to be a reality, beyond a blood test for genotyping, detailed measurements of gas exchange at rest and during submaximal and maximal exercise would be needed. Measurements of cardiac output and red cell mass would also be needed, as would serial measurements of blood lactate during graded exercise. Muscle biopsies to assess fiber type, mitochondrial function, and capillary density would also be essential. The financial and logistical barriers to such a research program seem formidable to say the least.
However, if such a cohort ever did emerge, it seems likely, based on the data from other phenotypes, that very large numbers of variants with very small effect sizes (relative risks of 1.1–1.5 are typically reported) would emerge [33]. Additionally, any rare DNA variants found in smaller case-control-like studies would likely show declining penetrance, and thus explain less of the physiology in any larger cohorts [34]. Importantly, the extent to which these variants would be causally or ‘casually’ associated with the physiological phenotype of interest would be uncertain, as would their overall explanatory power. To address these limitations in the studies of common disease risk, so-called polygenic gene scores have been developed [35]. However, the predictive utility of these scores is questionable for many complex phenotypes (e.g. obesity, diabetes, hypertension), and the overall genetic contribution to the phenotype of interest is much less than environmental and behavioral influences [36].
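To illustrate how many small-effect variants are typically combined, the following Python sketch computes a simple additive polygenic score as the sum of risk-allele counts weighted by the natural logarithm of each variant's relative risk. This is a generic textbook form, not the specific scoring method of the cited work; it assumes the variants act independently and multiplicatively, and the relative risks and genotypes shown are hypothetical values drawn from the 1.1–1.5 range mentioned above.

```python
import math


def polygenic_score(relative_risks, risk_allele_counts):
    """Additive score: sum of ln(relative risk) * risk-allele count.

    relative_risks      -- per-allele relative risk for each variant
    risk_allele_counts  -- copies of the risk allele carried (0, 1, or 2)
    Assumes independent, multiplicative effects (a simplification).
    """
    if len(relative_risks) != len(risk_allele_counts):
        raise ValueError("inputs must have the same length")
    score = 0.0
    for rr, count in zip(relative_risks, risk_allele_counts):
        if rr <= 0:
            raise ValueError("relative risks must be positive")
        if count not in (0, 1, 2):
            raise ValueError("allele counts must be 0, 1, or 2")
        score += math.log(rr) * count
    return score


# Hypothetical variants and genotype, for illustration only.
rrs = [1.1, 1.2, 1.5]
genotype = [2, 1, 0]
score = polygenic_score(rrs, genotype)
# exp(score) is the combined multiplicative factor relative to a person
# carrying none of the risk alleles, under the stated assumptions.
print(f"score: {score:.3f}, combined factor: {math.exp(score):.3f}")
```

In this example the combined factor is about 1.45, which shows why even several small-effect variants together may shift an individual's predicted value only modestly.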
A final cautionary note is that for many complex human phenotypes, genetic association studies can have reproducibility issues, and also require diverse ethnic cohorts. The classic example of the reproducibility problem comes from studies of depression where a recent report found essentially no significant and reproducible genetic associations for depression [37].
4 Conclusions
The above discussion of the oxygen transport cascade shows that while there is evidence, based on family and twin studies, for a genetic component of \(\dot{V}{\text{O}}_{2\hbox{max} }\) and its trainability, it has been difficult to reconcile these observations with any specific large effect size gene variants or combinations of small effect size variants linked to key physiological pathways as a whole. Similar comments can be made about peripheral adaptations in skeletal muscle, and the determinants of efficiency are almost certainly complicated by biomechanical and skill-related factors as much as they are by genetic components. For considerations such as body size, similar observations can be made, and even in the case of ACTN3 variants associated with sprinting or power performance, the effect sizes are tiny and there are examples of elites with the ‘wrong’ genotype [38, 39]. Additionally, in some sports such as swimming, the ACTN3 genotype does not clearly segregate in sprinters versus endurance athletes [40].
The obvious question is why? One emerging concept is that there are many potential genetic pathways to a given phenotype [41]. This concept is consistent with ideas that biological redundancy underpins complex multiscale physiological responses and adaptations in humans [42]. From an applied perspective, the ideas discussed in this review suggest that talent identification on the basis of DNA testing is likely to be of limited value, and that field testing, which is essentially a higher order ‘bioassay’, is likely to remain a key element of talent identification in both the near and foreseeable future [43]. While it is possible that more explanatory DNA-based associations for complex exercise-related traits might emerge if detailed physiological phenotyping of large cohorts of humans is performed, there are many limitations to this perspective. In this context, the advocates of ever-bigger Ns should carefully review the limits of this approach from studies of other complex phenotypes as they make the case for a ‘more is better’ approach to future studies.
Acknowledgements
This supplement is supported by the Gatorade Sports Science Institute (GSSI). The supplement was guest edited by Lawrence L. Spriet, who attended a meeting of the GSSI Expert Panel in March 2019 and received honoraria from the GSSI, a division of PepsiCo, Inc., for his participation in the meeting. Dr. Spriet received no honorarium for guest editing the supplement. Dr. Spriet suggested peer reviewers for each paper, which were sent to the Sports Medicine Editor-in-Chief for approval, prior to any reviewers being approached. Dr. Spriet provided comments on each paper and made an editorial decision based on comments from the peer reviewers and the Editor-in-Chief. Where decisions were uncertain, Dr. Spriet consulted with the Editor-in-Chief.