Background
The definition of gestational diabetes mellitus (GDM) as any degree of glucose intolerance with onset or first recognition during pregnancy is largely accepted. However, the precise level of glucose intolerance characterizing gestational diabetes has been controversial over the last three decades.
In 1979-1980, the U.S. National Diabetes Data Group (NDDG) [1] and the World Health Organization (WHO) [2] established that the 2 h 75 g oral glucose tolerance test (OGTT) should be the main diagnostic test for glucose intolerance outside of pregnancy.
Regarding glucose intolerance during pregnancy, two different approaches were taken. The NDDG opted to maintain, in pregnancy, the 3 h 100 g OGTT, which was widely used and evaluated in the USA. Over the years, the American Diabetes Association (ADA) and many other medical associations around the world adopted this 3 h 100 g OGTT. In so doing, they chose different cutoffs for the diagnosis of GDM, one of the issues being the difficulty of converting blood glucose values from the original studies done in the 1960s and 1970s [1,3-5] to their plasma equivalents analyzed using new analytic methods.
The WHO adopted the 2 h 75 g OGTT in pregnancy, recommending the same diagnostic cut points established for the diagnosis of impaired glucose tolerance outside of pregnancy [2,3]. In 1999, the WHO clarified that GDM encompassed impaired glucose tolerance and diabetes (fasting ≥ 7 mmol/l or ≥ 126 mg/dl; 2 h plasma glucose ≥ 7.8 mmol/l or ≥ 140 mg/dl) [6] and has maintained its recommendations over the years.
More recently, the International Association of Diabetes and Pregnancy Study Groups (IADPSG), after extensive analyses of the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) study [7], recommended new diagnostic criteria for GDM [8] based on the 2 h 75 g OGTT: a fasting glucose ≥ 5.1 mmol/l (92 mg/dl), or a one hour result of ≥ 10.0 mmol/l (180 mg/dl), or a two hour result of ≥ 8.5 mmol/l (153 mg/dl).
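The two sets of cut points quoted above can be written as simple decision rules. The following is only an illustrative sketch: the function names are hypothetical, inputs are assumed to be plasma glucose values in mmol/l from a 2 h 75 g OGTT, and the thresholds are exactly those cited in the text.

```python
def meets_who_1999(fasting: float, two_hour: float) -> bool:
    """WHO 1999 rule as quoted in the text (values in mmol/l).

    Positive if fasting >= 7.0 or 2 h >= 7.8.
    """
    return fasting >= 7.0 or two_hour >= 7.8


def meets_iadpsg(fasting: float, one_hour: float, two_hour: float) -> bool:
    """IADPSG rule as quoted in the text (values in mmol/l).

    Positive if any one of the three measurements reaches its threshold.
    """
    return fasting >= 5.1 or one_hour >= 10.0 or two_hour >= 8.5


# Hypothetical OGTT results, chosen only to show that the rules can disagree.
print(meets_iadpsg(5.3, 9.0, 7.0), meets_who_1999(5.3, 7.0))  # fasting-only elevation
print(meets_iadpsg(4.8, 9.0, 8.0), meets_who_1999(4.8, 8.0))  # 2 h value between 7.8 and 8.5
```

The two hypothetical results show how the criteria can classify the same woman differently: a mildly raised fasting value is positive only by the IADPSG rule, whereas a 2 h value between 7.8 and 8.5 mmol/l is positive only by the WHO rule.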
A considerable number of prospective studies have now investigated the use of a 2 h 75 g OGTT in pregnancy in relation to various pregnancy outcomes, allowing evaluation of these two main diagnostic criteria. The purpose of this study is therefore to summarize, through a systematic review, the association of GDM, as diagnosed by the WHO and the IADPSG criteria, with adverse pregnancy outcomes in untreated women. In so doing, we also evaluate the applicability of the IADPSG criteria to non-HAPO settings.
Discussion
This is the first systematic review to assess the magnitude of the associations between different GDM diagnostic criteria and several clinically relevant outcomes. We focused our analyses on the two main diagnostic criteria currently under debate for a 75 g OGTT, i.e., those recommended by the WHO and those recently proposed by the IADPSG on the basis of pregnancy outcomes. In addition to providing estimates for the magnitude of the increased risk predicted by these two criteria, we also evaluated the application of the IADPSG criteria to settings other than that of the HAPO study.
Our summary estimates of relative risk demonstrate that GDM diagnosed by either the WHO or the IADPSG criteria predicts adverse perinatal and maternal outcomes. The strength of the crude associations found ranged from 1.23 (95% CI 1.01-1.51) for cesarean delivery to 1.81 (95% CI 1.47-2.22) for macrosomia. For the three outcomes for which meta-analyses were possible for both criteria (large for gestational age, preeclampsia and cesarean delivery), the magnitude of the effects was similar for the WHO and the IADPSG criteria (1.53 vs. 1.73; 1.69 vs. 1.71; 1.37 vs. 1.23, respectively), although the inconsistency across studies limited aggregate estimation for the IADPSG criteria. Sensitivity analyses excluding either the HAPO or the EBDG study did not materially change the magnitude of these associations (changes varying between 1 and 13%).
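To make the pooling and the exclusion-based sensitivity analyses concrete, the sketch below pools log relative risks and then repeats the pooling with each study left out in turn. It assumes a DerSimonian-Laird random-effects model, which is one common choice and is not stated in this section; the study labels, relative risks and standard errors are invented purely for illustration and are not data from the review.

```python
import math


def pool_random_effects(log_rrs, ses):
    """DerSimonian-Laird pooling of log relative risks.

    Returns (pooled RR, (lower, upper) 95% CI, I-squared in percent).
    """
    weights = [1.0 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_rrs)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_rrs))
    df = len(log_rrs) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(re_weights, log_rrs)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    ci = (math.exp(pooled - 1.96 * se_pooled), math.exp(pooled + 1.96 * se_pooled))
    return math.exp(pooled), ci, i2


def leave_one_out(names, log_rrs, ses):
    """Pooled RR obtained after excluding each study in turn."""
    results = {}
    for i, name in enumerate(names):
        kept_y = log_rrs[:i] + log_rrs[i + 1:]
        kept_se = ses[:i] + ses[i + 1:]
        results[name] = pool_random_effects(kept_y, kept_se)[0]
    return results


# Hypothetical studies (NOT taken from the review).
names = ["Study A", "Study B", "Study C"]
log_rrs = [math.log(1.4), math.log(1.9), math.log(1.6)]
ses = [0.10, 0.08, 0.15]

rr, ci, i2 = pool_random_effects(log_rrs, ses)
print(f"pooled RR {rr:.2f} (95% CI {ci[0]:.2f}-{ci[1]:.2f}), I2 {i2:.0f}%")
for name, rr_without in leave_one_out(names, log_rrs, ses).items():
    print(f"without {name}: pooled RR {rr_without:.2f}")
```

Comparing each leave-one-out estimate with the overall pooled value gives the kind of percentage change reported for the exclusion of the HAPO or the EBDG study.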
It is important to note that these crude associations are very small within a diagnostic context. Two reasons may explain the small associations found. First, both GDM criteria, especially the IADPSG criteria, identify lesser degrees of hyperglycemia than other criteria, such as those previously recommended by the ADA [20]. Second, as all the studies analyzed in this review excluded women receiving specific treatments for GDM (see Table 1), the range of hyperglycemia classified as GDM represents a mild degree of hyperglycemia. Given the continuum of risk in the association between plasma glucose and pregnancy outcomes [7], if both criteria were applied to a broader spectrum, such as the one seen in the usual clinical setting, which includes women at greater risk given their higher glucose levels, the association should be stronger. Nevertheless, even if GDM diagnostic criteria were to reach relative risks close to 3 for these adverse outcomes in such settings, the relative risks would still be unlikely to reflect major diagnostic discrimination in terms of post-test probabilities [21]. This fact suggests the importance of investigating the contribution to risk discrimination of other factors, besides glycaemia, for these outcomes.
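The limited discrimination implied by a relative risk of this size can be illustrated with a short calculation. The sketch below is a simplified illustration, not the method of reference 21: it assumes a single binary test, and the overall outcome risk (10%) and the fraction of women testing positive (15%) are hypothetical values chosen only for the example.

```python
def post_test_risks(overall_risk: float, positive_fraction: float, relative_risk: float):
    """Outcome risk among test-positive and test-negative women.

    Uses overall_risk = p * RR * r0 + (1 - p) * r0, where r0 is the risk
    in test-negative women and p is the fraction testing positive.
    """
    risk_negative = overall_risk / (
        positive_fraction * relative_risk + (1.0 - positive_fraction)
    )
    risk_positive = relative_risk * risk_negative
    return risk_positive, risk_negative


# Hypothetical inputs: 10% overall outcome risk, 15% of women test positive.
for rr in (1.5, 3.0):
    pos, neg = post_test_risks(0.10, 0.15, rr)
    print(f"RR {rr}: risk if positive {pos:.1%}, risk if negative {neg:.1%}")
```

Under these assumed inputs, even a relative risk of 3 moves the probability of the outcome from 10% to about 23% after a positive test and to about 8% after a negative one, so most women with a positive test would still not experience the outcome.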
It is also important to interpret the heterogeneity found across studies, which was most evident for the IADPSG criteria. Potential reasons for heterogeneity include differences in population characteristics, study design and the nature of the diagnostic criteria. As sensitivity analyses examining the influence of the EBDG and the HAPO studies did reveal some changes in the heterogeneity found, the particularities of each of these study settings need to be considered. The HAPO study is a large multi-country study conducted from 2000 to 2006 with a strict research protocol. The EBDG study is a multicenter study conducted in Brazil in the 1990s with a less strict protocol, in a scenario of less intervention, following women with a wider range of hyperglycemia. A stricter protocol, with more control over incomplete fasting, such as that seen in the HAPO study, could produce larger associations with the IADPSG criteria, which diagnose an appreciable fraction of cases on the basis of the fasting value. In fact, the application of the IADPSG criteria in two published studies [13,22] and in the EBDG database showed that the fasting value identified over 70% of all cases of GDM so defined, whereas when these criteria were applied to the HAPO study as a whole, the fasting value identified only about 50% of cases. However, as this rate in HAPO varied from 24% (Thailand) to 74% (Barbados) [23], whether these differences resulted from incomplete fasting or from other particularities of the studies or populations cannot be concluded from current information. The lack of blinding to glucose levels in most studies (except HAPO) could lead to GDM treatment and thus reduce the magnitude of the associations; we therefore excluded treated women. Although undetected intervention, for example diet, may still be present even after these exclusions, it is unlikely that this would cause more heterogeneity in the IADPSG than in the WHO analyses.
One hypothesis is that the IADPSG criteria are more vulnerable to heterogeneity across different settings because they allow the diagnosis to be made on the basis of only one out of three possible measures (fasting, 1 h and 2 h). Given population variability in the probability of being positive by fasting and post-load values, as well as in the possibility of incomplete fasting (for example, having drunk coffee or tea with sugar), more heterogeneity could be found for the IADPSG criteria. Another possibility, worth exploring in future studies, is that the heterogeneity stems from differences in the prevalence or characteristics of obesity in the underlying populations.
Additionally, since the IADPSG criteria were derived from the HAPO study, lower performance of these criteria in non-HAPO settings is to be expected. For large for gestational age and for cesarean delivery, results remained inconsistent across studies after excluding HAPO, which calls into question the pooled RR estimates generated for these outcomes (the pooled RRs found were lower and not statistically significant). For preeclampsia, results across studies became consistent, but with an RR (1.54; 95% CI 1.32-1.79) smaller (p = 0.006) than that found for the HAPO study (2.02; 95% CI 1.78-2.29).
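A comparison of two relative risks of this kind can be approximated from the published confidence intervals alone. The sketch below is one standard way of doing so, a z-test on the difference of log relative risks, and is not necessarily the test used to obtain the p-value reported above. It assumes that the two estimates are independent and that the intervals are symmetric on the log scale; the function names are hypothetical.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(RR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def compare_rrs(rr1, ci1, rr2, ci2):
    """Two-sided z-test for the difference between two independent log RRs."""
    diff = math.log(rr2) - math.log(rr1)
    se_diff = math.sqrt(se_from_ci(*ci1) ** 2 + se_from_ci(*ci2) ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Preeclampsia estimates quoted in the text: non-HAPO pooled RR vs. HAPO RR.
z, p = compare_rrs(1.54, (1.32, 1.79), 2.02, (1.78, 2.29))
print(f"z = {z:.2f}, p = {p:.3f}")
```

With the rounded values quoted in the text, this approximation gives a p-value of roughly 0.007, of the same order as the reported 0.006; a small discrepancy is expected when working from rounded estimates.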
Our study has some limitations. First, few studies were available to evaluate important outcomes such as perinatal mortality and long-term outcomes in offspring. Yet, positive associations were found for macrosomia and pregnancy-related hypertension, two clinically relevant outcomes. Second, as we excluded studies conducted with selective screening and studies not allowing analysis of untreated women, we eliminated several otherwise good studies which were included in other reviews on GDM screening [24]. Third, publication bias could not be excluded because of the small number of studies examined.
Our study also has several strong points, including its originality, extensive search strategy, inclusion of studies independent of language, strict methodological rigor, assessment of study quality, and sensitivity and subgroup analyses to investigate the applicability of the IADPSG criteria in settings other than the HAPO study.
Competing interests
All authors have completed the Unified Competing Interest form, declaring the absence of financial interests that may be relevant to the submitted work.
Authors' contributions
MIS participated in all the aspects of the project and was the overall supervisor. Additional participation was as follows: Writing the protocol: EMW and MRT; developing the search strategy: EMW; searching and selecting trials: EMW, MRT, MAC, MAD, JT; data extraction: JT, EMW, MRT; data analysis: MF; drafting and final review: All. All authors read and approved the final manuscript.