Main findings
The prevalence of GDM in the Growing Up in New Zealand study varied significantly between data sources. Using data from all sources, GDM prevalence was 6.2%. When the analysis was restricted to medical data only, GDM prevalence was 5.4%. The prevalence of GDM found in the Growing Up in New Zealand study cohort was 68% greater than the prevalence from the National Minimum Dataset for the same geographical area during the same time period.
Where data from the Growing Up in New Zealand cohort were available from multiple sources, data were conflicting for 3.6% of women and levels of agreement for a diagnosis of GDM were poor. Self-reported data were also discrepant with medical data: a third of women with a diagnosis of GDM according to medical data reported no diagnosis of diabetes in self-reported data.
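The 68% figure can be checked approximately from the rounded prevalences quoted in this paper, assuming the comparison is between the 6.2% all-sources cohort prevalence and the 3.7% National Minimum Dataset prevalence. Because both inputs are rounded, the result is only indicative:

```latex
\[
\frac{6.2 - 3.7}{3.7} = \frac{2.5}{3.7} \approx 0.68
\]
```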
Interpretation
Diagnosis of GDM according to medical records is frequently considered to be a gold-standard data source for estimating the prevalence of GDM in a population [33–35]; however, review of medical records is labour-intensive and expensive, and access to records is restricted. Population health datasets are frequently used to determine disease prevalence and are derived from coding of medical diagnoses present in clinical records [31], but their accuracy has been questioned [21, 35]. Self-reported data have been suggested to be an accurate alternative data source for estimating the prevalence of GDM [33, 36, 37]. However, the substantial differences in GDM prevalence seen according to different data sources in the Growing Up in New Zealand study and between the Growing Up in New Zealand cohort and the National Minimum Dataset highlight significant deficiencies in using just one data source to determine GDM prevalence. Where data were available from multiple sources, data were conflicting for 3.6% of women and levels of agreement between data sources for presence of GDM were poor.
Other studies evaluating the prevalence of GDM in routinely collected population health datasets have shown similar findings [34, 35, 38, 39]. Zheng, Morris and Moses [38] determined the prevalence of GDM in a private hospital according to the hospital’s records and laboratory results and compared this to the New South Wales Perinatal Data collection. Much like the findings in our study, there were discrepancies in GDM prevalence according to different data sources and both hospital records and the Perinatal Data collection underestimated the prevalence of GDM. For women who were missing a diagnosis of GDM in the Perinatal Data collection, about half had a diagnosis of GDM documented in the medical records and half did not have the diagnosis documented in their medical notes [38]. Bell et al. [34] extracted information on maternal diabetes status from the medical records of a random sample of 1200 women giving birth in New South Wales, Australia and compared this to two New South Wales Department of Health routinely collected datasets. Both datasets underestimated the prevalence of GDM when compared to medical records. Given the findings of Zheng, Morris and Moses [38], where half of the GDM cases missing from the Perinatal Data collection were also not documented in the medical notes, the discrepancy between the prevalence of GDM recorded in the datasets and the true prevalence of GDM could in fact be even greater.
Other studies have suggested self-reported data provide an accurate estimate of GDM prevalence [33, 36, 37]. Gresham et al. [37] investigated the agreement between self-reported perinatal outcomes, collected through repeated surveys, and medical records in the Australian Longitudinal Study on Women’s Health. When women were asked specifically about each of their pregnancies, there was an agreement of 97.8%, Kappa 0.66 (P < 0.001) between self-reports and medical records for GDM [37]. Similarly, in the New York State Pregnancy Risk Assessment Monitoring System (PRAMS) study, Hosler, Nayak and Radigan [33] examined agreement between participating women’s self-report and maternal GDM documented on their children’s birth certificates and found percent agreement to be 93.8% with a Kappa statistic of 0.53. Despite these seemingly high levels of agreement, the Kappa statistic used in these studies tests the correlation between the two reports of GDM, but does not test their level of agreement [32]. Using the data provided by Gresham et al. [37], the proportions of agreement between self-reported data and medical records can be calculated to be 0.51 (95% CI 0.47, 0.55) for the presence of GDM and 0.98 (95% CI 0.97, 0.98) for the absence of GDM, very similar to our findings. These data also show that 2.2% of women misreported their GDM status according to medical records in the study by Gresham et al. [37], comparable to the 2.7% found in our study, and 6.2% of women misreported their GDM status in the study by Hosler, Nayak and Radigan [33]. These results question the validity of using self-report as the only data source for estimating GDM prevalence. More importantly, misinterpretation of the diagnosis by any number of women is likely to have unfavourable consequences. Appropriate treatment of GDM, even in mild cases, has been shown to reduce the risk of adverse pregnancy outcomes [40]. Our finding that a third of women with a diagnosis of GDM according to medical data did not report having any form of diabetes when asked in interview-administered questionnaires raises the question of whether these women received or adhered to treatment for GDM, and warrants further investigation. At the post-partum time point, a greater proportion of women reported having GDM and fewer misreported their diagnosis relative to medical data than at the antenatal time point; this could be due to women being diagnosed with GDM after the antenatal questionnaire, but could also be due to the difference in interview technique used.
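To illustrate why high overall agreement can coexist with poor agreement on the presence of GDM, the sketch below computes overall agreement, Cohen's Kappa, and the proportions of positive and negative agreement from a 2×2 table. It assumes one common definition of specific agreement (2a / (2a + b + c) for presence, 2d / (2d + b + c) for absence), which may differ from the formula used in the studies cited. The counts are hypothetical and are not taken from Gresham et al., Hosler et al. or our cohort, and the names `TwoByTwo`, `positive_agreement` and so on are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating two binary reports of GDM."""
    both_yes: int      # self-report yes, medical records yes
    self_only: int     # self-report yes, medical records no
    records_only: int  # self-report no, medical records yes
    both_no: int       # self-report no, medical records no

    @property
    def n(self) -> int:
        return self.both_yes + self.self_only + self.records_only + self.both_no


def overall_agreement(t: TwoByTwo) -> float:
    """Share of women on whom the two sources agree (either way)."""
    return (t.both_yes + t.both_no) / t.n


def cohen_kappa(t: TwoByTwo) -> float:
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    observed = overall_agreement(t)
    chance_yes = ((t.both_yes + t.self_only) / t.n) * ((t.both_yes + t.records_only) / t.n)
    chance_no = ((t.records_only + t.both_no) / t.n) * ((t.self_only + t.both_no) / t.n)
    expected = chance_yes + chance_no
    return (observed - expected) / (1 - expected)


def positive_agreement(t: TwoByTwo) -> float:
    """Agreement on the presence of GDM (assumed definition, see lead-in)."""
    return 2 * t.both_yes / (2 * t.both_yes + t.self_only + t.records_only)


def negative_agreement(t: TwoByTwo) -> float:
    """Agreement on the absence of GDM (assumed definition, see lead-in)."""
    return 2 * t.both_no / (2 * t.both_no + t.self_only + t.records_only)


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the pattern; not study data.
    table = TwoByTwo(both_yes=40, self_only=30, records_only=30, both_no=900)
    print(f"overall agreement:  {overall_agreement(table):.3f}")
    print(f"Cohen's Kappa:      {cohen_kappa(table):.3f}")
    print(f"positive agreement: {positive_agreement(table):.3f}")
    print(f"negative agreement: {negative_agreement(table):.3f}")
```

With these made-up counts, overall agreement is 0.94 while agreement on the presence of GDM is only about 0.57, because the large number of concordant negatives dominates the overall figure when the condition is uncommon.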
Researchers, healthcare organisations, policy makers and funders rely on prevalence statistics for service planning, policy development and funding allocation. The findings in our study and others’ [34, 38, 39] indicate that commonly used prevalence statistics are likely underestimating the true prevalence of GDM. By using multiple data sources to determine GDM prevalence, we were less likely to miss any diagnoses of GDM and could therefore give a more accurate estimate of GDM prevalence.
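The effect of pooling can be sketched as follows, assuming a simple "any source" rule in which a woman counts as a case if at least one data source records GDM, and assuming every source is measured against the same cohort denominator. The participant identifiers, source names and cohort size are invented for illustration, and the rule does not address false positives within a source.

```python
def prevalence(cases: set[str], cohort_size: int) -> float:
    """Prevalence as the share of the cohort identified as cases."""
    return len(cases) / cohort_size


def pooled_cases(sources: dict[str, set[str]]) -> set[str]:
    """Union of cases across sources: a case in any source counts once."""
    return set().union(*sources.values())


if __name__ == "__main__":
    # Hypothetical identifiers and cohort size; not study data.
    cohort_size = 1000
    sources = {
        "laboratory": {"w001", "w002", "w003"},
        "hospital_coding": {"w002", "w004"},
        "self_report": {"w003", "w005", "w006"},
    }
    for name, cases in sources.items():
        print(f"{name}: {prevalence(cases, cohort_size):.3%}")
    print(f"pooled: {prevalence(pooled_cases(sources), cohort_size):.3%}")
```

Under this rule the pooled estimate can never be lower than the estimate from any single source, which is consistent with single-source statistics tending to underestimate prevalence when each source misses some diagnoses.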
Strengths and limitations
To our knowledge this is the first study evaluating the proportions of agreement between different data sources for the presence and absence of GDM in a population. Although effort was made to have a consistent approach to data collection, not all DHBs provided the same type of information when diabetes coding status was requested using NHI linking. CMDHB provided data on diabetes coding based on ICD-10 codes from their hospital database, while ADHB provided data extracted from their maternity database. WDHB matched NHIs to their diabetes clinic database and therefore only provided information on women who were registered with the diabetes clinic. This resulted in a significant amount of missing data from ADHB and WDHB. Furthermore, while all three DHBs used a 75 g OGTT with the same fasting and 2-h plasma glucose thresholds for diagnosis as their formal diagnostic test, CMDHB additionally used a 50 g screening test for which a plasma glucose concentration at 60 min of ≥11.1 mmol/L was considered diagnostic of GDM [21]; thus, the diagnosis of GDM was not made consistently across the cohort. The nature of the different data sources also gives different denominators when calculating prevalence. For example, the laboratory data include only those women who were screened for GDM, whilst the Ministry of Health National Minimum Dataset includes all women who delivered at a New Zealand hospital. In addition, although the self-reported data included data collected at more than one time point, the wording used in the interview-administered questionnaires did not specifically ask about GDM per se and could be open to interpretation and to misclassification in coding. The participants’ understanding of these questions could also be influenced by factors such as level of education, the care they received during pregnancy and pregnancy outcome, and this may have affected their responses. While these differences may limit the robustness of the data, a major strength of our study is that by pooling results from multiple data sources, we were able to overcome the deficiencies of the different data types to give a more accurate estimate of GDM prevalence. An additional strength is that the prevalence of GDM calculated from NHI-linked data from the Ministry of Health, 3.8%, was almost identical to the 3.7% prevalence found in the National Minimum Dataset for the same geographical area. This suggests that the cohort of women in the Growing Up in New Zealand study was broadly representative, at least with respect to risk factors for GDM, of all women giving birth in the catchment area at the time. We acknowledge that the data used to determine prevalence of GDM in this cohort were collected 10 years ago and may not reflect current GDM prevalence. However, to date this is the largest study to estimate GDM prevalence in New Zealand; it provides a reference for future research and raises important points to consider when utilising or collecting prevalence statistics.