One-half of all child deaths (<5 years) worldwide are in sub-Saharan Africa (sSA)1 and one-third of these deaths occur in the neonatal period, from infection, preterm birth and neonatal encephalopathy1. Stillbirths probably equal neonatal deaths in number, and infections are a major contributor2. Streptococcus agalactiae (group B streptococcus, GBS) causes neonatal early and late onset disease (EOD, LOD), stillbirth3 and possibly contributes to preterm birth4 and neonatal encephalopathy5, as a consequence of ascending maternal genito-urinary colonization (for definitions see Supplementary Table 1). GBS emerged as the leading cause of EOD in the USA in the 1960s6 and subsequently in Europe, but in sSA there are outstanding major questions as to whether GBS commonly colonizes pregnant women, causes stillbirth, or is an important cause of neonatal disease. Establishing this is essential to informing potential preventive interventions. In resource-rich countries, reductions in EOD have followed the introduction of maternal microbiological or risk factor screening with intra-partum antibiotic prophylaxis (IAP)7. However, there is uncertainty as to the feasibility of this approach in resource-poor settings, and there is no evidence of the effectiveness of IAP in preventing GBS-associated stillbirth, or LOD. Antisepsis at delivery has been shown to be ineffective8. However, maternal vaccination may provide a feasible strategy to reduce GBS disease in resource-poor countries. A trivalent conjugate vaccine (serotypes Ia/Ib/III) has completed phase 2 clinical trials9, and a pentavalent vaccine is in development10.

Understanding which women are most likely to be GBS colonized could provide insight into both the emergence of GBS and variation in the reported prevalence of maternal GBS colonization: Europe/USA 5–40% (refs 11,12); Africa 9–47% (Supplementary Table 2). Reported maternal risk factors for colonization are conflicting, with increased maternal GBS colonization reported in both younger13 and older14 age groups, African-American mothers13,15 and those with higher education14,16, higher income16, high sexual activity14 and obesity15,16. Data from sSA are limited, but are also conflicting for potentially important risk factors such as HIV infection. In South Africa, maternal GBS colonization was lower in HIV-infected mothers17, but in Malawi, only amongst HIV-infected mothers with lower CD4 counts18. In the USA15 and Zimbabwe19, no association with HIV was found. The limited data from studies in Kenya, Zimbabwe, Malawi and South Africa on colonizing maternal serotypes in sSA suggest serotype III is the most common (serotypes Ia/Ib/II/IV/V are also reported)18,20–22.

For neonatal disease, data outside the USA and Europe are sparse23. In sSA, facility-based studies generally report a high incidence of neonatal GBS disease, but population-based and outpatient studies have reported much lower incidences24,25, including what was described as a ‘striking absence’ of invasive neonatal GBS disease in large outpatient-based studies24. However, regional estimates, which included only four studies from Africa (one of which is our study site in Kilifi County)8,26–28, suggest that Africa may have the highest regional burden of neonatal GBS disease at 1.2 (0.50–1.91)/1,000 live births23. These limited data suggest that serotype III, as described in other regions23, most commonly causes disease, with rates for EOD and LOD of 52 and 72% in Malawi and 49 and 76% in South Africa; serotypes Ia/Ib/II/V are also reported27,29. The incidence of GBS-associated stillbirth is unknown in sSA3, with data from only two studies (one found no GBS-associated stillbirth30, and the other reported GBS-associated stillbirth in 8/66 (12%) stillbirths31).

The population structure of GBS in Europe and the USA can be described by five major clonal complexes: CC1, CC10, CC17, CC19 and CC23 (refs 32,33), with CC17 overrepresented in disease isolates32,34. These five clonal complexes are also found in Africa32. In addition, CC26 is common in some regions, representing 15% of sampled GBS isolates in Dakar and Bangui35. GBS also causes bovine mastitis, which is largely mediated by the bovine-specific CC67, although the five major human clonal complexes can also be found in cattle33,36,37.

In this study, we aimed to comprehensively describe the clinical epidemiology of maternal GBS colonization, neonatal disease and stillbirth in coastal Kenya, with molecular analysis to determine the associated serotypes, sequence types (STs) and phylogeny.

Results

Maternal GBS colonization and adverse perinatal outcomes

During the study, 10,130 pregnant women attended a health facility and we recruited 7,967 (Fig. 1, sample size Supplementary Table 3). Of these, 526/7,967 (6.6%) were from rural sites, 5,470/7,967 (68.7%) were from semi-rural locations and 1,971/7,967 (24.7%) were from an urban site. There were some differences in demographics in those excluded (Supplementary Table 4), with emergency referrals more likely to be excluded, as well as women with incomplete data on age, ethnicity or parity, although overall numbers were small. Transport times to the laboratory were longer from urban and rural sites (median 11 h (range 0–48 h); 11 h (0–52 h)) compared to semi-rural locations (5 h (0–73 h)), but there was no evidence of association between GBS isolation and time to sample processing across all sites (odds ratio (OR) = 1.00 (0.99–1.00), P = 0.6), across rural and urban sites (OR = 0.99 (0.98–1.00)), or at each site individually (Supplementary Fig. 1).

Figure 1: Study design and recruitment of participants by study site.

a, Recruitment timeline and sub-studies undertaken at each study site. b, Recruitment of mothers into the cohort study. *The denominator for live births in the prospective cohort period, used to calculate incidence of early onset disease (EOD) in KCH excluded those who did not deliver or had a stillbirth (leaving 6,598). **These mothers (7,967) were included in the analysis of risk factors for maternal GBS colonization. §These births (7,833) were included in analyses assessing GBS as a risk factor for stillbirth or perinatal death. §§These live births (7,408) were included in analyses assessing GBS as a risk factor for preterm birth, low birthweight or possible serious bacterial infection. c, Recruitment for the vertical transmission study (maternal–neonatal dyads), a subset of mothers who delivered in KCH. d, Recruitment for stillbirth nested case–control study including mothers who delivered in KCH and had a stillbirth, and controls.

Overall, 934 (11.7% (11.0–12.5%)) women were GBS-colonized at delivery. Prevalence was lowest at rural sites (47/526, 8.9% (6.6–11.7%)), intermediate at semi-rural sites (608/5,470, 11.1% (10.2–12.0%)) and highest at the urban site (279/1,971, 14.2% (12.6–15.8%); trend P < 0.001). However, after adjustment for other risk factors (including maternal age, socio-economic status and ethnicity; univariable analyses, Supplementary Table 5), the odds of isolating GBS at the urban site (OR = 0.95 (0.92–0.98)) and rural site (OR = 0.91 (0.88–0.94)) were lower than at the semi-rural site (P < 0.001; Table 1).
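The prevalence estimates above can be recomputed from the reported counts. The following is a minimal sketch that assumes a Wilson score interval; this section does not state which interval method was used, so the final digit may differ slightly from the reported intervals. The function name `wilson_interval` is illustrative only.

```python
import math

def wilson_interval(events, total, z=1.959964):
    """Wilson score interval for a binomial proportion (about 95% by default)."""
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half

# Counts taken from the text: GBS-colonized women / women recruited.
sites = {
    "rural": (47, 526),
    "semi-rural": (608, 5470),
    "urban": (279, 1971),
    "overall": (934, 7967),
}

for name, (colonized, recruited) in sites.items():
    lo, hi = wilson_interval(colonized, recruited)
    pct = 100 * colonized / recruited
    print(f"{name}: {pct:.1f}% ({100 * lo:.1f}-{100 * hi:.1f}%)")
```

These are crude (unadjusted) proportions; they do not reproduce the adjusted odds ratios in Table 1, which come from multivariable models.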

Table 1 Exposures associated with maternal GBS colonization.

GBS colonization was independently associated with maternal age, highest in the middle categories (Supplementary Fig. 2; P = 0.023). It was also associated with parity (≥5 versus 1–4) (OR = 0.81 (0.70–0.93), P < 0.001), as well as Mijikenda ethnicity (indigenous population, OR = 0.73 (0.59–0.90), P = 0.003) (Table 1). GBS colonization was increased in women with higher socio-economic status (OR = 1.21 (1.13–1.29), P < 0.001) and those who had contact with cattle (OR = 1.29 (1.17–1.43), P < 0.001). GBS colonization was reduced among HIV-infected women and especially in HIV-infected women taking co-trimoxazole prophylaxis (OR = 0.68 (0.42–1.09); OR = 0.24 (0.14–0.39), P < 0.001), in less well-nourished mothers (OR = 0.72 (0.60–0.88), P < 0.001) and in women with obstetric emergencies (OR = 0.85 (0.79–0.92), P < 0.001).

There was evidence that adverse perinatal outcomes (very preterm delivery, very low birthweight, stillbirth, possible serious bacterial infection; for definitions see Supplementary Table 1) were associated with maternal GBS colonization in multivariable models in the context of interactions with clinical risk factors for invasive GBS disease, such as maternal temperature >37.5 °C, urinary tract infection and prolonged rupture of membranes >18 h (Fig. 2 and Supplementary Tables 6–9). In contrast, without GBS colonization there was no evidence that these clinical factors conferred elevated risk of poor outcomes. There was no evidence of association of maternal GBS colonization with perinatal mortality (P = 0.7; Supplementary Table 10), including testing for an interaction with any risk factor for GBS disease (P = 0.4).

Figure 2: Interaction of risk factors at delivery with maternal GBS colonization associated with adverse newborn outcomes.

Interactions between maternal risk factors at delivery (maternal fever, maternal urinary tract infection, prolonged rupture of membranes) and adverse perinatal outcomes (very preterm birth, very low birthweight, stillbirth, possible serious bacterial infection), in the presence and absence of maternal GBS colonization. Odds ratios are given for maternal exposures and associated perinatal outcome (listed vertically) with 95% confidence intervals illustrated with error bars for the odds ratio in each case. Interactions were included in multivariable models if there was evidence of interaction at the P < 0.1 level in univariable analyses. P values given here are for interaction tests in imputed multivariable models (details for all models are provided in Supplementary Tables 5–9). **Possible serious bacterial infection (pSBI) is defined in Supplementary Table 1; this is a clinical diagnosis used to guide empiric treatment of neonates for possible serious bacterial infections in resource-poor settings.

Of 918/934 (98.3%) available and extracted colonizing isolates, 915/934 (98.0%) were of sufficient quality for genomic analysis. Among colonized mothers, 658/915 (71.9%) GBS isolates were serotypes Ia, Ib or III, with serotype III being most common (350/915 (38.3%)). Clonal complex 17 (CC17) was identified in 267/915 (29.2%) of GBS-colonized women (Figs 3 and 4 and Supplementary Table 11). Of these, 265/267 (99.3%) were serotype III and 2/267 (0.7%) were serotype IV.

Figure 3: GBS types colonizing mothers and causing disease.

a, Invasive neonatal GBS disease cases decrease after the first few days of birth in KCH neonatal admissions (1998–2013) and serotype III causes an increasing proportion of disease. b, The clinical infection syndrome is predominantly sepsis in the first few days after birth in neonates admitted with invasive GBS disease to KCH (1998–2013), with increasing numbers of neonates admitted with meningitis with or without sepsis later in the neonatal period. c, Percentage of different serotypes in GBS isolates from maternal colonization, early onset disease (EOD) and late onset disease (LOD) in neonates shows a stepwise increase in serotype III from maternal colonization to EOD and LOD. d, Percentage of different clonal complexes in GBS isolates from maternal colonization, neonatal sepsis and neonatal meningitis (+/− sepsis) shows the increasing dominance of CC17 in neonatal disease, particularly in neonatal meningitis.

Figure 4: Phylogenetic reconstructions of GBS isolates.

Maximum likelihood phylogenies, with recombinant regions removed, are shown separately for each clonal complex. Background shading indicates ST17 isolates within CC17. Serotypes are illustrated for each clonal complex in the innermost circle. The next circle describes the sample source of the GBS isolate (neonatal invasive or maternal colonizing, by site of recruitment). For maternal colonizing isolates, epidemiological details are presented. From the outermost circle these are maternal HIV status (negative, HIV-infected, HIV infected and taking prophylactic co-trimoxazole), socio-economic status (high, medium, low and very low), ethnicity (Mijikenda or non-Mijikenda) and the presence or absence of cattle contact.

The population structure was broadly similar to that in other parts of the world, with 114/915 (12.5%) CC1, 148/915 (16.2%) CC10, 268/915 (29.3%) CC17, 173/915 (18.9%) CC19, 208/915 (22.7%) CC23, with 4/915 (0.4%) not belonging to any commonly described clonal complex. No bovine-associated CC67 (ref. 38) GBS isolates were identified. Each of the five major clonal complexes was represented at each site (Fig. 4 and Supplementary Table 12), with no evidence for geographic stratification. Within the clonal complexes, there was considerable diversity, with a total of 43 distinct STs, 18 of which were newly identified in this study. The largest number of STs was seen in CC17 (12 STs in total, with 8 newly identified). The most common STs within CC17 were ST17 (183/268, 68.3%) and ST484 (67/268, 25.0%).

Within GBS-colonized women, risk factors for colonization with the most virulent clone, CC17, were, in general, the reverse of those associated with GBS colonization overall (Table 2). Maternal GBS CC17 was increased in rural sites (OR = 1.26 (1.20–1.31), P < 0.001), in women of Mijikenda ethnicity (OR = 1.62 (1.43–1.85), P < 0.001) and in women with HIV infection and women with HIV infection taking co-trimoxazole (OR = 1.46 (1.11–1.92) and OR = 4.30 (0.59–31.3), P < 0.001). Mothers who had contact with cattle (OR = 0.54 (0.45–0.64), P < 0.001) and were better nourished (OR = 0.79 (0.42–1.49), P < 0.001) were less frequently colonized with CC17, but this did not hold for ST17 (Supplementary Table 13). For each of the risk factors, including cattle contact, the corresponding isolates were dispersed in the phylogeny (Fig. 4), suggesting that the associations were not driven by specific sublineages.

Table 2 Exposures associated with maternal GBS colonization with CC17.

Pairwise comparison of all maternal colonizing isolates in mothers delivering at Kilifi County Hospital (KCH) showed increased genetic similarity in a small number of mothers who delivered within seven days of one another, but not according to household location (Supplementary Fig. 3). Of mothers admitted fewer than seven days apart in KCH, there were 14/1,013 (1.4%) pairs from mothers admitted on the same day with 0–4 single nucleotide variant (SNV) differences, 11/1,967 (0.6%) one day apart, 2/1,845 (0.1%) two days apart and 2/1,832 (0.1%) six days apart (P < 0.001). At rural sites, among mothers admitted fewer than seven days apart, there were 2/124 (1.6%) pairs from mothers admitted on the same day with 0–4 SNV differences and 2/219 (0.9%) from those admitted one day apart (P = 0.1). At the urban site, there were 8/987 (0.8%) pairs from mothers admitted on the same day with 0–4 SNV differences and 3/1,555 (0.2%) from those admitted one day apart (P < 0.001)22.

GBS in mother–neonatal pairs (surface contamination)

We recruited 830 mother and baby pairs at KCH (Fig. 1 and Supplementary Table 14). Of these, 104/830 (12.5% (10.4–15.0%)) mothers were colonized with GBS at delivery, and 44/830 (5.3% (3.9–7.1%)) neonates had GBS isolated from ear, umbilicus or nose within 6 h of delivery. A total of 30/44 (68.2%) neonates with surface GBS were born to one of the 104 GBS-colonized mothers, and 14/44 (31.8%) were born to one of the 726 mothers without colonizing GBS detected (of these, 2/14 (14.3%) were born by caesarean section). Odds of neonatal surface GBS were high with maternal GBS colonization (OR = 20.6 (10.5–40.6), P < 0.001).
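The crude odds ratio above follows from the 2×2 table implied by the reported counts (30 of 104 neonates of colonized mothers versus 14 of 726 neonates of non-colonized mothers had surface GBS). The sketch below assumes a Woolf (log-scale Wald) confidence interval, which is not named in this section; the function name `odds_ratio_ci` is illustrative.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.959964):
    """Odds ratio with a Woolf (log-scale Wald) confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Counts from the text: exposure is maternal GBS colonization,
# outcome is GBS on neonatal surface swabs.
colonized_mothers, surface_gbs_if_colonized = 104, 30
other_mothers, surface_gbs_if_not = 726, 14

or_, lo, hi = odds_ratio_ci(
    surface_gbs_if_colonized,
    colonized_mothers - surface_gbs_if_colonized,
    surface_gbs_if_not,
    other_mothers - surface_gbs_if_not,
)
print(f"OR = {or_:.1f} ({lo:.1f}-{hi:.1f})")
```

Under these assumptions the result agrees with the reported estimate to one decimal place.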

Pairwise SNV comparisons between maternal and newborn isolates showed a clear bimodal distribution: 26/30 (86.7%) pairs differed by ≤4 SNVs (all pairs the same ST and serotype), presumably representing vertical transmission, and 4/30 (13.3%) pairs were highly divergent (>9,000 SNVs, with different STs and different serotypes) (Fig. 4). Combining all pairs with ≤4 SNVs, the SNVs were dispersed throughout the genome, with no gene represented more than once. There were 7/44 (15.9%) neonates with surface GBS after delivery by caesarean section (5 of their mothers had GBS detected; 3/5 had 0 SNV differences, 1/5 had 1 SNV and 1/5 had 9,673 SNVs).
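The pairwise comparison can be illustrated with a minimal sketch that counts SNV differences between two aligned sequences and applies the ≤4 SNV threshold used above. The toy sequences, the function names and the choice to skip gaps and ambiguous bases are assumptions made for illustration; they are not a description of the study's genomic pipeline.

```python
VALID_BASES = set("ACGT")

def snv_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap or an ambiguous base
    are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in VALID_BASES and y in VALID_BASES and x != y
    )

def classify_pair(distance, threshold=4):
    """Label a mother-newborn pair using the 0-4 SNV threshold from the text."""
    return "near-identical" if distance <= threshold else "divergent"

# Hypothetical toy alignment, far shorter than a real genome.
mother = "ACGTACGTACGTACGT"
newborn = "ACGTACGTACGTACGA"
d = snv_distance(mother, newborn)
print(d, classify_pair(d))
```

On real data the two modes reported above (≤4 SNVs versus >9,000 SNVs) are far enough apart that the exact threshold matters little for mother–newborn pairs.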

Stillbirth

There were 278 stillbirths during the nested case–control study (278/4,394 (6.3%) of all births). We sampled cord blood in 149/278 (53.6%; 94/149 (63.1%) intra-partum, 55/149 (36.9%) ante-partum stillbirths), of which 104 also had a lung aspirate; a further 34/278 (12.2%) had a lung aspirate sample only. In total, 183/278 (65.8%) stillbirths were sampled, plus 330 live-birth cord blood controls (Fig. 1).

GBS was isolated from 4/183 (2.2% (95% CI 0.6–5.5)) stillbirths (3/149 cord blood samples, 2/138 lung aspirates; one stillbirth had GBS isolated from both), two ante-partum (36 and 39 weeks gestation) and two intra-partum (35 and 39 weeks). The overall minimum incidence of GBS-associated stillbirth (cord blood or lung aspirate) was 0.91 (0.3–2.3)/1,000 births. Compared to live-born controls (GBS isolated from 1/330 (0.3%)), GBS was isolated more frequently from cord blood in stillbirths (OR = 6.8 (0.7–65.5), P = 0.09) and, in a multinomial model, ante-partum stillbirths (OR = 12.4 (1.1–139.3)) and intra-partum stillbirths (OR = 3.5 (0.2–57.1); exact P = 0.055). Serotype data were available from three stillbirths (two were serotype V and one serotype III).
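The minimum incidence is a simple rate: four GBS-associated stillbirths among the 4,394 births recorded during the nested case–control period. The sketch below assumes that denominator and uses Byar's approximation to the Poisson confidence limits; the method behind the reported interval is not stated in this section, and the function name `rate_per_1000` is illustrative.

```python
import math

def rate_per_1000(events, denominator, z=1.959964):
    """Rate per 1,000 with Byar's approximation to the Poisson interval."""
    if events > 0:
        lower = events * (1 - 1 / (9 * events) - z / (3 * math.sqrt(events))) ** 3
    else:
        lower = 0.0
    k = events + 1
    upper = k * (1 - 1 / (9 * k) + z / (3 * math.sqrt(k))) ** 3
    scale = 1000 / denominator
    return events * scale, lower * scale, upper * scale

# Counts from the text: 4 GBS-associated stillbirths, 4,394 births.
rate, lo, hi = rate_per_1000(4, 4394)
print(f"{rate:.2f} ({lo:.2f}-{hi:.2f}) per 1,000 births")
```

Because only 183/278 stillbirths were sampled, any rate computed this way is a lower bound, which is why the text describes it as a minimum incidence.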

There were 2/4 GBS-associated stillbirths born to GBS colonized mothers (2/2 pairs differed by 0 SNVs, all ST1, serotype V). One mother was not colonized, and one was not tested. The risk ratio for GBS-associated stillbirth in GBS-colonized versus non-colonized mothers was 7.6 (1.1–52.6, P = 0.016).

Neonatal disease

Eighty-two neonates with invasive GBS disease were admitted to KCH (1998–2013, Fig. 1), of which 36/82 (43.9%) and 43/82 (52.4%) were associated with EOD and LOD, respectively (three unknown). Case fatality was highest in EOD (17/36 (47.2%)), despite treatment, particularly for those diagnosed <24 h after birth (11/18 (61.1%)). In patients with LOD, 5/43 (11.6%) died. Most neonates with invasive GBS disease (52/82 (63.4%)) were male and 25/82 (30.5%) weighed <2,500 g at admission (Supplementary Table 15). Sepsis without focus was predominant in EOD (33/36 (91.6%)), with meningitis (+/− sepsis) being more common in LOD (21/43 (48.8%)) (Fig. 3). Gestational age was not routinely available from previous clinical surveillance data, but there were five EOD cases with gestations of 36, 36, 37, 37 and 40 weeks born at the time of the prospective cohort study (versus a median of 38 (interquartile range (IQR) 36–40) overall in the prospective cohort).

EOD incidence among deliveries at KCH during the cohort study (2011–2013) was 0.76 (0.25–1.77)/1,000 live births. Including only residents in the Kilifi Health and Demographic Surveillance System (KHDSS) population (1998–2013), the (minimum) population-based incidence of neonatal GBS disease was 0.34 (0.24–0.46)/1,000 live births: EOD in 0.13 (0.07–0.21)/1,000 live births and LOD in 0.21 (0.14–0.31)/1,000 live births, with no evidence of a trend over the study period (Supplementary Fig. 4).

There were 73/82 (89.0%) neonates with available invasive isolates that were extracted, and all were of sufficient quality for inclusion in the final analysis. Serotypes Ia/Ib/III caused 71/73 (97.3%) of EOD and LOD, and serotypes Ia/Ib/II/III caused 72/73 (98.6%) of cases. Serotype III predominated in both EOD (18/30 (60.0%)) and LOD (36/40 (90.0%); P = 0.003, χ² test for trend). These isolates were all CC17, except one CC19 isolate (Fig. 4). Serotype III was the almost universal cause of meningitis (22/23 (95.7%) cases), of which 21/22 (95.4%) were CC17 (Fig. 3 and Supplementary Table 16). Isolates were all susceptible to penicillin and 61/76 (80.3%) were susceptible to co-trimoxazole.

Three of the five neonates with EOD born at KCH (2011–2013) were born to GBS-colonized mothers. Of these three mother–neonate pairs, one differed by 0 SNVs (both ST17, serotype III), one by 88 SNVs (one ST17, one ST484, both serotype III) and one by 1,002 SNVs (both ST17, serotype III). The risk ratio (RR) for EOD for GBS-colonized versus non-colonized mothers was 11.8 (2.0–70.3), P < 0.001. For all perinatal GBS disease (EOD or stillbirth), RR = 13.1 (3.1–54.8, P < 0.001).

Discussion

GBS is an important cause of stillbirth and neonatal disease in Kenya. The incidence of stillbirth was comparable to EOD in hospital births (0.91 (0.25–2.3)/1,000 births and 0.76 (0.25–1.77)/1,000 live births, respectively). These incidences are all underestimates, with samples not taken from all stillbirths and insensitivity in cultures, particularly if intrapartum antibiotics were given. The much lower population-based incidence of EOD (0.13 (0.07–0.21)/1,000 live births) suggests recruitment bias with under-ascertainment of cases in the community, or in outpatient settings, due to rapid case fatality after delivery and limited access to care. This is supported by the higher proportion of LOD, which is the reverse of the ratio of GBS disease typically seen in high-income countries23. Although it could be argued that facility delivery is a risk factor for EOD (if there was in-hospital maternal GBS acquisition), we found very limited evidence of horizontal transmission in facilities, with few genetically near-identical pairs (0–4 SNVs, threshold determined empirically from a newborn surface contamination study) in mothers admitted within seven days of each other.

However, there may be true differences in the incidences of both GBS-associated stillbirth and neonatal GBS disease in sSA, which are neither explained by the study design nor by other methodological limitations. The incidences of neonatal GBS disease recently reported in urban South Africa29 and Malawi23 are high and could be due to differences in maternal GBS colonization prevalence, consistent with our finding of a higher prevalence of maternal GBS colonization in urban compared to semi-rural and rural residents. This association was explained by variables describing an improved socio-economic status and other factors associated with improved health, such as better nutritional status, being in middle age categories and a lower parity, both in the complete-case analyses and using multiple imputation. Although our study includes impoverished populations, the pattern of risk factors identified is consistent with recent studies in high-income countries reporting increased maternal GBS colonization with higher education14,16 and higher income16. The reasons for this are unclear, but are probably related to changes in the maternal microbiome, with different community states reported39.

The use of prophylactic co-trimoxazole among HIV-infected women had a clear negative association with GBS colonization. Previously reported conflicting findings17,18 may depend on the frequency of antimicrobial use (and provision of anti-retroviral therapy). In contrast, neonatal GBS disease is increased with HIV exposure40, with reduced maternal GBS capsular antibody in HIV-1 infection41,42 and/or because, as shown here, the most virulent clone, CC17, is more frequently found in HIV-infected GBS colonized women, compared to other non-CC17 types. A number of virulence factors (adhesins, invasins and immune evasins) have been associated with an increased ability of GBS to colonize and cause disease43, with the more homogeneous CC17 having acquired its own set of virulence genes38 and increased ability to form biofilms in acidic conditions44.

We observed an association between cattle contact and maternal GBS colonization, but no bovine-associated CC67 isolates were identified, and the isolates from women with cattle contact were from a variety of lineages representing all major CCs. Little is known about bovine GBS populations in Kenya, and it is possible that the human and bovine populations are similar, thus leading to the association between cattle contact and maternal GBS colonization from genuine transmission, as suggested elsewhere45. Alternatively, women who look after cattle may be of a higher socio-economic status, and thus the association could be due to residual confounding.

The overall GBS population structure outlined here is similar to that found in previous studies from a variety of geographic locations, supporting the notion of recent global dissemination of relatively few clones32. Within this study, we found no evidence for geographic clustering of related isolates, both at the level of sampling location (Fig. 4) and distance between households (Supplementary Fig. 3), further suggesting the rapid geographic dispersal of GBS. However, in contrast to a previous study from Africa35, we found no CC26 isolates, suggesting that this lineage may be geographically restricted. Furthermore, we found a large number of ST484 isolates, 67/915 (7.3%) of the total and 67/268 (25.0%) of CC17. This lineage has previously been reported in only a single study, also from Kenya46. We also identified three novel STs that represent single-locus variants of ST484. Taken together, it is possible that ST484 originated in or near Kenya, with relatively little geographic dispersal. Alternatively, there may be a lack of GBS sampling in other locations where ST484 is present.

Prevention strategies in resource-rich settings focus on reducing EOD through IAP using either microbiological or risk-factor screening to identify at-risk mothers7. Both strategies would be challenging in resource-poor settings. Of interest, when comparing these strategies, however, is the fact that associations with adverse perinatal outcomes were only detected through interactions between maternal GBS colonization and clinical risk factors. This supports a mechanism of action whereby colonizing maternal GBS ascends, leading to chorioamnionitis (intra-amniotic infection) and fever in a small proportion of women, leading to poor perinatal outcomes. Neither maternal GBS colonization without signs of infection nor maternal fever without GBS colonization increased the risk of adverse perinatal outcomes. Thus, either approach (microbiological or risk-factor screening) will target far larger numbers than those actually at risk. Any direct association between maternal GBS colonization and adverse outcomes may also be diluted by the many other causes of adverse perinatal outcomes and by misclassification (for example, uncertainty over the date of the last menstrual period to determine gestation), which may explain some of the conflicts in findings of studies assessing the contribution of GBS to preterm birth4.

We demonstrated the vertical transmission of maternal GBS colonization in maternal–newborn dyads, for both surface contamination (including cases of emergency caesarean section) and perinatal disease. Genetically divergent maternal–newborn dyads may reflect unsampled variation in the mother, as only a single colony was sequenced in each case. Although adaptive mutations associated with disease progression have been reported elsewhere from a comparison of mother–newborn pairs47, we were unable to find evidence for this in the current study, as all pairs involving invasive isolates were either genetically identical (0 SNVs) or divergent enough to argue against this. The findings show that GBS infection occurs before delivery, supporting the need for IAP to be administered before delivery to be effective and showing why antisepsis in active labour (for example, by using vaginal chlorhexidine wipes) is ineffective in reducing neonatal EOD8. The finding of 14/44 (31.8%) newborns with surface GBS contamination where maternal GBS colonization was not identified suggests insensitivity of maternal recto–vaginal screening, despite the consistent use of broth-enrichment and blood agar to maximize sensitivity. This is a higher percentage than found in a recent study in The Gambia (40/186 (21.5%))48, but that study excluded mothers at high risk for pregnancy complications. As with repeat vaginal examinations (seen here and reported elsewhere49), complicated deliveries (obstetric emergencies) probably decrease GBS sampling sensitivity, through antisepsis measures or mechanical removal.

With limitations in the clinical benefit of IAP in terms of reducing stillbirth and LOD, as well as challenges in its effective implementation to reduce EOD in sSA, maternal vaccination is an attractive strategy for prevention. The most advanced vaccine (which has completed phase 2 trials) is trivalent (Ia/Ib/III), but plans exist to advance a pentavalent vaccine10. If this includes the most common disease-causing serotypes worldwide (Ia/Ib/II/III/V), it will cover almost all (72/73 (98.6%)) of the serotypes causing invasive disease in this study. However, importantly for vaccine development, and in line with other reports50, we identified capsular switching to serotype IV in two isolates within CC17, suggesting that consideration of the inclusion of serotype IV is warranted.

GBS is an important, potentially preventable, cause of stillbirth and neonatal death in coastal Kenya. Maternal GBS colonization increases with urbanization and higher socio-economic status, and is likely to increase with development. GBS neonatal disease in population-based studies is markedly under-ascertained as a result of rapid case fatality after birth and limited access to care. The burden of early neonatal disease is likely equalled by the burden of GBS-associated stillbirth. Maternal GBS vaccination is a key opportunity to reduce stillbirth and neonatal death in this high-burden region.

Methods

Study design

The study design included a prospective cohort from rural, semi-rural and urban sites, a nested case–control study in the semi-rural site, and analysis of the surveillance of neonatal disease at the semi-rural site (Fig. 1).

Prospective cohort study

In a prospective cohort study (2011–2013), we assessed the prevalence and risk factors for maternal GBS colonization at delivery, and the perinatal outcomes at delivery (stillbirth, gestational age, birthweight, possible serious bacterial infection and perinatal death).

Nested case–control study

The investigation of stillbirth was undertaken with a nested case–control study. Cord blood cultures were taken at delivery from the stillbirth and from the next two subsequent admissions that were live born (case:controls = 1:2). Lung aspirates were taken from stillbirths only, by a study clinician attending within 4 h of the stillbirth.
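The 1:2 selection rule can be sketched as a pass over deliveries in admission order. The tie-breaking rule used here (each live birth serves as a control at most once and is assigned to the oldest stillbirth still awaiting controls) is an assumption of this sketch, as are the function name `match_controls` and the toy delivery list; the text does not say how overlapping cases were handled.

```python
def match_controls(deliveries, controls_per_case=2):
    """Assign the next live births to each stillbirth, in admission order.

    deliveries: list of (delivery_id, is_stillbirth) tuples, oldest first.
    Returns a dict mapping each stillbirth id to its list of control ids.
    """
    matches = {}
    open_cases = []  # stillbirths still waiting for controls, oldest first
    for delivery_id, is_stillbirth in deliveries:
        if is_stillbirth:
            matches[delivery_id] = []
            open_cases.append(delivery_id)
        elif open_cases:
            case_id = open_cases[0]
            matches[case_id].append(delivery_id)
            if len(matches[case_id]) == controls_per_case:
                open_cases.pop(0)
    return matches

# Hypothetical admission sequence (True marks a stillbirth).
deliveries = [
    ("d1", False), ("d2", True), ("d3", False), ("d4", True),
    ("d5", False), ("d6", False), ("d7", False),
]
matched = match_controls(deliveries)
print(matched)
```

In practice not every eligible control is sampled, so the achieved ratio can fall short of the planned 1:2.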

Surveillance of neonatal invasive bacterial disease

Neonatal disease was quantified using systematic clinical and microbiological surveillance data (1998–2013 at KCH) within the KHDSS area, giving accurate population and birth denominators (see ‘Study sites’)51.

Study sites

The studies were conducted at Coast Provincial General Hospital, Mombasa (CPGH; urban location, 12,000 deliveries per year, comprehensive obstetric care); at Kilifi County Hospital (KCH; semi-rural, 3,000 deliveries per year, comprehensive obstetric care); at Bamba sub-district hospital (rural, 600 deliveries per year, basic obstetric care); and at Ganze health facility (rural, 400 deliveries per year, basic obstetric care).

Part of Kilifi County is included in detailed health and demographic surveillance (KHDSS)51, from which accurate population data are available from 2004. KCH is the main district hospital serving this population, so incidence estimates for residents seeking healthcare at KCH can be made with the KHDSS population as the denominator. We used prospectively collected data on live births from the regular re-enumerations of the KHDSS population, and used the estimated slope from a regression to estimate the number of births before the start of KHDSS.

Study population

Prospective cohort study

We included all women admitted for delivery at study sites at designated times who gave written informed consent, without additional exclusion criteria. We planned to recruit over one calendar year (to allow for seasonality), but extended the enrolment to meet sample size requirements (Supplementary Table 3) because national strikes closed government health facilities twice during the study. Recruitment was performed at CPGH for 48 h each week (1 April 2012 to 31 July 2013), at Bamba and Ganze for six days each week (1 July 2012 to 31 July 2013) and at KCH every day (1 August 2011 to 31 July 2013), including additional studies of neonatal surface contamination (1 May 2012 to 31 July 2013).

Nested case–control study

We included all stillbirths delivered in KCH and the next two consecutive live births (1 May 2012 to 1 October 2013).

Surveillance of neonatal invasive bacterial disease

We included all neonates admitted to KCH (1 August 1998 to 1 October 2013).

Sampling and laboratory methods

Prospective cohort study

We took recto–vaginal swabs during routine vaginal examination at admission for delivery, when possible before rupture of membranes. A small cotton swab was used to wipe the lower third of the vaginal mucosa and then the inside surface mucosa of the anus52, according to standard procedures. Neonatal surface swabs (to assess surface contamination) were taken from the external ear, nares and umbilicus. Swabs were placed into Amies transport medium with charcoal53, refrigerated, transported in cool containers53 to the research laboratory (participating in the UK National External Quality Assessment Service) and processed by standard protocols, including enrichment in Lim broth and subculture onto blood agar. Isolates with GBS morphology were tested with the Christie–Atkins–Munch-Petersen (CAMP) test, and definitive grouping was done using a streptococcal grouping latex agglutination kit (PRO-LAB Diagnostics).

Nested case–control study

For stillbirths and live-born controls, we sampled cord blood at delivery after double clamping the cord if necessary and cleaning with 70% ethanol. We processed cord blood cultures using an automated culture system (BACTEC 9050, Becton Dickinson). Lung aspirate samples (stillbirths only) were taken with a sterile technique by aspirating the lung within 4 h of delivery. We examined lung aspirates by microscopy and cultured them using standard methods within 30 min of sampling; if a delay was unavoidable, samples were stored at 2–8 °C for up to 8 h.

Surveillance of neonatal invasive bacterial disease

For all neonatal admissions (1998–2013) at KCH, peripheral blood was sampled on admission for culture, before neonatal antibiotic treatment (during 2011–2013, peri-partum maternal antibiotics were documented in 36/5,430 (0.7%) of deliveries in KCH), and we carried out lumbar puncture when clinically indicated. We tested the isolates for antimicrobial susceptibility to penicillin and co-trimoxazole (British Society for Antimicrobial Chemotherapy). Blood cultures were processed using an automated culture system (BACTEC 9050). Cerebrospinal fluid was tested as described elsewhere26.

Molecular methods

DNA extraction, Illumina sequencing (HiSeq technology) and raw read processing were carried out using standard methods, starting from a single GBS colony. GBS isolates were frozen in 1 ml vials and stored at –80 °C before subculture on a Columbia blood agar plate for 24–48 h, followed by DNA extraction from a single colony using a commercial kit (QuickGene, Fujifilm). High-throughput sequencing was undertaken at the Wellcome Trust Centre for Human Genetics (Oxford University) using a HiSeq 2500, generating 150-base paired-end reads. De novo assembly, mapping and variant calling were performed as described previously54, except that mapping was to S. agalactiae reference genome 2603 V/R (NC_004116.1). Sequence quality was assessed using several metrics (per cent reads mapped to the reference genome, per cent reference positions called, contig number, total contig length). Sequence data showing poor quality metrics were excluded from further analysis, and where practicable the corresponding samples were re-isolated, re-grouped and re-sequenced (if re-grouping confirmed the isolate as GBS).

We allocated serotypes on the basis of BLASTn comparisons by assessing the sequence similarity of de novo assemblies with the capsular locus regions of each of the ten known GBS serotypes. We validated this method internally (kappa = 0.92)55. STs were also assigned in silico using BLASTn with de novo assemblies. Novel STs were submitted to pubmlst.org for assignment. Phylogenetic analysis was performed separately for each clonal complex using RAxML version 8.1.16, with an alignment consisting of all variable sites from mapping to the 2603 V/R reference, padded to the length of the reference with invariant sites of the same GC content as the original data. Recombination was detected using ClonalFrameML56 and we present the resultant phylogenies with recombinant regions removed. To partition the isolates according to previously described clonal complexes, we first reconstructed a single RAxML phylogeny with all isolates. The resulting tree was then visually partitioned on long, deep branches, which effectively corresponded to previously described clonal complexes, but enabled us to include all STs. We therefore used this partitioning as our definition of the clonal complexes. Using this definition, each ST belongs to a single clonal complex and each clonal complex is monophyletic (Supplementary Fig. 6), indicating that partitioning by clonal complex remains appropriate when whole-genome data are taken into account.
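The agreement statistic quoted for the serotyping validation can be illustrated with a short sketch. It assumes that the reported kappa is Cohen's kappa between two serotype calls per isolate (for example, in silico versus a comparator method); the text does not state this, and the labels below are hypothetical.

```python
from collections import Counter


def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equal-length lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of isolates given the same call.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of marginal frequencies, summed over categories.
    freq_a = Counter(calls_a)
    freq_b = Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical serotype calls (assumption, NOT study data).
in_silico = ["Ia", "Ia", "III", "III", "V", "V"]
comparator = ["Ia", "Ia", "III", "III", "V", "III"]
kappa = cohens_kappa(in_silico, comparator)
```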

Pairwise comparison of SNV differences from mapped data was used to examine maternal and newborn paired GBS isolates, and possible transmission of GBS between mothers was investigated via these differences and epidemiological links in time and place (through delivery in KCH) or residence (distance between household locations in KHDSS).
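The pairwise SNV comparison can be sketched as follows. This is an illustrative sketch only: it assumes each isolate is represented by a consensus sequence of equal length from mapping to the same reference, and that uncalled positions (here "N" or "-") are ignored. The isolate names and sequences are hypothetical.

```python
from itertools import combinations


def snv_distance(seq_a, seq_b, uncalled="N-"):
    """Count positions where two mapped sequences differ, skipping uncalled sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be mapped to the same reference length")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in uncalled and b not in uncalled
    )


def pairwise_snv(sequences):
    """Return SNV distances for every pair of isolates, keyed by sorted name pair."""
    return {
        (name_a, name_b): snv_distance(sequences[name_a], sequences[name_b])
        for name_a, name_b in combinations(sorted(sequences), 2)
    }


# Hypothetical toy sequences (assumption, NOT study data).
isolates = {
    "mother_01": "ACGTACGT",
    "newborn_01": "ACGTACGT",
    "mother_02": "ACGAACNT",
}
distances = pairwise_snv(isolates)
```

In this toy example the mother and newborn pair differ at no sites, whereas the two maternal isolates differ at one called site; small distances would then be interpreted alongside the epidemiological links described above.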

Statistical analysis

We used Stata (version 13.1) for statistical analyses. We used the first principal component from a set of household assets as a proxy for socio-economic status (SES)57. Multiple imputation with chained equations (Stata mi) was used to impute missing data on potential risk factors (<15% per variable; 50 imputations). Continuous variables were checked for normality and transformation was not required. We used natural cubic splines to allow for nonlinearity in variable effects in imputation models. Imputations were done separately by maternal GBS status so that interactions could be examined in the analyses of adverse newborn outcomes. The same imputation was used for both analyses; by imputing separately for GBS colonization there are fewer assumptions than if it was fitted as a covariate (this allows variances of continuous imputed variables to differ according to GBS colonization and the associations between two imputed variables can be stronger in one group).

We built multivariable logistic regression models using complete-case and imputed data sets (combined using Rubin's rules) to examine risk factors for maternal GBS colonization using robust variances reflecting clustering by site. We included nonlinearity in continuous variables via natural cubic splines, with factors categorized at quartiles for the presentation of final models. Risk factors with P < 0.1 in univariable models were included in a multivariable model, and final independent predictors were identified using backwards elimination (exit P > 0.1). We assessed whether risk factors for maternal GBS colonization were associated with ST17 and CC17 colonization in mothers who were GBS colonized using the same process, for complete cases only.
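For readers unfamiliar with Rubin's rules, the pooling step can be sketched as below. This is a generic illustration, not the Stata routine used in the study; the coefficient values are hypothetical, and the sketch pools a single coefficient (for example, a log odds ratio) across imputed data sets.

```python
import math


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets using Rubin's rules.

    estimates: per-imputation point estimates (e.g. log odds ratios)
    variances: per-imputation squared standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = sum(estimates) / m
    within = sum(variances) / m  # average within-imputation variance
    between = sum((q - pooled) ** 2 for q in estimates) / (m - 1)
    total = within + (1 + 1 / m) * between  # total variance
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios from three imputations (assumption, NOT study data).
log_or, se = pool_rubin([0.5, 0.6, 0.7], [0.04, 0.04, 0.04])
```

The pooled standard error exceeds the within-imputation standard error because the between-imputation term carries the extra uncertainty due to the missing data.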

We used the imputed data set in multivariable regression analyses to examine whether maternal GBS colonization was associated with gestational length, birthweight, possible serious bacterial infection, stillbirth or perinatal mortality. We included pre-specified confounders (age, parity, sex (of newborn), maternal education, SES, nutritional status, HIV status, obstetric complication and multiple delivery) and tested for interaction with GBS colonization from prolonged rupture of membranes (PROM, >18 h), maternal fever (>37.5 °C) or urinary tract infection (leukocytes and nitrites present). We included these terms in multivariable models if there was evidence of interaction at the P < 0.1 level.

We estimated the odds of isolating GBS from cord blood in all stillbirths, then ante-partum and intra-partum stillbirths, compared to live births. We estimated the incidence of GBS-associated stillbirth and neonatal disease using denominators of facility births and community births, for residents of KHDSS51.
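The two quantities in this paragraph can be illustrated with a minimal sketch. It assumes an unadjusted odds ratio with a Woolf (log-based) 95% confidence interval and an incidence expressed per 1,000 births; the estimation method is not specified in the text, small counts may call for exact methods instead, and all counts shown are hypothetical.

```python
import math


def odds_ratio(pos_cases, neg_cases, pos_controls, neg_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    pos_/neg_ refer to GBS isolated / not isolated from cord blood;
    cases are stillbirths and controls are live births.
    """
    cells = (pos_cases, neg_cases, pos_controls, neg_controls)
    if any(count <= 0 for count in cells):
        raise ValueError("Woolf interval requires all four cells to be positive")
    estimate = (pos_cases * neg_controls) / (neg_cases * pos_controls)
    se_log = math.sqrt(sum(1 / count for count in cells))
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


def incidence_per_1000(events, births):
    """Incidence per 1,000 births for a given denominator."""
    if births <= 0:
        raise ValueError("denominator must be positive")
    return 1000 * events / births


# Hypothetical counts (assumption, NOT study data).
or_estimate, or_lower, or_upper = odds_ratio(4, 96, 2, 198)
rate = incidence_per_1000(5, 10000)
```

The same incidence function applies whether the denominator is facility births or community births; only the denominator changes.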

Ethics

The study protocol was approved by KEMRI Ethical Review Committee (SSC/ERC 2030) and the Oxford Tropical Research Ethics Committee (53-11) (clinicaltrials.gov NCT01757041).

Accession codes

Sequence data have been submitted to the NCBI Sequence Read Archive under BioProject PRJNA315969. Individual accession numbers are provided in Supplementary Table 17.