The sample size is based on the primary outcome. Using data extracted from questions in the previous qualitative study, the sample size for an unclustered study with a 5 % two-sided Type 1 error and 80 % power to detect a 25 % difference in mean questionnaire score would be 114 women in each arm. Both Daly et al. in the USA and Srikanth et al. in India found a very low level of knowledge about causes and risk factors for otitis media in mothers of young children [
37,
40]. We anticipate a similarly low level in Jumla, and hypothesise that a 25 % increase would be modest and achievable. The cluster size is set at the size of the women’s groups, which is around 20 women. To adjust for clustering, it is necessary either to estimate the intra-cluster correlation coefficient (ICC, or ρ), which measures the similarity of individual responses within clusters, in order to calculate the design effect (the factor by which the sample size of an individually randomised trial must be multiplied to adjust for clustering), or to calculate the between-cluster variability, k. These values vary, and any estimate can only be approximate. Other studies on neonatal mortality in women’s groups in Nepal have found a very low ICC of 0.00644 [
24]. Tielsch et al. [
41] calculated a design effect of 1.23 in their study of supplements to prevent malnutrition in children. Mullany et al. [
22] used a design effect of 2.0 for their study looking for a 25 % reduction in omphalitis in newborns in rural Nepal. Slightly further afield in Bangladesh, Aboud et al. [
23] calculated an ICC of 0.03 for a responsive feeding and stimulation intervention delivered to clusters of rural women and children. Pagel et al. [
42] reviewed ICCs for interventions around perinatal outcomes in low-resource countries and found universally low ICCs for maternal and neonatal mortality but higher ICCs for other interventions such as skilled birth attendance. For example, in India and Bangladesh the ICC for skilled birth attendance ranged from 0.02 to 0.04. The design effect (DEff) is the factor by which the sample size for an individually randomised controlled trial must be multiplied for a cluster randomised trial to achieve equivalent power. It depends on the ICC and on the number of individuals in each cluster, m. Using the equation DEff = 1 + (m-1) ρ with m = 20 and assuming ρ = 0.03 would give a DEff of approximately 1.6, or 182 women per arm. Using the same equation with a more conservative ρ = 0.05 would give a DEff of 1.95, or 223 women per arm. This would translate into 11 clusters per arm. To account for stratification, we have added a conservative extra two clusters per arm, which would make 13 clusters per arm. To enable robust cluster analysis, a minimum of 15 clusters per arm is ideal. This will provide a generous total sample of approximately 600 women in 30 clusters, 15 in the intervention arm and 15 in the control arm.
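The design effect arithmetic above can be checked with a short sketch. It assumes equal clusters of m = 20 women and takes the unclustered figure of 114 women per arm as given. The function names are illustrative and not taken from the study. Because the code uses the unrounded design effect, ρ = 0.03 gives a slightly smaller figure than the 182 quoted in the text, which is based on a DEff rounded to 1.6.

```python
import math


def design_effect(m: int, rho: float) -> float:
    """DEff = 1 + (m - 1) * rho, assuming equal clusters of size m."""
    return 1 + (m - 1) * rho


def women_per_arm(n_unclustered: int, m: int, rho: float) -> int:
    """Inflate the unclustered per-arm sample size by the design effect, rounding up."""
    return math.ceil(n_unclustered * design_effect(m, rho))


N_UNCLUSTERED = 114  # women per arm for an individually randomised design (from the text)
CLUSTER_SIZE = 20    # approximate size of a women's group (from the text)

for rho in (0.03, 0.05):
    deff = design_effect(CLUSTER_SIZE, rho)
    n = women_per_arm(N_UNCLUSTERED, CLUSTER_SIZE, rho)
    print(f"rho={rho}: DEff={deff:.2f}, women per arm={n}")

# Planned design: 15 clusters per arm, two arms.
total_women = 15 * CLUSTER_SIZE * 2
print(f"planned total sample: {total_women}")
```

With ρ = 0.05 the sketch reproduces the 223 women per arm stated above, and the planned 15 clusters of about 20 women in each of two arms reproduce the total of approximately 600.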