Background
Globally, the number of people living with HIV and AIDS has been increasing, while the number of new infections and AIDS-related deaths has declined slightly [1]. Sub-Saharan Africa, where 67% of the global 33.4 million people living with HIV and AIDS reside, continues to be the worst affected region of the world. In South Africa, more than five million people are living with HIV and AIDS [2]. The overall HIV prevalence in South Africa was estimated at 11.4%, 10.8% and 10.9% of the total population excluding infants under 2 years of age in 2001, 2004 and 2008 respectively [3, 4]. There is evidence that condoms are highly effective in reducing sexual transmission of HIV [5]. Although they are also the most widely available means of prevention, they are currently not used to their full potential as a low-cost prevention technology.
Recognizing the multifaceted nature of behavioural outcomes [6, 7], studies of condom use have considered various levels of influencing factors. Most of the existing, empirically validated AIDS behavioural theories share some overlapping psychological constructs [8], including cognitive factors, beliefs and attitudes towards condoms, as in the Theory of Reasoned Action [9], skills needed to use condoms effectively, as in the Information-Motivation-Behavioural Skills model [10], and social norms, as in Social Cognitive Theory [11]. At an intrapersonal level, educational aspirations and students’ performance [12], ability to plan and prepare for condom use [14], personal coping strategies including alcohol use [14], and personality traits such as sensation seeking and impulsivity [15] have been found to be related to condom use. At an interpersonal level, relationship variables (partner type) [16] and parent–child communication and parental supervision [17] have been shown to be associated with condom use. At an environmental level, workplace or school-related peer pressure and broader contextual factors such as cultural norms or policies also affect condom use. In South Africa, for instance, socio-economic factors, service costs, condom availability, knowledge of and access to condoms, and tobacco and alcohol use were all found to predict demand for condoms [18]. Further, gender disparities in socio-economic status were found to influence women’s ability to negotiate condom use [19, 20].
In spite of a recent increase in condom use to about 65%, estimates of HIV incidence in South Africa remain between 1% and 2% (new infections per year) among young people aged 15–20 years [21]. We therefore need a better understanding of the factors that hinder and facilitate condom use in order to reduce HIV incidence in the future. The analysis should also explore the reasons why people refuse condom use in spite of prevention campaigns and high levels of knowledge. Studies trying to explain ‘condom refusal’ rather than condom use have been limited to exploring its relation to HIV risk and gender violence [22].
The aim of this study was to investigate the extent and predictors of condom use and condom refusal in the Free State province. We examine whether the predictors are the same for these two outcome variables by employing and comparing two different and complementary statistical models. Exploring condom refusal in more detail may lead to the development of more nuanced prevention messages and may be used to inform region-specific and nationwide prevention policies.
Methods
Study setting
The Free State, with nearly 3 million inhabitants, is one of the nine provinces of South Africa and has a high HIV prevalence, estimated at 12.6% among the population above two years of age in the 2005 and 2008 South African National HIV Prevalence Surveys. As in other provinces, awareness campaigns and condom promotion programs were introduced in 1995, and reported condom use increased from 35% in 2002 to 65% in 2008.
Sampling method
Data used in this paper were from a cross-sectional study conducted during the first half of 2009 in the Free State province of South Africa, commissioned by the Provincial Department of Health. The Free State is divided geographically into 20 Local Municipalities (LM). A cluster sampling method was used with LM as the primary sampling unit. Each LM was divided into enumeration areas based on the 2001 census [23]. Thirty enumeration areas were selected using a stratified random sample with probability proportional to size. Ten households were selected sequentially starting at a household identified randomly in each sector. In each household only one participant above the age of 17 years was interviewed after his/her written consent was obtained. If there was only one eligible person in the household, he/she was interviewed. If there was more than one, one respondent was selected randomly. The interviews were conducted by research assistants recruited and trained in interview techniques and in how to fill out the questionnaire. The household interviews were carried out face-to-face in the participant’s language, which was usually the mother tongue of the interviewee. The sample size achieved during the data collection ranged from 273 to 310 per LM and yielded a total sample size of 5,837 participants for the whole province.
Measurements
To assess predictors of condom use and condom refusal, a questionnaire was developed using Epi-Info version 3.5.1. The development of the questionnaire was guided by a simplified theoretical framework, considering predictors of condom use shared by relevant behavioural theories, such as knowledge, attitudes and beliefs, and socio-cultural norms. The questionnaire, containing 110 mostly closed questions, was translated from English into Northern Sotho and Afrikaans; the translations were tested in a pilot study and validated. In addition to the two main outcome variables, 19 predictor variables were included in the questionnaire based on the empirical evidence available on factors influencing condom use versus condom refusal. The predictor variables consisted of demographic information (age, gender, ethnic group, marital status, education, employment and type of residence, own and partner’s HIV status), intrapersonal variables (e.g. knowledge, attitudes towards condoms) and contextual variables such as availability, affordability and whether condoms were obtained from public or private sources. Other independent variables were related to health behaviour.
Knowledge of HIV and condoms
‘Knowledge of HIV’ was measured by four statements with ‘Agree’, ‘Disagree’ and ‘Don’t know’ responses concerning the curability of HIV, unsafe sex as its cause, treatment of opportunistic infections and the benefit of knowing one’s HIV status. Since the assessment of HIV-related knowledge had to remain brief, statements were selected that typically cut across the many areas relevant to HIV-related knowledge. An incorrect or ‘don’t know’ response to any one of these statements was considered inadequate knowledge.
‘Knowledge about condoms’ was assessed by four statements about the extent to which condoms prevent sexually transmitted diseases and pregnancy, their free availability in hospitals and clinics, their correct and consistent use, and awareness of the existence of female condoms. A wrong response to any of these statements was considered inadequate knowledge.
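The scoring rule described for both knowledge variables (one wrong or ‘don’t know’ answer is enough to classify knowledge as inadequate) can be sketched as follows. This is a minimal illustration only: the item identifiers and the answer key are placeholders invented here, not the wording or coding used in the questionnaire.

```python
# Placeholder answer key: the item identifiers and the "correct" responses
# below are invented for illustration and are NOT taken from the questionnaire.
EXAMPLE_KEY = {
    "statement_1": "agree",
    "statement_2": "disagree",
    "statement_3": "agree",
    "statement_4": "agree",
}


def has_adequate_knowledge(responses, answer_key):
    """Return True only if every statement was answered correctly.

    A 'dont_know' response or a missing answer never equals the key,
    so it counts as incorrect, as in the rule described in the text.
    """
    return all(responses.get(item) == correct
               for item, correct in answer_key.items())


# One 'dont_know' answer is enough to classify knowledge as inadequate.
respondent = dict(EXAMPLE_KEY, statement_3="dont_know")
print(has_adequate_knowledge(respondent, EXAMPLE_KEY))
```

Treating a missing answer as incorrect is an assumption of this sketch; the text does not say how missing responses were handled.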
The belief that condoms can prevent HIV was assessed by the question “Do you believe that the use of condoms can prevent HIV?” The perceived need for condoms was assessed by asking the Yes/No question “Are you in need of a condom to prevent HIV nowadays?”
Socio-cultural norms
Stigma or shame related to HIV may also be expressed through socially and culturally grounded attitudes towards condoms. The degree to which participants felt that there was shame associated with condoms (referred to as condom stigma) was evaluated by five ‘Yes/No’ questions. Participants were asked whether or not they were ashamed of using condoms, purchasing condoms, taking condoms from free distribution points, and talking about condoms in general and with their partner. A ‘Yes’ response was considered to exhibit shame associated with condoms, which in a social context may be related to ‘condom stigma’.
Contextual variables
The questions used to measure contextual variables were: “Are condoms available to you if you need one?” with ‘Yes/No’ response for ‘availability’, “Are condoms affordable to you if you want to buy one?” with ‘Yes/No’ responses for ‘affordability’ and “What is your usual source of condom?” with a ‘Free/Paid source’ response for ‘usual source of condom’.
Sexual risk behaviour
Sexual risk behaviour other than not using condoms was assessed using four questions: sexual debut before the age of 15, multiple concurrent sexual partners, frequent change of sexual partners and ever having been forced into sexual intercourse. Any one of these risky behaviours was considered sufficient to categorize a participant as having sexual risk behaviour.
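The composite indicator described above can be sketched as a simple "any of four" rule. The field names are hypothetical (the questionnaire's variable names are not given), and counting an unanswered item as "no" is an assumption of this sketch.

```python
# Hypothetical field names for the four risk items described in the text.
RISK_ITEMS = (
    "debut_before_15",
    "multiple_concurrent_partners",
    "frequent_partner_change",
    "ever_forced_sex",
)


def has_sexual_risk_behaviour(responses):
    """Return True if at least one of the four risk items is reported.

    `responses` maps item name -> bool. Items that are absent are treated
    as not reported (an assumption made for this illustration).
    """
    return any(responses.get(item, False) for item in RISK_ITEMS)


print(has_sexual_risk_behaviour({"ever_forced_sex": True}))
print(has_sexual_risk_behaviour({"debut_before_15": False}))
```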
Outcome variables
The analyses were conducted with two outcome variables: ‘ever used condom’ and ‘ever refused condom’.
Analysis
The analysis plan included two complementary statistical models in order to help explore the complexity and interplay between explanatory variables: multivariate logistic regressions and classification trees, a non-parametric data-mining technique.
Univariate and multivariate logistic regressions were conducted with two dependent variables, ‘ever used condom’ and ‘ever refused condom’, using a survey logistic regression in Stata [24], allowing for the design effects of clustering. In order to capture as much condom use as possible, the variable ‘ever used condom’ was used instead of ‘used condom during last sexual intercourse’ [25].
Classification Trees (CT) were used to explore the influence of the specified predictors on the use or refusal of condoms. CT models are useful tools to explore the relationship between a desired outcome and its determinants [26] and have been used in several disease contexts, e.g. malaria [27‐29].
The building of a classification tree begins with a root (parent) node, containing the entire set of observations, and then, through a process of yes/no questions, generates descendant nodes. Beginning with the first node, containing the complete sample, CT finds the best possible variable to split the node into two child nodes. In order to find the best variable, the software checks all possible splitting variables, as well as all possible values of each variable that could be used to split the node, seeking to maximize the average “purity” of the two child nodes. In other words, the child nodes will be as homogeneous as possible with respect to the outcome variable (i.e. condom use or condom refusal). The splitting is repeated along the child nodes until a terminal node is reached.
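The exhaustive search for the best split can be sketched as below. The text does not name the purity criterion used by the software; this sketch assumes the Gini impurity, a common choice in classification trees, and searches for the split with the lowest size-weighted impurity of the two child nodes. The function names and the toy records are invented for illustration and are not study data.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcomes (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(rows, outcome, predictors):
    """Try every predictor and every cut-off; keep the purest split.

    rows: list of dicts holding numeric (or 0/1 coded) predictor values.
    Returns (predictor, cut_off, weighted_impurity), or None if no
    predictor takes more than one distinct value.
    """
    best = None
    n = len(rows)
    for var in predictors:
        values = sorted({r[var] for r in rows})
        # Candidate cut-offs: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            cut = (low + high) / 2
            left = [r[outcome] for r in rows if r[var] <= cut]
            right = [r[outcome] for r in rows if r[var] > cut]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (var, cut, score)
    return best


# Toy records, invented for illustration only.
toy = [
    {"need": 1, "age": 20, "used": 1},
    {"need": 1, "age": 40, "used": 1},
    {"need": 0, "age": 25, "used": 0},
    {"need": 0, "age": 45, "used": 0},
]
print(best_split(toy, "used", ["age", "need"]))
```

A full tree would apply `best_split` recursively to each child node until a stopping rule is met; that recursion and any pruning step are omitted here.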
In our study, CT may provide additional insights to those obtained from a logistic regression, for a number of reasons. Firstly, because CT handles (non-predefined) interactions flexibly, it can deal with a large number of explanatory variables, as is the case in this study. Standard regression analyses rapidly become unreliable when the dimensionality is very high, while CT handles multiple interactions in a more flexible way based on decision trees. In a decision tree analysis, subgroups are obtained by splitting the entire data set on the best splitting variables, and each subgroup is then treated as a new starting population, which creates interactions naturally. For example, the total sample (starting node) may be split into two subgroups according to, for instance, income. It is possible that other variables play only a subsidiary role and are thus used as further splitters in the low income group only. This means that an interaction between two variables may be detected in a natural way, without being specified in advance as it must be in standard regression techniques.
A second advantage of CT is that it deals with multi-collinearity in an intuitively correct way. Two approximately collinear variables in a logistic regression model can influence each other’s significance levels and may change their association with the outcome variable. In a decision tree the more important of two collinear variables will be selected, but the improvement measure attributable to each variable in its role as either a primary or a surrogate splitter is also computed. Surrogate variables closely mimic and predict the action of primary splitting variables. The values of all these improvements are summed over each node and totalled, and are then scaled relative to the best performing variable. The importance score measures a variable’s ability to perform either as a primary splitter or as a surrogate splitter. If one variable is not selected at several splits because it is the second most important variable each time, it may not appear in the tree, but it will appear in the variable importance table, which ranks the variables based on their contribution to the construction of the tree. In addition, a tree is comprehensible to a wide audience and results in a clear division of the original sample into groups of high and low risk, making it useful to policy makers.
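The summing and scaling of improvements into an importance score can be illustrated as follows. This is a sketch under stated assumptions: the per-node improvement values are taken as given (how the software computes them for primary and surrogate splitters is not reproduced), the scale of 0 to 100 is a convention assumed here, and the variable names and numbers are invented.

```python
from collections import defaultdict


def variable_importance(node_improvements):
    """Sum each variable's improvements over all nodes and rescale.

    node_improvements: one dict per node, mapping a variable name to the
    improvement credited to it at that node (as primary or surrogate
    splitter). Returns {variable: score} with the best variable at 100.
    """
    totals = defaultdict(float)
    for node in node_improvements:
        for var, improvement in node.items():
            totals[var] += improvement
    top = max(totals.values(), default=0.0)
    if top == 0.0:
        return {var: 0.0 for var in totals}
    return {var: 100.0 * total / top for var, total in totals.items()}


# Invented values: "b" is never the primary splitter, yet it still
# receives credit as a surrogate at both nodes.
nodes = [{"a": 0.5, "b": 0.25}, {"a": 0.5, "b": 0.25}]
print(variable_importance(nodes))
```

This shows why a variable that never appears in the tree can still rank highly in the importance table.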
Tree construction can nevertheless be unstable. Cross-validation can then be used, which consists of dividing the entire sample randomly into N (usually 10) sub-samples, stratified by the response variable. One sub-sample is then used as the test sample and the other N-1 (e.g., nine) are used to construct a large tree. The entire model-building procedure is repeated N times, with a different subset of the data reserved for use as the test dataset each time. Thus, N different models are produced, each one of which can be tested against an independent subset of the data.
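The partitioning step of this cross-validation scheme can be sketched as below. Only the stratified division into N sub-samples and the rotation of the test sample are shown; fitting a tree on each training set is left out. The function names, the fixed random seed and the toy outcome vector are assumptions of this illustration.

```python
import random


def stratified_folds(outcomes, n_folds=10, seed=0):
    """Split observation indices into n_folds sub-samples.

    Indices are shuffled within each outcome class and then dealt out in
    turn, so every fold has roughly the same class mix as the full sample.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, y in enumerate(outcomes):
        by_class.setdefault(y, []).append(index)
    folds = [[] for _ in range(n_folds)]
    offset = 0
    for y in sorted(by_class):
        indices = by_class[y]
        rng.shuffle(indices)
        for j, index in enumerate(indices):
            folds[(offset + j) % n_folds].append(index)
        offset += len(indices)
    return folds


def train_test_pairs(folds):
    """Yield (train, test) index lists; each fold is the test set once."""
    for k, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


# Toy outcome vector (30 users, 20 non-users), invented for illustration.
outcomes = [1] * 30 + [0] * 20
for train, test in train_test_pairs(stratified_folds(outcomes)):
    print(len(train), len(test), sum(outcomes[i] for i in test))
```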
The strength of a tree can be indicated by its sensitivity and its specificity. The sensitivity for condom use for example, is calculated to indicate how many of the users are classified as users, meaning that they fall in a terminal node with a proportion of users higher than the average use in the population. The specificity is calculated to indicate how many of the non-users are classified as non-users.
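The classification rule just described (a terminal node counts as a "user" node when its proportion of users exceeds the overall proportion) leads to the following calculation. This is a sketch: the input format and the toy counts are invented, nodes are assumed to be non-empty, and the sample is assumed to contain both users and non-users.

```python
def tree_sensitivity_specificity(terminal_nodes):
    """Compute sensitivity and specificity from terminal-node counts.

    terminal_nodes: list of (n_users, n_non_users), one pair per terminal
    node. A node is labelled 'user' if its share of users is strictly
    higher than the share of users in the whole sample.
    """
    total_users = sum(users for users, _ in terminal_nodes)
    total_non_users = sum(non for _, non in terminal_nodes)
    overall = total_users / (total_users + total_non_users)
    users_in_user_nodes = 0
    non_users_in_other_nodes = 0
    for users, non in terminal_nodes:
        if users / (users + non) > overall:
            users_in_user_nodes += users
        else:
            non_users_in_other_nodes += non
    sensitivity = users_in_user_nodes / total_users
    specificity = non_users_in_other_nodes / total_non_users
    return sensitivity, specificity


# Toy counts, invented for illustration only.
print(tree_sensitivity_specificity([(80, 20), (20, 80)]))
```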
Using different analytical tools (i.e., parametric and non-parametric) can result in interesting insights. For example, in a classical logistic regression, linear combinations are the primary method of expressing the relationships between variables, while in classification trees this relationship does not need to be linear or additive. A classical regression may be more appropriate to quantify linear relationships. A further advantage of a classical regression is the probability level or confidence interval associated with the coefficients in the model. By using the results obtained through CT in a complementary way to those of the parametric models, we combine the strengths of the two methodologies.
Ethical approval
Ethical approval was obtained from the Research Ethics Committee of the Faculty of Health Sciences, University of the Free State, Bloemfontein (address: The Chairperson, Ethics Committee, Faculty of Health Sciences, PO Box 339 (g40), Bloemfontein, South Africa).
Discussion
This study analysed the extent and the determinants of both condom use and condom refusal. The overall rate of condom use found in this study among the Free State population (61.3% used a condom during last sexual intercourse) compares well with the finding (64.8%) of the 2008 National HIV Prevalence Survey. The geographic and socio-economic features of people who were more likely than others to use condoms were similar to those reported in other studies [30].
The two analytical techniques used in this paper, logistic regression and classification trees, make different assumptions about the data and have different strengths. The logistic regression models are useful in determining factors that are associated with the response variable in the whole population. The classification trees, on the other hand, divide the sample in two according to a cut-off value and then further analyse these two sub-populations. Splitting a sample in two results in two specific sub-samples, and in each of them determinants can play a different role than in the general, initial population. The results of the logistic regressions and the CT are therefore not always the same.
Results of both methods indicate that partially different underlying constructs may influence condom use and condom refusal. As shown, only the variables knowledge about correct condom use and sexual risk behaviour were associated with the respective outcome variable (condom use versus condom refusal) in the multivariate model, whereas the CT suggested that perceived need for a condom and knowledge of correct condom use were most influential for condom use, compared to condom stigma and sexual risk behaviour for condom refusal. This indicates that contextual factors such as societal norms should be considered more important in explaining condom refusal, whereas individual factors account for actual condom use. This has implications for developing prevention messages.
The regression results indicate, as in much research in Africa and elsewhere, that condom use and refusal respond to many determinants at multiple levels. The non-parametric classification tree allows further exploration of the complexity related to condom use and refusal. To our knowledge this is the first time that such a model has been adopted for studying condom use.
The multivariate models reveal factors that predict the behavioural outcome in the whole population, while the CT helps detect segments in the population that have specific prevention needs. Segmenting populations supports decision makers in targeting their efforts to specific subgroups.
While we do not claim that one technique is superior to the other, the strength of this study lies in reporting results using two methodologies, thereby increasing the study’s rigour and achieving a more comprehensive assessment of the determinants of condom use and condom refusal.
The CT results show, for example, that in the specific subgroup that knows about condom use and lives as a couple, condom use is 73% when condoms are not available, compared to 90% when condoms are available. Furthermore, not knowing how to use condoms correctly is associated with lower use, especially in the group older than 33.5 years of age (39%) and less so in the younger group (88%).
The CT model also provides the relative importance of the variables. The following five variables had the highest discriminatory power in relation to condom use: perceived need for condom, knowledge about correct use of condom, availability, age and marital status. All these variables were also significant in the multivariate logistic regression, with the exception of “perceived need” for condoms, which was significant only in the univariate analysis.
The CT, at the first split, indeed revealed a strong difference in condom use between those with and without a perceived need for condoms. Of the respondents with a perceived need, 39% used condoms, while of the respondents without a perceived need, 90% used condoms. The univariate logistic regression models also indicated that perceived need was a strong predictor of condom use. However, when controlling for other variables in the multivariate analysis, this variable did not remain significant. This may reflect one of the characteristics of the CT: if a variable on its own explains a high degree of the variability, it will be used as a first split and appear as a strong univariate predictor. The multivariate analysis allows the effect to be assessed while controlling for other variables.
The relative importance ranking from the CT indicated that knowledge of correct condom use was the second strongest predictor. This knowledge may give condom users additional confidence.
Our findings show that older people in particular, and those who are married or living together, do not use condoms, which concurs with earlier research finding that couples with stable relations are less likely to use condoms [31]. HIV transmission risk in such relations is dependent on knowing the disease status of both partners and strictly adhering to the ‘be faithful’ prevention strategy, which may be subject to false assumptions. HIV and condom knowledge and belief in the ability of condoms to prevent HIV were non-significant in predicting condom use in the multivariate models and unimportant in the CT. This corroborates the contention that knowledge, belief and attitude as such may not be sufficient to achieve behaviour change, calling for multi-level models integrating more comprehensive perspectives. Such a multi-level approach has been used in Kenya and Zambia, for instance [32, 33]. In Zambia, evidence showed that in addition to individual factors, community-level factors can be important and that condom-promotion efforts should pay attention to community-level social norms, population trends, informal social relationships and interpersonal communication. Findings of the study in Kenya also support the relation seen in this study between condom use and age and marital status.
The CT also allows the population to be segmented in terms of condom refusal. It indicates that refusal is especially high when stigma is present: half of this group refused the use of a condom, while one in three refused when stigma was not present. When stigma is not present, refusal is higher in those with sexual risk behaviour and especially when knowledge of condoms is absent (42% with adequate knowledge versus 34% with low knowledge refused condoms). Such segmentations through interactions are a natural outcome of CT and complement the aforementioned multivariate models. This suggests that knowledge about condoms and sexual risk behaviour are important, but mainly in the group that does not report shame associated with condoms.
In the CT analysis the following top five variables were related to condom refusal: shame associated with condoms, sexual risk behaviour, knowledge about own HIV status, knowledge about condoms, and older age. Shame associated with condoms, sexual risk behaviour and knowledge about condoms as influencing factors were corroborated by the multivariate parametric logistic regression. Affordability of condoms did not turn out to be significantly related to condom refusal in the univariate model and the CT model. However, this variable is significant with an odds ratio greater than one in the multivariate analysis. This finding can be explained by a strong relation between affordability and availability. Controlling for variables such as availability in particular but also stigma and knowledge of correct use makes affordability significant. The significant effect of affordability may indicate that where there is considerable ambivalence about condoms (affordability but also availability) there will be more opportunities for refusal when condoms are affordable than when they are not affordable.
The importance of condom stigma for condom refusal may be explained by its association with HIV stigma. A body of literature shows that HIV-related stigma acts as a strong barrier to actual condom use [34]. This clearly demonstrates the influence of cultural values and social norms in adopting safer sex behaviours [35].
Understanding the reasons behind the refusal to use condoms is particularly important in South Africa because further improvement from the current level of use requires innovative and targeted interventions.
The strongest predictor of condom refusal observed in this study, i.e. shame associated with condoms in interaction with other variables, stresses the need for changing socio-cultural norms. The strong association of condom refusal with sexual risk behaviour, such as reporting multiple partners and frequent partner change, especially in the group where shame was not expressed, may require effective counselling.
The social norms and cultural values expressed as shame associated with condom use, which may link using condoms to taboo behaviours such as promiscuous sex, may lead to condom refusal even in the presence of other factors facilitating condom use (e.g., knowledge of HIV and condoms, their availability and affordability, and the belief that condoms can prevent HIV). Additionally, the in-depth exploration of ‘condom refusal’ identified sexual relationships where condom use may be perceived as less important because partners know their HIV status and live in stable relationships. Since heterosexual HIV transmission for both men and women often takes place within marriage or cohabitation, carefully tailored messages would also be needed here [36].
The tree sensitivity and specificity for condom refusal are lower than the tree sensitivity and specificity for condom use, indicating that the variables used assist better in detecting condom users than condom refusers.
This study is subject to some limitations. Data on sexual behaviour were self-reported, so a social desirability bias may apply, as is generally the case in studies using self-reported data to assess sexual risk behaviour. Another limitation is the issue of causality, as the study uses lifetime outcome measures with predictors measured at the time of the study. Moreover, we did not ask how frequently respondents changed partners, which could have provided further useful insights.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
TMC prepared the research protocol, procured funding from Free State Department of Health for the study, conducted the household survey, prepared data for the analysis, and participated in writing and reviewing the manuscript. PC participated in conducting the research, analysed the data and participated in the report writing. NS was the supervising investigator, conducted the data analyses, wrote the first draft and was the lead in the reviewing process. DB, RC, CN, and BGW substantially contributed to the interpretation of data, drafting the article critically for important intellectual content, and approved of the final version. All authors read and approved the final manuscript.