Principal results
Our study investigated how Ebola-related information diffused on Twitter using concepts from network analysis. We demonstrated the coexistence of two diffusion models of Ebola-related information on Twitter. The broadcast model represents one-to-many diffusion, while the viral model represents a chain of individual-to-individual diffusion. We found that the broadcast model was dominant in Ebola-related Twitter communication. Like the viral model, the broadcast model could also generate large information cascades. Furthermore, we found that influential users and hidden influential users could trigger more retweets than disseminators and common users. Disseminators and common users primarily spread information via the broadcast model. The disseminators’/common users’ tweets reached their followers, but only a small fraction of their followers retweeted them. If disseminators and common users were going to spread information beyond their immediate followers, they relied on influential and hidden influential users to retweet their tweets. If many of a disseminator’s/common user’s followers were influential or hidden influential users, then viral spreading might occur. The influential users retweeted the disseminator’s/common user’s tweets and then reached all of their followers. In this sense, the diffusion starts as a broadcast model (one-to-many) and then turns into a viral model (a chain of individual-to-individual).
Our study contributes to the existing literature in several ways. First, a previous study found that news media coverage, instead of individual-to-individual communication, dominated the dynamic patterns of Ebola-related Twitter activity in the US [2]. Our finding is consistent with their mathematical model in general: the broadcast model is pervasive. However, our analysis at the micro diffusion level suggests that viral spreading still has its unique roles. Even though mainstream media and health organization accounts (such as BBC, CDC, and WHO) were very influential in terms of triggering information cascades, most influential users were not media or health organizations. They could be celebrities (e.g., Barack Obama, Bill Gates) or sports organizations (e.g., FC Barcelona). In fact, the media accounts accounted for only a small proportion of all retweets in our data set. The discrepancy could be caused by the units of analysis. Towers et al.’s analyses [2] were at the aggregate level and the impact of media coverage was estimated including indirect effects. It is plausible that most of the celebrities or sports organizations in our data set actually were led by media coverage; however, the effect was not visible on Twitter. Second, our analysis was not limited to the differentiation of broadcast or viral diffusion models on Twitter. We introduced the identification of influential users [7] to extend previous studies on Ebola-related Twitter data. We found that broadcast and viral models were effective for different user types. Influential users and hidden influential users were more likely to create broadcast diffusion, whereas common users and disseminators were more likely to create viral diffusion. Finally, extending the concept of structural virality introduced by Goel et al. [4], we developed a normalized version of structural virality. The normalized structural virality does not depend intrinsically on the cascade size and can be used to analyze information cascades of all types of information across different social media platforms.
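To make the idea of structural virality concrete, the sketch below computes it as defined by Goel et al. (the mean shortest-path distance over all pairs of nodes in a retweet tree) and then rescales it. The rescaling shown here is an assumption made for illustration and is not necessarily the normalization developed in this study: it maps a star-shaped cascade (pure broadcast) of the same size to 0 and a chain (pure viral) of the same size to 1. The function names and the edge-list input format are likewise hypothetical.

```python
from collections import defaultdict, deque


def structural_virality(edges):
    """Mean shortest-path distance over all node pairs of a cascade tree.

    `edges` is a list of (parent, child) retweet links (assumed input format).
    The cascade is assumed to be connected.
    """
    adjacency = defaultdict(set)
    for parent, child in edges:
        adjacency[parent].add(child)
        adjacency[child].add(parent)
    nodes = list(adjacency)
    n = len(nodes)
    if n < 2:
        return 0.0
    total = 0
    for source in nodes:
        # Breadth-first search gives distances from `source` to every node.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in distance:
                    distance[neighbor] = distance[current] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
    # `total` counts each ordered pair once, so divide by n * (n - 1).
    return total / (n * (n - 1))


def normalized_structural_virality(edges):
    """Rescale so a star maps to 0 and a chain to 1 (illustrative assumption)."""
    n = len({node for edge in edges for node in edge})
    if n < 3:
        return 0.0  # star and chain coincide, so the rescaling is undefined
    star = 2 * (n - 1) / n  # mean pairwise distance in a star with n nodes
    chain = (n + 1) / 3     # mean pairwise distance in a path with n nodes
    return (structural_virality(edges) - star) / (chain - star)


# Two toy cascades with five users each (made-up data).
broadcast = [("seed", "a"), ("seed", "b"), ("seed", "c"), ("seed", "d")]
viral = [("seed", "a"), ("a", "b"), ("b", "c"), ("c", "d")]
print(structural_virality(broadcast), normalized_structural_virality(broadcast))
print(structural_virality(viral), normalized_structural_virality(viral))
```

Under this rescaling, two cascades of different sizes can be compared on the same 0-to-1 scale, which is the property the normalized measure is meant to provide; other normalizations with the same property are possible.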
Our findings are important because they may inform how we formulate public health communication strategy during outbreak emergency responses. If a certain type of information is more likely to diffuse via the broadcast model, it could be strategically advantageous to work with influential users and hidden influential users who can attract a large number of retweeters directly. However, if the information is more likely to spread virally, developing a successful strategy becomes more complicated because viral diffusion depends on the structure of the underlying social networks. For example, information in a cohesive network – where users are well-connected with each other – spreads relatively fast [11]. One strategy for health communication would then be to identify cohesive sub-communities within a network and then spread the information in each sub-community. However, we usually do not know the whole network structure on social media platforms, and therefore the identification of sub-communities within a network may not be feasible.
Through a retrospective observational study of Ebola-related Twitter data, our analysis showed that the broadcasting model was dominant on Twitter for tweets pertinent to an emerging infectious disease outbreak, and that the broadcasting model could generate large information cascades. This finding suggests that public health practitioners may be able to rely on the broadcasting model for large-scale dissemination of public health information during outbreak emergency responses. Although it is widely believed that the viral spreading model is popular on Twitter, this belief is not empirically supported in our analysis of Ebola-related tweets. Viral information cascades on Twitter are rare events, so public health agencies would not build communication strategies around them.
Given that the Twitter handles of many established public health agencies have more followers than followees, these Twitter handles are either “disseminators” or “influential users.” The practical question raised by health communication practitioners is how they can turn their Twitter handles from “disseminators” to “influential users” by attracting more retweets. Given the pervasiveness of the broadcasting model as observed in the retweeting patterns of Ebola-related tweets, establishing a large follower base (as did many CDC Twitter handles) appears to be the most straightforward answer.
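The four user types discussed here can be read as a two-by-two grid. The sketch below is a minimal illustration, assuming that one axis is whether a user has more followers than followees (as stated above) and that the other axis is whether the user attracts many retweets. The retweet threshold, the field names, and the function name are hypothetical; the study's own classification criteria are those summarized in Table 4.

```python
from dataclasses import dataclass


@dataclass
class TwitterUser:
    handle: str
    followers: int
    followees: int
    retweets_received: int


def classify_user(user, retweet_threshold=100):
    """Assign one of the four user types.

    `retweet_threshold` is a placeholder cutoff chosen for illustration,
    not a value taken from the study.
    """
    more_followers = user.followers > user.followees
    many_retweets = user.retweets_received >= retweet_threshold
    if more_followers and many_retweets:
        return "influential user"
    if more_followers:
        return "disseminator"
    if many_retweets:
        return "hidden influential user"
    return "common user"


# Made-up accounts, for illustration only.
examples = [
    TwitterUser("agency_a", followers=50000, followees=200, retweets_received=900),
    TwitterUser("agency_b", followers=50000, followees=200, retweets_received=12),
    TwitterUser("user_c", followers=300, followees=800, retweets_received=2500),
    TwitterUser("user_d", followers=300, followees=800, retweets_received=3),
]
for user in examples:
    print(user.handle, "->", classify_user(user))
```

In this reading, moving a handle from “disseminator” to “influential user” means crossing the retweet cutoff while the follower-to-followee relationship stays the same.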
However, an outstanding question remains: how can we communicate our health messages to Twitter users who have no interest in following public health agencies’ handles? If the broadcast model of information diffusion prevails, public health agencies’ messages would hardly ever reach these Twitter users. Our results suggest that future efforts would need to identify seed users who have the ability to trigger large-scale information cascades. Our findings suggest that influential users and hidden influential users are likely to be the most important seeds. However, collaborating with influential users who have many followers (such as celebrities) to support the cause of a specific health communication campaign may not always be a priority for public health agencies.
Hidden influential users would be the alternatives, as they can induce large-scale cascades beyond our expectation. However, another set of questions emerges: (a) How can we identify these hidden influential users? Can they be identified prospectively? (b) What makes these Twitter users “hidden influential”? Are these users necessarily individuals or organizations with whom public health agencies should engage?
The classification of Twitter users in Table 4 is retrospective in general; however, knowledge gained from a previous outbreak may be applied to a current outbreak emergency. Further validation is nonetheless required in future studies to ascertain user classification. The prospective identification of hidden influential users at the early stage of the communication process and the subsequent collaboration with them to propagate health messages are possible in theory but challenging in practice given the amount of work that is required to perform such analysis. The nature of the “hidden influential users” also requires our attention. Did they simply by chance write an Ebola-related tweet that became viral? Or are they individuals who are masters of online communication and can write tweets in a way that health organizations cannot? Published scholarly literature on Ebola-related Twitter data provides some insights into these highly viral tweets and who these “hidden influential users” are. Vorovchenko and colleagues [12] found that “humorous accounts” had a lot of engagement during the Ebola crisis, especially during October 2014 when Ebola cases were diagnosed in the United States. Our team’s own qualitative analysis also found that about one in four Ebola-related tweets in our dataset was either a joke or irrelevant to public health (unpublished data). Prior research on Twitter data pertinent to the 2009 H1N1 pandemic also identified humorous tweets in 8% of its sample [13]. The “hidden influential users” identified in our current study might be individuals who wrote jokes about Ebola on Twitter. These humorous tweets resonated with the emotions of many Twitter users at a juncture when many Americans were anxious about their own perceived risk of being infected with Ebola, and these tweets became viral. However, whether public health agencies should use humor in their Twitter communication so that their tweets have a viral effect is a matter subject to debate. Given that the reputation of the government and the public health sector at large is at stake, health communicators are likely to exercise extreme caution as they approach this suggestion.
It is worth noting that the time frame of 435 days of our data surpasses many published analyses of Ebola-related tweets. As highlighted in a 2016 review, the vast majority of published Ebola-related social media studies were analyses of data from a very short time frame [14]. As described by Fung et al. and Towers et al. [1, 2], Twitter users’ attention to the West African Ebola outbreak was minimal prior to Ebola cases in the U.S. and their interest in this topic dropped off afterwards. While the cut-off point of May 31, 2015 was arbitrary (as the data were purchased in early June 2015), our analysis encompassed the Ebola-related Twitter activities before, during, and after the waves of attention to this topic that were prominent in October 2014.
Limitations and future directions
First, the present study found that there is little difference between broadcasting and viral spreading models in terms of the number of retweets received. However, it remains unknown whether there are differences in terms of “reach” (the potential number of individuals exposed to the message), attitudes, and behavioral change. For example, some scholars claimed that interpersonal communication is more effective for behavioral change [6]. In addition, the “homophily” mechanism makes similar users gather together [15]; for example, users who follow the CDC official account on Twitter (@CDCgov) may be more similar to each other than those who do not. In this way, broadcasting may reach similar users, whereas viral spreading may reach heterogeneous users across different communities on social media platforms [8]. In this sense, although the broadcast model is predominant, viral spreading may be more beneficial for reaching diverse users. However, the lack of demographic data pertinent to Twitter users prevents us from learning more about user diversity, and thereby limits the generalizability and interpretability of the findings.
Second, this is a case study of Twitter information specific to Ebola. Our findings are consistent with previous studies using general tweets [4]. However, it is unknown whether the patterns will hold across different topics. For example, does Zika-related information diffuse on Twitter differently from Ebola-related information [16]? Following a similar line of thought, while prior cross-sectional studies categorized the contents of Ebola-related tweets and manually identified Ebola misinformation [17], future research may study whether Ebola-related misinformation spreads differently on Twitter networks compared with correct scientific information. A prior study has identified a difference between the response ratio of Twitter users (the number of individuals exposed to a piece of information divided by the number of individuals taking the action to retweet it or choosing not to retweet it) for 3 news stories and 10 rumors related to Ebola [18]. In terms of prevalence, structural virality, spread, retweets, and other quantitative measures, are there any significant differences between misinformation and scientific information? A study of publicly available Facebook data found that scientific information differed from conspiracy theories in terms of cascade dynamics [19]. Addressing these issues will allow public health communicators to identify and address misinformation.
Third, even though identifying the hidden influential users to assist in the diffusion of public health messages on Twitter could potentially be more effective than encouraging influential users to share critical public health information, we employed an ad hoc approach to identify them in the current study. Can we identify hidden influential users on Twitter (or other social media) prior to or during an emergency response? In this study, we identified many media and health organizations that were influential users. However, we also found that most influential users were not media or health organizations. Future studies are required to find a more convenient and efficient way to identify hidden influential users.
Finally, the present study found that the broadcasting model was dominant among Ebola-related tweets. However, we do not know whether the combination of broadcasting and viral spreading strategies can facilitate the diffusion of health information beyond the additive effect.