Background
Each year, thousands of people die from an adverse drug reaction, defined as an undesirable health effect that occurs when medication is used as prescribed [1]. Adverse drug reactions can vary from a simple rash to more severe effects, such as heart failure, acute liver injury, arrhythmias, and even death [1]. These events have a significant impact on both patients and the health care system in terms of cost and health service utilization (e.g., frequent visits to physicians and emergency departments, hospitalizations) [2].
Post-marketing adverse drug reaction surveillance in most countries is suboptimal and consists largely of spontaneous reporting. It is estimated that spontaneous reporting systems only capture 1–10% of all adverse drug reactions. For example, one out of every five physicians reports an adverse drug reaction using the Canada Vigilance Database [3].
In order to advance pharmacovigilance (defined as the science and activities related to detection, comprehension and prevention of adverse drug events) [4], monitoring and analysis of data collected from social media sources (i.e., social media listening) is being researched as a potential supplement to traditional drug safety surveillance systems. Three reviews [5–7] have recently been published to explore the breadth of evidence on the methods and use of social media data for pharmacovigilance; however, none of the reviews found rigorous evaluations of the reliability and validity of the data.
As this is a rapidly evolving field, we conducted a comprehensive scoping review to assess the utility of social media data for detecting adverse events related to health products, including pharmaceuticals, medical devices, and natural health products.
Methods
Research questions
The specific research questions were:
(1) Which social media listening platforms exist to detect adverse events related to health products, and what are their capabilities and characteristics?
(2) What is the validity and reliability of data from social media for detecting these adverse events?
Study design
We used a scoping review method to map the concepts and types of evidence that exist on pharmacovigilance using social media data [8]. Our approach followed the rigorous scoping review methods manual by the Joanna Briggs Institute [9].
Protocol
The Preferred Reporting Items for Systematic Reviews and Meta-analysis Protocols (PRISMA-P) [10] guideline was used to develop our protocol, which we registered with the Open Science Framework [11] and published in a peer-reviewed journal [12]. The protocol was developed by the research team and approved by members of the Health Canada Health Products and Food Branch, the commissioning agency of this review. Since the full methods have been published in the protocol [12], they are briefly outlined below.
Eligibility criteria
The eligibility criteria were any type of document (e.g., journal article, editorial, book, webpage) that described listening to social media data for detecting adverse events associated with health products (see Additional file 1: Appendix 1). The following interventions were excluded from our review: programs of care, health services, organization of care, as well as public health programs and services. Documents related to the mining of social media data to detect prescription drug misuse and abuse were eligible for inclusion.
Social media listening was defined as mining and monitoring of user-generated and crowd-intelligence data from online conversations in blogs, medical forums, and other social networking sites to identify trends and themes of the conversation on a topic (see Additional file 1: Appendix 2). We included documents that reported on at least one of the following outcomes: social media listening approaches, utility of social media data for pharmacovigilance and their performance capabilities, validity and reliability of user-generated data from social media for pharmacovigilance, and authors’ perceptions of the utility and challenges of using social media data.
Comprehensive literature searches were conducted in MEDLINE, EMBASE, and the Cochrane Library by an experienced librarian. The MEDLINE search strategy was peer-reviewed by another librarian using the PRESS checklist [13]; the strategy has been published in our protocol [12] and is also available in Additional file 1: Appendix 3. In addition, we searched the grey literature (i.e., difficult to locate, unpublished documents) sources outlined in Additional file 1: Appendix 4 using the Canadian Agency for Drugs and Technologies in Health guide [14], and scanned the reference lists of relevant reviews [5, 6, 15].
Study selection process
After the team achieved 75% agreement on a pilot-test of 50 random citations, each citation was independently screened by reviewer pairs (WZ, EL, RW, PK, RR, FY, BP) using Synthesi.SR, an online application developed by the Knowledge Translation Program [16]. Potentially relevant full-text documents were obtained and the same process (described above) was followed for full-text screening.
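As an illustration of the pilot-test threshold, the sketch below computes simple percent agreement between two reviewers' screening decisions. It assumes agreement was measured as the proportion of citations on which both reviewers made the same decision; the source does not state how the 75% figure was calculated, and the function name and decision labels are hypothetical.

```python
def percent_agreement(reviewer_a, reviewer_b):
    """Return the percentage of citations on which two reviewers agree.

    Assumption: agreement is the simple proportion of matching
    include/exclude decisions, not a chance-corrected statistic.
    """
    if len(reviewer_a) != len(reviewer_b):
        raise ValueError("Both reviewers must screen the same citations")
    if not reviewer_a:
        raise ValueError("At least one citation is required")
    matches = sum(a == b for a, b in zip(reviewer_a, reviewer_b))
    return 100.0 * matches / len(reviewer_a)


# Hypothetical pilot of 50 citations: the reviewers disagree on 10 of them.
decisions_a = ["include"] * 40 + ["exclude"] * 10
decisions_b = ["include"] * 50
agreement = percent_agreement(decisions_a, decisions_b)
print(f"Agreement: {agreement:.0f}%")  # meets a 75% threshold
```

Simple percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would need a different calculation.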
Data items and data abstraction process
Data were abstracted on document characteristics (e.g., type of document), population characteristics of social media users (e.g., disease), characteristics of social media data (e.g., social media source), characteristics of social media listening approaches (e.g., pre-processing), and performance of the different approaches (e.g., validity and reliability of social media data). After the team pilot-tested the data abstraction form using a random sample of 5 included documents, each document was abstracted by one reviewer (WZ, EL, RW, PK, RR, FY, BP) and verified by a second reviewer (WZ, EL). The data were cleaned by a third reviewer (WZ, EL) and confirmed by the content expert (SJ, GH).
Risk of bias assessment or quality appraisal
Risk of bias or quality appraisal was not conducted, which is consistent with the Joanna Briggs Institute methods manual [9] and with the methods documented in scoping reviews on health-related topics [17].
Synthesis of results
To characterize the health conditions studied, the World Health Organization version of the International Statistical Classification of Diseases and Related Health Problems (10th Revision, ICD-10) was used [18]. The social media system characteristics were described and categorized according to the steps typically involved in a social media data processing pipeline [19]. In addition, the social media systems were classified according to whether they were manual systems (i.e., coded by hand, without computer assistance), experimental/developmental stage systems (i.e., automatic information extraction systems being developed by researchers), or fully developed systems (i.e., automatic information extraction systems that are either commercially available or being used by regulatory agencies).
Descriptive statistics (e.g., frequencies, measures of central tendency) were calculated using Excel 2010. Thematic analysis of open-text data was performed by two reviewers (WZ, EL) and verified by a third reviewer (ACT or SJ) to categorize the authors’ perceptions of the utility and challenges of using social media data for pharmacovigilance [20].
Discussion
Most of the documents included in our scoping review dated from 2013 onwards. We identified seven pre-existing social media platforms, and another platform (Web-RADR Social) that is currently under development by European regulators. Unfortunately, no information on when this social media platform will be completed was provided. The majority of the documents primarily focused on the development of social media listening tools for pharmacovigilance (as opposed to their application), which would be useful for those interested in developing such platforms. In particular, documents authored by Freifeld et al. [74], Karimi et al. [83], and Vinod et al. [19] provide useful information on the development of such platforms.
We identified 19 documents providing some information on the utility of social media. This information was mostly abstracted from the discussion section of the documents, suggesting that the conclusions were highly speculative. Furthermore, most of the included documents only followed social media posts for a median duration of 1 year. A high-quality study that examines utility over a longer timeframe with a broader data frame may provide further useful information to the field.
According to authors’ perceptions, social media can be used to supplement traditional reporting systems, to uncover adverse events less frequently reported in traditional reporting systems, to communicate risk and to generate hypotheses. However, challenges exist, such as difficulties interpreting relationships between the drugs and adverse events (e.g., there are inadequate data to draw causality), potential lack of representativeness between social media users and the general population, and the resource-intensive process of using social media data for pharmacovigilance. Evaluation studies of pharmacovigilance using social media listening are needed to substantiate these perceptions. Future studies should also consider evaluating the performance and utility of integrating social media data with other data sources, such as regulatory databases that collect spontaneous reports, as well as relevant surveillance databases.
Our results have summarized the most common elements involved in the processing of social media data for pharmacovigilance. Across the included documents, the most common steps employed were: 1) pre-processing; 2) de-identification; 3) de-duplication; 4) concept identification; 5) concept normalization; and 6) relation extraction. Validity and reliability findings varied across the different approaches that were used to mine the data, which suggests some may be more effective than others. However, the heterogeneous nature of the study designs and approaches reported in the documents makes it difficult to definitively determine which approaches are more useful than others.
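To make the six processing steps concrete, the following is a minimal sketch of how they might be chained together. It is not the method of any included document: the lexicons, function names, placeholder tokens, and the use of simple within-post co-occurrence for relation extraction are all illustrative assumptions, and real systems rely on far larger vocabularies and more sophisticated techniques.

```python
import re

# Hypothetical toy lexicons mapping surface forms to a preferred term.
DRUG_LEXICON = {"ibuprofen": "ibuprofen", "advil": "ibuprofen"}
EVENT_LEXICON = {"rash": "rash", "hives": "urticaria", "dizzy": "dizziness"}


def preprocess(post):
    """Step 1: lower-case the post and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", post.lower()).strip()


def deidentify(post):
    """Step 2: mask e-mail addresses first, then @user handles."""
    post = re.sub(r"\S+@\S+", "<email>", post)
    return re.sub(r"@\w+", "<user>", post)


def extract_relations(posts):
    """Run all six steps and return (drug, event) pairs per unique post."""
    cleaned = [deidentify(preprocess(p)) for p in posts]
    # Step 3: drop exact duplicates while preserving the original order.
    unique_posts = list(dict.fromkeys(cleaned))
    relations = []
    for post in unique_posts:
        # Step 4: concept identification by lexicon lookup on word tokens.
        tokens = re.findall(r"[a-z]+", post)
        # Step 5: concept normalization to the lexicon's preferred term.
        drugs = sorted({DRUG_LEXICON[t] for t in tokens if t in DRUG_LEXICON})
        events = sorted({EVENT_LEXICON[t] for t in tokens if t in EVENT_LEXICON})
        # Step 6: naive relation extraction, pairing every drug with every
        # event mentioned in the same post (co-occurrence, not causality).
        relations.extend((d, e) for d in drugs for e in events)
    return relations


posts = [
    "Took Advil and got HIVES",
    "took advil   and got hives",  # duplicate after pre-processing
    "Feeling dizzy after ibuprofen, email me at a@b.com",
]
print(extract_relations(posts))
```

Co-occurrence within a post is the weakest form of relation extraction and, consistent with the challenges noted by the included documents, cannot establish that the drug caused the event.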
As described in our protocol, we conducted this scoping review to inform members of Health Canada who are currently using our results to plan an evaluation study on utility of social media for detecting health product adverse events. They may also consider a Canadian platform to be developed in the future, depending on the results of their study.
Our results are similar to those of three other reviews on this topic. A recent review by Sarker et al. [7] described the different automatic approaches used to detect and extract adverse drug reactions from social media data for pharmacovigilance purposes in studies published in the last 10 years. Although the authors characterized existing social media listening and analytics platforms, validity and reliability of the user-generated data captured through social media and crowd-sourcing networks were not examined. Golder and colleagues [5] conducted a systematic review on adverse events data in social media. They found that although reports of adverse drug events can be identified through social media, the reliability or validity of this information has not been formally evaluated. Finally, Lardon and colleagues [6] conducted a scoping review on the use of social media for pharmacovigilance. They identified numerous ways to identify adverse drug reaction data, extract data, and verify the quality of information. However, gaps in the field were identified. For example, most studies identifying adverse drug reactions failed to verify the reliability and validity of the data and none of the studies proposed a feasible way to integrate data from social media across more than one site/information source.
The strengths of our scoping review include a comprehensive search of multiple electronic databases and sources for difficult to locate and unpublished studies (or grey literature), as well as the use of the rigorous scoping review methods manual by the Joanna Briggs Institute. In addition, we included researchers with computer science expertise (SJ, GH) to help code automated approaches. In terms of a dissemination plan, we will use a number of strategies, such as: a 1-page policy brief, two stakeholder meetings (i.e., consultation exercises), presentations at an international conference, and publications in open-access journals. Team members will also use their networks to encourage broad dissemination of results.
There are some limitations to our scoping review process. The review was limited to documents written in English to increase its feasibility, given the 5-month timeline. Additionally, due to the large number of documents identified, the data were abstracted by one reviewer and verified by a second reviewer. Lastly, although our literature search was comprehensive, there is always a chance that some social media platforms or data analytics documents were missed. Since this is a rapidly evolving and emerging field, we expect that new documents fulfilling our inclusion criteria will be released in increasing numbers [91, 92], highlighting a potential need to update our review in the near future.
Acknowledgements
We thank Dr. Elise Cogo for developing the literature search, Dr. Jessie McGowan for peer-reviewing the literature search, and Ms. Alissa Epworth for performing the database and grey literature searches and for all library support. We also thank Ms. Fatemeh Yazdi for helping with some screening and data extraction and Theshani De Silva for formatting the manuscript. Last but not least, we thank Jesmin Antony and Myanca Rodrigues, as well as the Health Canada Health Products and Food Branch, in particular Dr. Laurie Chapman, for their feedback on the manuscript.