Summary of findings and implications
CSRs have previously been considered an ‘untapped’ source of detailed information relating to the design, conduct and analysis of clinical trials [1]. The value of the information within CSRs is becoming increasingly recognised within the academic research community, particularly within the Cochrane Collaboration [20, 21, 29].
Publicly available summary data of clinical trial results from journal publications and trial registries may be suitable and sufficient to support some types of secondary analyses. However, an anonymised CSR provides complete information on study design and statistical methods, the interpretation of results, and the full set of results and statistics for all endpoints. Such a resource would certainly allow a more in-depth appraisal, including verification of numerical results, detailed assessment of bias such as selective outcome reporting, and the conduct of novel analyses such as systematic reviews and meta-analyses using data from all endpoints.
EMA Policy 0070 ‘Phase 1’, where anonymised CSRs are made public, is likely to further facilitate the secondary use of the information within CSRs for appraisal and secondary analysis. However, little consideration has been given to the data utility of the anonymised information within CSRs under EMA Policy 0070. The objective of this review was to investigate the types of research purposes and research methodologies employed in previous academic research using data from CSRs, and to hypothesise the impact of EMA Policy 0070 on the ‘data utility’ of future research of this kind.
Based on the number of requests made under EMA Policy 0043, we anticipate that academic researchers or research groups and the pharmaceutical industry are likely to be the primary recipients of anonymised CSRs under EMA Policy 0070. The research examples we discuss within this review indicate that the objectives and scopes of secondary analyses and novel research that have been conducted using CSR data are wide-ranging (Table 1, Additional file 1: Table S1). Authors of such research have communicated to us their concerns over the type of research that could be conducted in the future if information such as participant listings or narratives is redacted or removed completely under EMA Policy 0070 (Additional file 2).
It is of utmost importance that the anonymised CSR retains the same conclusions as the original CSR (prior to any anonymisation), with comparable numerical results for all primary, secondary and safety endpoints. There are examples in the literature of how narratives are used to verify safety conclusions (see Additional file 1: Table S1). The handling of narratives, together with the handling of in-text listings, seems to be the most challenging aspect of EMA Policy 0070 from a technical standpoint, and different levels of or approaches to anonymisation would in turn determine different levels of data utility.
Certain free-text fields, such as Adverse Event Reported Terms, have been shown to be instrumental for certain secondary analyses, for example to verify dictionary coding and conduct re-analysis [12–14]. Further, preserving Subject IDs and dates in an anonymised format makes it possible to follow the events of a participant, using the sequence of events and the distances between them to further understand the drug safety profile. An ongoing review of CSRs published under EMA Policy 0070 indicates that free-text variables (such as narratives) tend to be fully redacted when a dictionary-coded variable is available and deemed to be better suited for analysis [25]. The PhUSE De-Identification standard [32] recommends following this rationale as a primary rule in the case of pro-active release of data, with a secondary rule to “Review and redact PII” in such free-text variables. Researchers are therefore advised to make clear in their requests, in the context of their research objective, whether certain free-text variables (with all PII redacted) are required, even if a dictionary-coded variable is available in the given data domain.
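The idea of keeping Subject IDs and dates in an anonymised form while preserving the sequence and spacing of a participant's events can be illustrated with a minimal sketch. It is an assumption made for illustration only and is not taken from EMA Policy 0070 or the PhUSE standard: Subject IDs are replaced by a keyed-hash pseudonym and calendar dates by day offsets from each participant's first recorded event. The function names, the secret key and the example records are all hypothetical.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret held only by the data custodian (assumption for this sketch).
SECRET_KEY = b"replace-with-a-private-key"


def pseudonymise_id(subject_id: str, key: bytes = SECRET_KEY) -> str:
    """Map a Subject ID to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, subject_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "SUBJ-" + digest[:10]


def to_relative_days(events):
    """Replace calendar dates with day offsets from each subject's first event.

    `events` is a list of (subject_id, event_date, term) tuples.
    Returns (pseudonym, study_day, term) tuples sorted by pseudonym and
    study day, so the order of events and the gaps between them are kept.
    """
    # Find the earliest recorded date for each subject.
    first_date = {}
    for subject_id, event_date, _ in events:
        if subject_id not in first_date or event_date < first_date[subject_id]:
            first_date[subject_id] = event_date

    anonymised = [
        (pseudonymise_id(subject_id), (event_date - first_date[subject_id]).days, term)
        for subject_id, event_date, term in events
    ]
    return sorted(anonymised, key=lambda row: (row[0], row[1]))


# Invented example records (not taken from any CSR).
events = [
    ("1001", date(2015, 3, 1), "first dose"),
    ("1001", date(2015, 3, 15), "headache"),
    ("1001", date(2015, 4, 2), "nausea"),
    ("1002", date(2015, 5, 10), "first dose"),
    ("1002", date(2015, 5, 12), "rash"),
]

for row in to_relative_days(events):
    print(row)
```

In this sketch a reader of the output can still see that one participant's headache occurred 14 days after the first dose, without seeing the original ID or calendar dates. Whether such a transformation is sufficient in a given release would depend on a re-identification risk assessment, which the sketch does not attempt.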
Two main general types of analysis emerged from this research: Appraisal and Secondary Analysis. The different objectives across these two general analysis types should help prioritise anonymisation methods from a data utility perspective, in addition to data privacy considerations. The classification of research objectives also provides more guidance for developing a qualitative approach to assess and document the data utility of anonymised CSRs in line with EMA Policy 0070 [17], the Health Canada Public Release of Clinical Information Policy [33] or other contexts.
Further understanding of the safety profile of drugs and verification of how the conclusions of clinical studies are derived would certainly provide added value for many stakeholders and data consumers, including patients themselves. However, several journal publications that were examined within this review and described in Additional file 1: Table S1 have had their findings challenged by the sponsoring pharmaceutical companies through comments on journal websites. Discussion of academic findings and interpretations should always be encouraged, but there is a risk that ‘rapid-response’ additional analyses issued as a challenge to published research may confuse readers and secondary data consumers, such as clinical practitioners, patients and patient associations, who cannot determine which of the many published results are the ‘correct’ ones. Bonini et al. 2014 [34] also note that “access to clinical data imposes a high ethical standard on anyone using those data, lest inappropriate reanalyses breed unjustified concern about the efficacy or safety of marketed drugs.” We (SJN and JMF) are of the opinion that communication (and potentially collaboration) between academic research groups and pharmaceutical companies should be encouraged during research projects, both regarding interpretations of regulatory documents such as protocols, statistical analysis plans and CSRs, and regarding the interpretation of results of secondary analyses from their different perspectives. Such communication and discussion, occurring before any results of secondary analyses are published in journals, are likely to provide the most informative novel results and, in turn, the most benefit to readers and data consumers.
Limitations and future considerations
It must be emphasised that the examples of academic research using CSRs summarised within this review (Table 1 and Additional file 1: Table S1) are a selective sample and do not necessarily represent all research objectives which would make use of anonymised CSRs under EMA Policy 0070. Further, most observations provided to us by the authors of this research and our interpretations are rhetorical and speculative, rather than based on direct experience of anonymised CSRs published under EMA Policy 0070 and the validity of these observations may not become clear for some time.
The planned ‘Phase 2’ of EMA Policy 0070, which extends to the sharing of individual participant data (IPD), should provide the next level of data utility that is required to conduct robust secondary analyses. A number of sponsors already provide access to anonymised IPD via data sharing platforms [22, 26] based on research requests for studies under the European Federation of Pharmaceutical Industries and Associations (EFPIA) principles for responsible clinical trial data sharing [35]. ‘Phase 2’ of EMA Policy 0070, when in effect, should in principle standardise access to anonymised IPD for studies that are part of a central application in the European Union, regardless of the outcome of the application. The current needs of the research community, which may include access to individual participant information such as full patient listings (out of scope of EMA Policy 0070 ‘Phase 1’), will be better addressed in ‘Phase 2’ of the policy, where individual patient datasets are in scope.
In the context of EMA Policy 0070, where anonymised CSRs are made public, a myriad of data recipient groups could be considered, together with their various objectives for reviewing and using the information within these anonymised CSRs. Their needs could differ from those of the academic research community and could be worth investigating at a later point.
In addition, at the time of writing, other regulatory agencies are defining, finalising and publishing their data transparency initiatives. The FDA made an announcement in January 2018 [36] and started pilots with pharmaceutical companies in which redaction is the only anonymisation method and full listings of participant narratives are out of scope [37]. Health Canada started developing regulations around public access to clinical documents in 2017 and released a draft guidance for review in the second quarter of 2018 [33]. Differences in requirements between the EMA Policy 0070 guidance and the FDA and Health Canada approaches (under development) [38] may also lead to different anonymised versions of the same document in the public domain. Only when all policies are finalised will it become clear which versions, under which jurisdiction, best serve the needs of the academic community and others.