Background
Community-acquired pneumonia (CAP) is a leading cause of morbidity and the leading cause of death from infectious disease in the European region. Approximately 3 million cases of CAP are reported annually in Europe, of which one-third are hospitalized [1]. Infectious pneumonia corresponds to the multiplication of microorganisms in the alveoli, which causes a combination of non-specific pulmonary and general symptoms with numerous alternative diagnoses [2]. Because the infection is deep-seated, microbiological data, which may help to establish a diagnosis, are reported in fewer than 50% of patients, and the diagnosis and aetiology frequently remain uncertain [3]. Thoracic CT-scan, which has proved effective in confirming or ruling out CAP, is probably the best non-invasive diagnostic tool to date [4–6]. However, its cost and limited availability compared to chest X-ray prevent its widespread use as a diagnostic criterion.
Because of the heterogeneity of clinical presentation, there are to date neither universal diagnostic criteria to define CAP nor a validated diagnostic classification. Consequently, CAP definitions differ according to country, medical specialty and practice guidelines [7–9]. This lack of universal CAP diagnostic criteria might have consequences for clinical practice, epidemiological analyses and the validity of randomized controlled trials (RCTs). A valid and adequately powered evaluation of an intervention requires the inclusion of a sufficient number of individuals who truly have the targeted disease and who are representative of the entire infected population. This is even more important in non-inferiority trials, a frequently used methodology in CAP RCTs [10], in which the inclusion of patients without the targeted disease might result in an inaccurate estimate of the difference between study arms and an incorrect conclusion of non-inferiority of the evaluated intervention. Therefore, an assessment of existing diagnostic criteria for CAP is essential for clinical research.
In the present study, our principal objective was to assess the heterogeneity of CAP diagnostic criteria used in RCTs and to identify different patterns of CAP inclusion criteria. Furthermore, using the results of a previously conducted study [4], in which the diagnosis of CAP was established from all currently available data together with an early systematic thoracic CT-scan, we assessed the performance (i.e. sensitivity, specificity, positive and negative predictive values, and likelihood ratios) of the different CAP inclusion criteria patterns when applied to this reference population [4].
Methods
Selection and data extraction from randomized controlled trials
We searched ClinicalTrials.gov using the key words “community-acquired pneumonia” (last accessed on 1 October 2018). The eligible trials were RCTs including adults with CAP, irrespective of the trials’ primary objective. Trials including a paediatric population or severe CAP, and those withdrawn before enrolment, were excluded.
The following data were extracted from the ClinicalTrials.gov website: ClinicalTrials.gov identifier (NCT number); declaration, start and completion dates; type of sponsor (industrial or academic); study design; primary objective of the study; the full list of inclusion criteria (including the CAP diagnostic criteria); presence of a CAP severity score (Pneumonia Severity Index or CURB-65); and the list of exclusion criteria. The study design included the number of centres, national or international recruitment, the location of inclusion (community or hospital), the blinding method (single-blind, double-blind or open-label trial), the superiority or non-inferiority design, and the field of the study (evaluation of diagnostic criteria, evaluation of biomarkers, choice or duration of antibiotic treatment). Data collection was performed independently by two investigators (JLB and CF). Disagreements were resolved by consensus.
We combined different methods to collect CAP diagnostic criteria. First, we collected the data available in the ClinicalTrials.gov declaration. We also systematically contacted the investigators by e-mail to confirm and complete their CAP diagnostic criteria. A second e-mail was sent 1 month later to the non-responders. Finally, when CAP diagnostic criteria were still not available, we searched PubMed for publications and looked for CAP diagnostic criteria in the full article, when available.
Diagnostic criteria for CAP
For each trial, the criteria used to establish the CAP diagnosis were collected. To allow comparison, these criteria were categorized as 1) respiratory symptoms (dyspnea, chest pain, cough, sputum), 2) pulmonary auscultation abnormalities, 3) general symptoms (fever, malaise, chills), 4) biological criteria (leucocytosis, C-reactive protein increase, procalcitonin increase) and 5) thoracic radiological criteria.
For each trial, we specified which criteria were mandatory and which were optional to establish the CAP diagnosis, and how they were combined. As there were numerous combinations of mandatory and optional criteria to define CAP (and therefore to include CAP patients), we grouped together CAP definitions with closely similar criteria. In this report, we use the term “CAP definition patterns” to refer to these different patterns of CAP inclusion criteria.
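As an illustration of how a combination of mandatory and optional criteria can be encoded, the following minimal Python sketch represents a definition pattern over the five criterion categories listed above. The class name `DefinitionPattern`, its fields and the example pattern are hypothetical and are not taken from the trials analysed.

```python
from dataclasses import dataclass

# The five criterion categories described in the text.
CATEGORIES = frozenset(
    {"respiratory", "auscultation", "general", "biological", "radiological"}
)


@dataclass(frozen=True)
class DefinitionPattern:
    """Hypothetical encoding of one CAP definition pattern."""

    mandatory: frozenset               # categories that must all be present
    optional: frozenset = frozenset()  # categories counted towards min_optional
    min_optional: int = 0              # number of optional categories required

    def matches(self, present: set) -> bool:
        """Return True if the categories present in a patient satisfy the pattern."""
        if not self.mandatory <= present:
            return False
        return len(self.optional & present) >= self.min_optional


# Illustrative pattern only: radiological criterion mandatory,
# plus at least two of three optional categories.
example = DefinitionPattern(
    mandatory=frozenset({"radiological"}),
    optional=frozenset({"respiratory", "general", "biological"}),
    min_optional=2,
)
```

Under this encoding, a patient with radiological, respiratory and general criteria satisfies `example`, whereas a patient with no radiological criterion does not, whatever the other findings.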
Reference population
To assess the performance of each CAP definition pattern, the different patterns were applied to the “ESCAPED” database. ESCAPED is a prospective, multicentre, interventional study that assessed the impact of a systematic thoracic CT-scan on the diagnosis of CAP in patients visiting the emergency department for suspected CAP [4]. Patients were included in the database if 1) they presented at least one symptom of systemic infection (temperature > 38 °C or < 36 °C, heart rate > 90/min, respiratory rate > 20/min) AND 2) they presented one recent respiratory symptom (cough, lateral chest pain, sputum, dyspnea, localized crackles) AND 3) they had undergone chest radiography AND 4) the clinician suspected CAP. The inclusion criteria did not include an infiltrate on chest X-ray, which allowed the inclusion of true CAP cases without infiltrate on chest X-ray, in which pulmonary involvement was subsequently established by thoracic CT-scan.
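The four inclusion conditions of the reference database can be written as a single predicate. The sketch below is illustrative only: the function and parameter names are hypothetical, and "one recent respiratory symptom" is interpreted here as "at least one", which is an assumption.

```python
def meets_escaped_inclusion(temp_c, heart_rate, resp_rate,
                            respiratory_symptoms, had_chest_xray, cap_suspected):
    """Return True if a patient meets the four inclusion conditions in the text.

    respiratory_symptoms: collection of recent respiratory symptoms observed
    (cough, lateral chest pain, sputum, dyspnea, localized crackles).
    """
    # 1) At least one symptom of systemic infection.
    systemic = (temp_c > 38 or temp_c < 36
                or heart_rate > 90 or resp_rate > 20)
    # 2) Assumption: "one" respiratory symptom is read as "at least one".
    respiratory = len(respiratory_symptoms) >= 1
    # 3) Chest radiography performed AND 4) clinician suspects CAP.
    return bool(systemic and respiratory and had_chest_xray and cap_suspected)
```

Note that no radiological finding appears in the predicate, which reflects the absence of chest X-ray infiltrate from the inclusion criteria.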
For each of the 319 included patients, an adjudication committee composed of senior specialists in radiology, pulmonology and infectious diseases rated the diagnostic probability of CAP on a 4-level Likert scale (definite; probable; possible; excluded) using all available data at day 28 post-diagnosis. Based on the evaluation of the adjudication committee, there were 150 definite CAP cases, 13 probable CAP cases, 34 possible CAP cases and 122 excluded CAP cases [4]. For the present study, we grouped definite and probable CAP into a single category and considered 163 patients as having CAP, compared to 156 who were considered as not having CAP.
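The dichotomization of the 4-level adjudication scale can be checked with a few lines of Python. The counts are those reported above; the function name `dichotomize` is hypothetical.

```python
# Adjudication committee classification of the 319 patients (from the text).
LEVEL_COUNTS = {"definite": 150, "probable": 13, "possible": 34, "excluded": 122}

# Levels grouped as "having CAP" in the present study.
CAP_LEVELS = {"definite", "probable"}


def dichotomize(level_counts, cap_levels=CAP_LEVELS):
    """Return (number with CAP, number without CAP) for the given grouping."""
    with_cap = sum(n for level, n in level_counts.items() if level in cap_levels)
    total = sum(level_counts.values())
    return with_cap, total - with_cap


with_cap, without_cap = dichotomize(LEVEL_COUNTS)
print(with_cap, without_cap)  # prints: 163 156
```

Passing `cap_levels={"definite"}` instead gives the stricter grouping in which only definite cases count as CAP.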
The comprehensiveness of the clinical, biological and radiological data collected in the ESCAPED database, which covered all of the inclusion criteria used in the RCTs identified, allowed us to apply each CAP definition pattern to this database. For each CAP definition pattern, we aimed to determine whether it accurately identified patients with CAP in the reference population of the ESCAPED database. We determined, among the 319 patients of the ESCAPED database, the number of patients who would have been included in or excluded from the RCT according to each CAP definition pattern. For each patient, we already knew whether the adjudication committee had classified the case as “definite or probable CAP” or as “excluded or possible CAP”.
This resulted, for each CAP definition pattern evaluated, in a number of “true positive” patients (i.e. considered as having CAP by both the evaluated CAP definition pattern and the adjudication committee), a number of “false positive” patients (i.e. considered as having CAP by the evaluated CAP definition pattern but not by the adjudication committee), a number of “true negative” patients (i.e. considered as not having CAP by both the evaluated CAP definition pattern and the adjudication committee) and a number of “false negative” patients (i.e. considered as not having CAP by the evaluated CAP definition pattern but as having CAP by the adjudication committee).
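The cross-classification described above amounts to tallying a 2 × 2 table. A minimal sketch follows, assuming two parallel sequences of booleans (one per patient); the function name `tally` and the toy data are hypothetical.

```python
from collections import Counter

# Map (pattern says CAP, committee says CAP) to the cell of the 2 x 2 table.
CELL = {
    (True, True): "TP",
    (True, False): "FP",
    (False, False): "TN",
    (False, True): "FN",
}


def tally(pattern_says_cap, committee_says_cap):
    """Count true/false positives/negatives from two parallel boolean sequences."""
    counts = Counter({"TP": 0, "FP": 0, "TN": 0, "FN": 0})
    for by_pattern, by_committee in zip(pattern_says_cap, committee_says_cap):
        counts[CELL[(bool(by_pattern), bool(by_committee))]] += 1
    return dict(counts)


# Toy data for four hypothetical patients (not study data).
print(tally([True, True, False, False], [True, False, False, True]))
# prints: {'TP': 1, 'FP': 1, 'TN': 1, 'FN': 1}
```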
Statistical analysis
For each CAP definition pattern, after determining the number of “true positive”, “true negative”, “false positive” and “false negative” patients, we calculated sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and positive and negative likelihood ratios for the diagnosis of CAP. All tests were two-sided, and p-values below 0.05 were considered to denote statistical significance. All statistical analyses were performed with SAS software (version 9.3).
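For reference, the performance measures listed above follow from the four cell counts by their standard definitions. The Python sketch below is not the SAS code used in the study; the function names are hypothetical and the example counts are invented for illustration.

```python
import math


def _ratio(num, den):
    """Divide, returning inf (or nan for 0/0) when the denominator is zero."""
    if den == 0:
        return math.nan if num == 0 else math.inf
    return num / den


def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic performance measures from a 2 x 2 table."""
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "lr_positive": _ratio(sensitivity, 1 - specificity),
        "lr_negative": _ratio(1 - sensitivity, specificity),
    }


# Invented counts, for illustration only (not results of the study).
metrics = diagnostic_metrics(tp=80, fp=20, tn=80, fn=20)
```

With these invented counts, sensitivity, specificity, PPV and NPV all equal 0.8, the positive likelihood ratio is about 4 and the negative likelihood ratio about 0.25.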
Discussion
In this study, we report considerable heterogeneity of CAP inclusion criteria in RCTs. By applying these criteria to a reference population explored through systematic CT-scan, we underline the potential risk of including patients without CAP in RCTs focused on CAP.
We show that the heterogeneity of CAP inclusion criteria concerns the number of criteria, their type (pulmonary or general symptoms, biological criteria, radiological abnormalities), their optional or mandatory nature and their multiple combinations, and that it is not explained by the characteristics of the trials, their objectives, their methodology or their sponsor.
When applied to a single reference population of patients with suspected CAP visiting the emergency department, this variety of CAP definition patterns resulted in variations in the number of patients identified as having CAP and in the accuracy of their diagnosis. Furthermore, it affected the proportion of patients with or without definite CAP who would be included in RCTs using these inclusion criteria (the number of false positive patients varied by a factor of 6). This variability in the included population has implications for the trials’ internal and external validity. Indeed, the evaluation of a therapy in individuals who do not have the disease negatively impacts the internal validity of both non-inferiority and superiority trials. In non-inferiority trials (the majority of those analysed), factors that result in smaller differences between study groups can lead to the false conclusion that the new treatment is not inferior to the gold standard [11–13]. This is particularly true for CAP definition patterns with a high rate of false positive CAP. In superiority trials, including patients without CAP would result in a loss of statistical power to demonstrate the superiority of a potentially effective therapy.
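The dilution mechanism can be illustrated with a deliberately simplified model. It assumes, as a hypothesis not tested in this study, that included patients without CAP have the same outcome in both arms, so that the observed between-arm difference is the true difference scaled by the proportion of patients who actually have CAP. The function name and the numerical values are hypothetical.

```python
def observed_difference(true_difference, non_cap_fraction):
    """Between-arm difference observed when a fraction of patients lack CAP.

    Simplifying assumption: patients without CAP respond identically in both
    arms and therefore contribute a difference of zero.
    """
    if not 0 <= non_cap_fraction <= 1:
        raise ValueError("non_cap_fraction must lie between 0 and 1")
    return (1 - non_cap_fraction) * true_difference


# Hypothetical example: a true difference of 12 percentage points in cure
# rate, with 25% of included patients not having CAP.
diluted = observed_difference(0.12, 0.25)  # approximately 0.09
```

In this hypothetical example, a true difference of 12 points would exceed a 10-point non-inferiority margin, whereas the diluted difference of about 9 points would not, which is how false positive inclusions could favour a wrong conclusion of non-inferiority.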
This heterogeneity in CAP definitions also calls into question the external validity (applicability of the results) of these trials and hampers comparison of their results: even if the trials share the same objective and the same declared population (CAP patients), they draw conclusions from populations with non-comparable characteristics. This may well also explain why the conclusions of different CAP RCTs differ so much in the literature [13], and it limits the conclusions that can be drawn from meta-analyses of such studies.
Our study has some limitations. First, we did not test the performance of each of the 42 combinations of diagnostic criteria in the database. We grouped them into eight CAP definition patterns for analysis, which might have resulted in an underestimation of the heterogeneity of the diagnostic performance of the diverse CAP definitions. Second, the choice of our reference population may have biased the analysis. However, the choice of any reference population to evaluate the performance of diagnostic criteria inevitably results in a selection bias. We believe that the ESCAPED population is the most acceptable reference population, as the data collected in the database included all the diagnostic criteria of the eight CAP definition patterns, and the adjudication committee based its judgment on all available data, including a thoracic CT-scan and a 28-day follow-up. In this population, 27% of the CAP diagnoses established on the basis of an infiltrate on chest X-ray were finally excluded by thoracic CT-scan evaluation [4]. Third, we considered in our analysis that patients classified as “definite” or “probable” CAP by the adjudication committee had CAP. However, considering only definite CAP patients as having CAP gave similar results (data not shown). Finally, our definition of true CAP relied on CT-scan as the gold standard for the diagnosis of CAP. It remains to date the most useful non-invasive technique to establish and/or exclude a diagnosis of pneumonia [6].
The question arises whether our results allow the identification of the most accurate combination of inclusion criteria for RCTs. CAP definition patterns 1, 6 and 7 provided excellent specificity, but their low sensitivity prevents their common use in clinical research. The three most sensitive CAP definition patterns, which display quite similar performances, are already the most used in RCTs, attesting to the pragmatism of investigators. The issue is to improve their specificity. We suggest that the per-protocol analysis of any CAP trial should be performed on the sub-population of patients whose CAP diagnosis is established a posteriori by an adjudication committee, taking into account all available data, including microbiological results and follow-up, and if possible a thoracic CT-scan.
This “two-step” strategy would have the merit of being appropriate in every situation: when high sensitivity of the diagnosis is the priority in the context of urgent CAP care, as delay in antibiotic treatment is associated with poorer prognosis, but also in therapeutic trials, where more stringent inclusion criteria are needed. This strategy would increase the validity of the results of both epidemiological studies and randomized clinical trials, and would make comparison of their results possible.
Acknowledgements
ESCAPED study group:
Scientific committee: Steering committee— Y.E. Claessens, (MD PhD, principal investigator), X. Duval (MD PhD, co-principal investigator), E. Bouvard (MD); M.F. Carette (MD PhD); M.P. Debray (MD PhD); C. Mayaud (MD PhD); C. Leport (MD PhD); N. Houhou (MD PhD); S. Tubiana (PharmD).
Adjudication committee: M. Benjoar (MD), F.X. Blanc (MD PhD), A.L. Brun (MD), L. Epelboin (MD), C. Ficko (MD), A. Khalil (MD PhD), H. Lefloch (MD), J.M. Naccache (MD PhD), B. Rammaert (MD PhD).
Clinical investigators: A. Abry (MD), J.C. Allo (MD), S. Andre (MD), C. Andreotti (MD), N. Baarir (MD), M. Bendahou (MD), L. Benlafia (MD), J. Bernard (MD), A. Berthoumieu (MD), M.E. Billemont (MD), J. Bokobza (MD), A.L. Brun (MD), E. Burggraff (MD), P. Canavaggio (MD), M.F. Carette (MD PhD), E. Casalino (MD PhD), S. Castro (MD), C. Choquet (MD), H. Clément (MD), L. Colosi (MD), A. Dabreteau (MD), S. Damelincourt (MD), S. Dautheville (MD), M.P. Debray (MD), M. Delay (MD), S. Delerme (MD), L. Depierre (MD), F. Djamouri (MD), F. Dumas (MD), M.R.S. Fadel (MD), A. Feydey (MD), Y. Freund (MD), L. Garcia (MD), H. Goulet (MD), P. Hausfater (MD PhD), E. Ilic-Habensus (MD), M.O. Josse (MD), J. Kansao (MD), Y. Kieffer (MD), F. Lecomte (MD), K. Lemkarane (MD), P. Madonna (MD), O. Meyniard (MD), L. Mzabi (MD), D. Pariente (MD), J. Pernet (MD), F. Perruche (MD), J.M. Piquet (MD), R. Ranerison (MD), P. Ray (MD PhD), F. Renai (MD), E. Rouff (MD), D. Saget (MD), K. Saïdi (MD), G. Sauvin (MD), E. Trabattoni (MD), N. Trimech (MD).
Monitoring, data management and statistical analysis: C. Auger (RN), B. Pasquet (MD), S. Tamazirt (RN), J.M. Treluyer (MD), F. Tubach (MD), J. Wang (RN).
Sponsor of the ESCAPED study: Assistance Publique-Hôpitaux de Paris, Délégation Interrégionale à la Recherche Clinique d’Ile De France, O. Chassany (MD), C. Misse (MD).
Revision of the article: Pr Martin O. Savage