Background
Respiratory tract infections (RTIs) cover a broad range of symptoms and cause millions of deaths worldwide [
1]. Although lists of common pathogens (such as
Streptococcus pneumoniae,
Staphylococcus aureus,
Klebsiella pneumoniae,
Haemophilus influenzae, and anaerobes) have been reported as causing typical pneumonia, in practice a broader spectrum of microorganisms can infect the human respiratory system and cause unexpected RTIs, especially in immunocompromised patients [
2].
Recently, metagenomic next-generation sequencing (mNGS) has been developed and has shown its superiority in unbiased microbial detection for RTIs [
3,
4]. Clinical practice can benefit from respiratory mNGS testing in the following main ways: (1) detection of unexpected pathogens, such as rare fungi in chronic pneumonia [
5]; (2) rapid identification of fastidious pathogens, such as
Chlamydia psittaci, in acute and severe pneumonia, which supports stopping unnecessary broad-spectrum antibiotics [
6]; (3) rapid identification of slow-growing pathogens such as mycobacteria, which improves the effectiveness of clinical precautions against tuberculosis transmission; (4) identification of clinically non-cultivable viruses, which supports antimicrobial stewardship programs; (5) comprehensive detection of multiple pathogens in pneumonia in immunocompromised patients [
7]; and (6) screening for opportunistic pathogens before non-antimicrobial treatment (
e.g., glucocorticoid inhalation), and ruling out infection in inflammatory airway diseases [
8]. Our previous study, which focused mainly on lung infections, demonstrated that in cases where microbial identification by conventional methods was inconclusive, mNGS led to diagnosis modifications in 61% of cases and treatment adjustments in 58% of cases [
9]. In addition, compared with conventional culture, the sensitivity of mNGS is less affected by antibiotic exposure [
10]. All of these advantages are clinically important for the diagnosis of complicated respiratory diseases.
However, mNGS output is like a Pandora's box, containing a complex mixture of microorganisms. The etiologic agent is often mixed with contaminants and clinically insignificant colonizers, which complicates the interpretation of such catch-all data. Moreover, the respiratory tract, one of the most complex sites in the human body, is not a sterile compartment and harbors a variety of site-specific microbes in both health and disease [
11]. Thus, the respiratory tract microbiome contains both commensals and pathogens, which makes differential diagnosis particularly difficult. As such, distinguishing legitimate pathogens from the normal microbiome is the central challenge of mNGS-based diagnosis of RTIs. From another perspective, studies that integrate pathogen detection and microbiome characterization by mNGS should be carried out to improve the understanding of respiratory diseases [
2‐
4]. Only a few studies report mNGS-based microbiome characterizations [
12,
13]. The spectra of the bacteriome, virome and mycobiome detected in different airway samples in respiratory diseases remain poorly understood [
14]. Moreover, the respiratory microbiome of patients with different immune status has not been fully characterized, although transplant patients are known to have higher virome diversity, with non-pathogenic and pathogenic viruses co-existing to a high degree [
15]. The microbiome is presumed to affect populations of different immune status disproportionately.
On the other hand, multiple respiratory specimen types [nasopharyngeal aspirate, oropharyngeal swab, sputum, bronchoalveolar lavage fluid (BALF), pleural effusion, biopsy lung tissue, etc.] represent different airway conditions, which demand different standards of mNGS data interpretation [
16]. Our previous study revealed that an appropriate choice of respiratory specimens, together with interpretation by pathogen type (common non-mycobacterial bacteria, mycobacteria and fungi), can strengthen mNGS data interpretation [
9]. In addition, bioinformatics-associated thresholds should be carefully implemented for different specimen types to classify the identified organisms as etiologic agents, potential pathogens, contaminants and/or commensals [
17]. Overall, the choice of suitable specimen types and the establishment of mNGS data interpretation standards for RTI diagnosis deserve careful consideration [
18].
To address these research gaps, this study compared the diagnostic value of mNGS across four respiratory specimen types and characterized the respiratory microbiome composition based on the most suitable specimen type. Additionally, specimen-specific and pathogen-type-specific standards for mNGS data interpretation were implemented, and the feasibility of the threshold-based data interpretation pipeline was evaluated.
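As an illustration of what specimen-specific and pathogen-type-specific thresholds could look like in practice, the following minimal Python sketch assigns each detected organism to one of the categories named above. All cut-off values, the `THRESHOLDS` table, the `Detection` record and the `classify` function are hypothetical and chosen only for demonstration; they are not the thresholds or the pipeline used in this study.

```python
from dataclasses import dataclass

# Hypothetical cut-offs for illustration only (NOT the study's values).
# Key: (specimen type, pathogen type) -> minimum read count and
# minimum relative abundance (in percent).
THRESHOLDS = {
    ("BALF", "bacteria"): {"min_reads": 50, "min_rel_abundance": 30.0},
    ("BALF", "fungi"): {"min_reads": 10, "min_rel_abundance": 15.0},
    ("BALF", "mycobacteria"): {"min_reads": 1, "min_rel_abundance": 0.0},
    ("sputum", "bacteria"): {"min_reads": 100, "min_rel_abundance": 50.0},
}


@dataclass
class Detection:
    organism: str
    pathogen_type: str    # "bacteria", "mycobacteria" or "fungi"
    reads: int            # reads mapped to the organism
    rel_abundance: float  # relative abundance in percent


def classify(det, specimen, background=frozenset()):
    """Assign a detection to an interpretation category.

    `background` is an assumed, user-supplied set of organisms regarded as
    contaminants or commensals for the given specimen type.
    """
    if det.organism in background:
        return "contaminant/commensal"
    rule = THRESHOLDS.get((specimen, det.pathogen_type))
    if rule is None:
        return "unclassified"  # no rule defined for this combination
    passes = (det.reads >= rule["min_reads"]
              and det.rel_abundance >= rule["min_rel_abundance"])
    return "etiologic agent" if passes else "potential pathogen"
```

In such a design the rule table is kept separate from the classification logic, so that cut-offs for a new specimen type or pathogen type can be added without changing the code.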
Discussion
The inherent complexity of respiratory specimens presents particular challenges to mNGS data interpretation, as colonizers, contaminants and clinically insignificant organisms may confound the identification of true pathogens. Based on our clinical experience, the key to optimizing mNGS diagnosis of RTIs is to find the most suitable specimen type. In this study, we therefore compared sputum, BALF, lung tissue and pleural fluid specimens simultaneously in terms of pathogen identification. Moreover, subgroups of infection types and patient cohorts were taken into consideration for microbiome characterization and for the standardization of mNGS data interpretation.
In general, BALF was superior for pathogen identification, with high positive predictive values (PPVs) [
11]. One possible explanation, as illustrated by our representative cases in Fig.
2e, g, is that BALF is less affected by non-pathogenic microbes from the upper airways such as
Candida and
Veillonella in sputum, and contains higher pathogen loads as shown by Fig.
2d, f [
7]. This is also the first study to reveal that the microbial composition of BALF covers almost the full spectrum of microbes detected in the other specimens (Fig.
3c). Differences in the background microbial community between BALF and the other specimens were identified, and the microbial compositions of different specimens are not interchangeable. The background microbiome in BALF possibly originates from oral commensals (sputum-like), local microbiota (lung tissue and pleural fluid), and bronchoscopy contaminants (Fig.
2e, g). Overall, this study demonstrates the good performance of BALF in mNGS testing in two respects. First, the pathogen abundance in BALF is high and less affected by common flora; second, the microbial spectrum detected in BALF is the widest among the respiratory specimen types. Hence, although tracheoscopy is challenging and may be refused by patients, we recommend that patients, especially those with suspected NTM or
Aspergillus infections, have their BALF sampled to avoid ambiguous mNGS reports. Rigorous adherence to disinfection and sterilization standards during bronchoscopy is also strongly recommended to minimize the effects of background microbes.
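For readers less familiar with the metric, the PPV discussed above is the fraction of positive mNGS calls that are confirmed as true pathogens, i.e. TP / (TP + FP). The short Python sketch below shows the calculation; the function name is ours and any counts passed to it are the reader's own, not data from this study.

```python
def positive_predictive_value(true_positives, false_positives):
    """Return PPV = TP / (TP + FP) for a set of positive test calls."""
    total_positive_calls = true_positives + false_positives
    if total_positive_calls == 0:
        # PPV is undefined when the test made no positive calls.
        raise ValueError("PPV is undefined without any positive calls")
    return true_positives / total_positive_calls
```

For instance, with hypothetical counts of 45 confirmed and 5 unconfirmed positive calls, the function returns 0.9.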
Although mNGS using BALF showed higher sensitivity in detecting NTM, its sensitivity for MTB detection was poorer than that of sputum, lung tissue, and even pleural fluid [
21]. This is in line with the previously observed trend that sputum is more sensitive for TB diagnosis [
9]. The exact reason is not clear, but it might lie in differences in pathogenicity and biology between the two categories of mycobacteria. The main route of MTB transmission is inhalation of aerosols from patients, indicating that MTB might commonly colonize the upper airways [
22]. In contrast, NTM species are environmental and opportunistic pathogens, which cannot be transmitted between individuals and rarely cause human disease except in immunocompromised hosts, suggesting that the NTM load could be higher in the lower airways.
Unexpectedly, the mNGS detection rates for NTM were lower than those of conventional methods (Fig.
2c). This is mainly because the latest diagnostic guidelines for NTM lung disease recommend that patients who are highly suspected of having NTM infection be diagnosed [
23]. The aim is to make the globally increasing burden of hard-to-detect NTM infections more visible [
24]. In China, additional PCR assays are increasingly used in healthcare settings as complementary tests to capture opportunistic NTM pathogens missed by mNGS [
25]. However, the exact reason why NTM is difficult to detect is currently unknown; it is possibly related to microbiological differences between NTM and MTB and to differences in the host immune response to them. We also noticed a relatively poor performance of mNGS in the identification of
Cryptococcus, as the serum cryptococcal capsular polysaccharide antigen (CrAg) test and the computed tomography (CT) features of pulmonary cryptococcosis have higher detection sensitivities [
26]. Thus, most of the
Cryptococcus cases in this study were successfully diagnosed using conventional methods rather than mNGS.
Although normally sterile, pleural fluid gave poorer diagnostic performance in bacterial identification. One of the main reasons is the low microbial load in this sterile but neutrophil-rich body fluid [
27]. Pleural effusions result mainly from host inflammatory reactions. Another reason is that the incidence of pleural infection is low (approximately 8 cases per 100,000 people), and pulmonary infections with common Gram-positive and Gram-negative bacteria occasionally induce peripheral pulmonary lesions [
28]. The PPV of pleural fluid in mycobacteria detection is higher because of the high incidence of tuberculous pleurisy in our hospital.
The human respiratory microbiome composition is highly associated with specimen types, host health status, and infection etiologies [
8,
15]. In this study, in addition to pathogen identification, we therefore mined the mNGS data further and characterized the microbiome features of different specimens and populations to facilitate the differential diagnosis of complicated infections using mNGS. Our results show that the microbial composition in immunocompetent patients is more divergent (Figs.
4b,
5b). As for mycobacteria in Fig.
4c, more relevant microbes were found in MTB cases than in NTM cases, which may be due to the greater bacterial burden and virulence in MTB cases compared with NTM cases [
29]. Regarding the tumor bacteriome,
Veillonella,
Streptococcus,
Prevotella and
Haemophilus, which are common in patients with idiopathic pulmonary fibrosis and bronchiectasis, were identified; this differs from the species composition in cystic fibrosis patients, who carry
Pseudomonas aeruginosa,
Staphylococcus aureus, and
Burkholderia [
11].
Another highlight of the microbiome analysis is the virome. Human herpesviruses (HHVs) were commonly identified in this study, especially in immunosuppressed patients [
15]. Indeed, critically ill patients, such as COVID-19 patients with poor immune status, may have multiple episodes of viral infection [
12]. Similarly, immunocompromised patients are more likely to be colonized by viruses [
30]. A higher proportion of viruses and a relatively high proportion of torque teno viruses (TTVs) were observed in the transplant patients, supporting the trend of viral co-existence in transplant patients and the suggestion of using TTV as an indicator of host immune status [
31]. More importantly, two virus species [
i.e., HHV-1 (HSV-1) and EBV] associated with tumor patients were pinpointed by logistic regression analysis, suggesting varied effects of antineoplastic treatment on hosts [
30].
The application of clinical mNGS has led us into the era of precise and individualized medicine; however, the technique simultaneously detects both true pathogens and clinically insignificant microbes [
16]. A comprehensive view of potential false-positive (FP) mNGS pathogen results has been shown for each specimen type, ranging from normal oral flora in sputum to environmental contaminants and skin commensals in lung tissue and pleural fluid [
32]. The airway microbiota in BALF cover almost all microorganisms present in the other specimen types with relatively low RARs, suggesting that the FPs could be filtered out by applying bioinformatic thresholds for etiological diagnosis [
16]. The remaining challenge in optimizing mNGS for clinical diagnosis is to determine the etiological pathogen accurately and automatically. We therefore tested several parameter combinations and achieved results comparable to those given by experienced clinicians, although building a fully automatic analysis pipeline remains challenging.
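Testing parameter combinations against clinician judgement can be pictured as a small grid search. The Python sketch below scores each combination of two assumed parameters (a minimum read count and a minimum relative abundance) by its agreement with clinician-assigned labels. The parameter names, the agreement metric and the function names are illustrative assumptions and do not describe the exact procedure used in this study.

```python
from itertools import product


def call_pathogen(reads, rel_abundance, min_reads, min_rel_abundance):
    """Threshold rule: report a pathogen only if both cut-offs are met."""
    return reads >= min_reads and rel_abundance >= min_rel_abundance


def best_parameters(detections, read_grid, abundance_grid):
    """Find the threshold pair that agrees best with clinician labels.

    detections: list of (reads, rel_abundance, clinician_says_pathogen).
    Returns (agreement, min_reads, min_rel_abundance); on ties, the first
    combination encountered in grid order is kept.
    """
    if not detections:
        raise ValueError("at least one labelled detection is required")
    best = None
    for min_reads, min_abund in product(read_grid, abundance_grid):
        agreements = sum(
            call_pathogen(reads, abund, min_reads, min_abund) == label
            for reads, abund, label in detections
        )
        agreement = agreements / len(detections)
        if best is None or agreement > best[0]:
            best = (agreement, min_reads, min_abund)
    return best
```

Simple agreement is used here only for brevity; sensitivity, specificity or PPV against the clinical diagnosis could be substituted as the score without changing the search loop.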
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.