Introduction
Thyroid nodules (TNs) are a widespread and often incidentally discovered pathological entity. Since the vast majority of TNs are benign, the first aim in clinical practice is to exclude cancer, and ultrasound (US)-guided fine-needle aspiration cytology (FNAC) is pivotal in this context [
1,
2]. In fact, with the exception of indeterminate and inconclusive cases, which together account for 20–30% of all FNACs, cytological examination can accurately discriminate samples without cancer features and consistent with benign lesions from specimens consistent with or suspicious for malignancy. Internationally, two major guidelines for reporting and classifying TN FNAC exist: that of the UK Royal College of Pathologists (RCPath) [
3] and the most widely used system, The Bethesda System for Reporting Thyroid Cytopathology (TBSRTC) [
4]. In these guidelines, TN FNAC samples suspicious for malignancy are classified as Thy4 and Bethesda V, and those diagnostic for malignancy as Thy5 and Bethesda VI, respectively. In addition to these widely recognized guidelines [
3,
4], the Italian consensus for the classification and reporting of thyroid cytology (ICCRTC) was initially proposed in 2010 [
5] and then updated in 2014 [
6]. In this Italian proposal, TN FNAC suspicious for malignancy is classified as TIR4 and that consistent with malignancy as TIR5. Because of the high/very high expected risk of malignancy in these two FNAC categories (i.e., 60–80% in TIR4, >95% in TIR5), surgery is always indicated for such patients. Furthermore, the frequency of TIR4 and TIR5 cases among all FNACs is estimated at 5% and 4–8%, respectively. These figures were estimated based on sparse data or findings reported in the other guidelines [
3,
4]. Only one previous meta-analysis exists on this topic [
7]; it included only six studies and pooled a small number of cases, i.e., 589 nodules, of which 203 were TIR4 and 386 TIR5. The pooled cancer rate was 85% and 99% in TIR4 and TIR5, respectively, with no heterogeneity; however, it is worth noting that the rate of individuals undergoing surgery, a factor with the potential to influence the cancer rate, was neither extracted nor analyzed. Since these findings are at variance with those estimated in the ICCRTC guidelines, and their reliability is hampered by the limitations of the available data, a revision of the literature is warranted to confirm or refute these results.
The present systematic review was undertaken to obtain more robust information about FNAC reports of TIR4 and TIR5 according to ICCRTC. In particular, the present study aimed to obtain high-evidence estimates of the risk of malignancy of these categories, also considering the operation rate and other potential influencing factors.
Discussion
FNAC is pivotal in planning the optimal management of TN patients. In fact, we generally evaluate these patients by ultrasound to select those eligible for FNAC and then usually recommend surgical treatment when cancer is suspected on cytological preparations. When differentiated thyroid carcinomas are classified as intermediate-to-high risk, international guidelines agree in considering total thyroidectomy with postoperative radioiodine therapy to achieve complete remission. For lower-risk patients, experts agree that more conservative approaches can be safely adopted, ranging from total thyroidectomy without radioiodine administration or lobectomy to active surveillance [
2,
27]. However, several factors could influence the optimal approach to individual patients, and a proper pre-surgical risk stratification still remains a challenge. Indeed, biopsy cannot assess the histological features consistent with aggressive subtypes of cancer. In addition, while molecular testing has been suggested by some as a possible fix for this issue [
2], its efficacy can still be disappointing [
28]. To make matters worse, predicting the FNAC-based risk of malignancy can itself be a challenge. The three most widely used cytological systems each include two categories, suspicious for and diagnostic of malignancy: V and VI in TBSRTC [
4], Thy4 and Thy5 in RCPath [
3], and TIR4 and TIR5 of ICCRTC [
6]. While categories VI, Thy5, and TIR5 are expected to be associated with a near-to-100% cancer prevalence at histology, the estimated risk of cancer of classes V, Thy4, and TIR4 ranges from 50 to 75% [
4], 68 to 70% [
3], and 60 to 80% [
6], respectively. Remarkably, these figures were not initially based on specific studies assessing the actual risk of malignancy, but were estimated by the expert boards when preparing the guidelines. The performance of TBSRTC was later evaluated in a systematic review recording a cancer rate of 79.6% in V and 99.1% in VI [
29], and that of RCPath in another systematic review which found 79% in Thy4 and 98% in Thy5 [
30]. Importantly, the Vuong meta-analysis [
29] reported the pooled operation rate, a factor with the potential to influence the cancer rate. As for the Italian system, a preliminary meta-analysis on initial data was published [
7]. There, the risk of cancer of TIR4 and TIR5 was 85 and 99%, respectively. However, as mentioned above, the number of studies and their sample size were limited, and the operation rate or other influencing factors for both TIR4 and TIR5 cases were not analyzed, being unavailable in the literature. The present systematic review was therefore conceived to achieve higher-level evidence about the risk of cancer associated with TIR4 and TIR5. The inclusion criteria adopted here were highly selective. In addition, the resection rate data were extracted from each study and other potential influencing factors were considered. On these premises, the figures obtained in the present meta-analysis have to be regarded as highly reliable, and they can form a solid basis upon which Italian guidelines [
6] can estimate the risk of cancer associated with TIR4 and TIR5 in an updated version.
First, the cancer rate found here in TNs classified as TIR4 was 92.5%, with a fairly narrow 95% CI and moderate inconsistency. This figure corroborates the preliminary data [
7] and questions the estimates reported in ICCRTC. Second, the pooled cancer rate among TIR5 was 99.7%, without heterogeneity. This finding confirms the preliminary one [
7] and supports the reliability of the original estimates of malignancy. Third, the 95% CIs of the cancer rate in TIR4 and TIR5 did not overlap, which means that the risk differs significantly between them. In addition, the cancer risk associated with TIR5 was significantly higher than that of TIR4, with an OR of 11. These features make TIR4 and TIR5 two distinct categories. Fourth, regarding cancer rate findings, heterogeneity was found only in TIR4 and remained not fully explained after several sub-analyses. However, since heterogeneity was found in the resection rate of both TIR4 and TIR5, the sub-analyses performed may partially explain the inconsistency of the cancer rate among TIR4. In fact, the mean frequency of TIR4 and TIR5 among all FNACs included in the 16 studies varied significantly (i.e., from 1.1 to 29.6% and from 1.9 to 70.4%, respectively). When we analyzed the impact of these frequencies on the resection rate, we found that the latter was significantly influenced by the frequency of cases in both TIR4 and TIR5 (i.e., the higher the prevalence of TIR4/TIR5 among FNACs, the higher their operation rate). These data might suffer from publication bias. For example, two large series included in our study [
13,
14] derive from metropolitan institutions that are referral centers for thyroid FNAC. In these two studies there was a low operation rate of both TIR4 and TIR5, which may be because TN patients were managed elsewhere after FNAC and the authors had no follow-up data. These findings mean that several factors could influence the results reported in these papers. In fact, we cannot fully know how each institution manages TN patients, how it selects them for FNAC, and how and when it recommends surgery. In addition, the expertise of the local cytopathologist remains unexplored, but its influence cannot be excluded [
31]. Furthermore, molecular tests may play a role in this context. Some of the included papers [
12,
16‐
19,
21‐
23,
26] used molecular tests in different combinations (i.e., BRAF as a single test, BRAF combined with TERT, or different extended molecular panels). Because of these different approaches, pooled findings could not be calculated. In any case, as suggested by these studies [
16,
17], molecular tests did not increase the diagnostic accuracy of the TIR4 and TIR5 categories. Finally, the compliance of each patient and the availability of surgical facilities during the pandemic could have had an impact on the operation rate [
32]. Lastly, the cancer rate found here in TIR4 (92.5%) seems to be higher than that reported in other meta-analyses for category V of TBSRTC (79.6%) [
29], and in Thy4 of RCPath (79%) [
30]. This finding merits a careful evaluation by cytopathologists to understand whether it depends on the definition of the classes of suspicious for malignancy or on other factors. One possible explanation of the cancer rate found in TIR4, also higher than that estimated in ICCRTC guidelines [
6], might be the introduction in 2014 of two subcategories of TIR3 (i.e., low-risk TIR3A and high-risk TIR3B). In fact, TIR3B “also includes samples characterized by nuclear alterations suggestive of papillary carcinoma, which do not permit to reliably exclude malignancy, but are too mild or focal to be included in the TIR4 category” [
6]. Consequently, pathologists may have been prompted to downgrade to TIR3B some cases that would previously have been classified as TIR4 [
5]. This hypothesis could be investigated in future studies.
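As an illustration of how a pooled cancer rate and its inconsistency (I²) can be obtained from study-level counts, the following is a minimal sketch. It assumes a DerSimonian-Laird random-effects model applied to logit-transformed proportions with a 0.5 continuity correction in every study; this is a common choice, but it is an assumption and not necessarily the model used in the present review. The function name `pool_proportions` and all counts are hypothetical and do not reproduce any included study.

```python
import math


def pool_proportions(events, totals):
    """Pool proportions with a DerSimonian-Laird random-effects model.

    events[i] = cancers at histology, totals[i] = operated nodules in study i.
    Assumption: logit scale, 0.5 added to both cells of every study.
    """
    y, v = [], []
    for e, n in zip(events, totals):
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))      # logit of the study proportion
        v.append(1.0 / a + 1.0 / b)    # approximate variance of the logit

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1

    # Between-study variance (tau^2) and inconsistency (I^2, in percent)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects estimate and 95% CI, back-transformed to a proportion
    w_re = [1.0 / (vi + tau2) for vi in v]
    y_re = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def expit(x):
        return 1.0 / (1.0 + math.exp(-x))

    return {
        "pooled": expit(y_re),
        "ci": (expit(y_re - 1.96 * se), expit(y_re + 1.96 * se)),
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts, for illustration only
result = pool_proportions(events=[90, 40, 70], totals=[100, 80, 72])
print(result)
```

With such a function, the sub-analyses discussed above would amount to repeating the pooling within subgroups of studies (for example, by resection rate) and comparing the resulting I² values.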
Both limitations and strengths of the present systematic review have to be discussed. First, basically all papers included retrospective series of TN patients managed in several institutions according to local management rules. Some concerns about selection bias may therefore be present. Second, all studies retrieved by the present systematic review were from Italy, as largely expected. These data are therefore reliable, as they derive from institutes that use ICCRTC in their clinical routine. Third, the resection rate of both TIR4 and TIR5 was very high. Since this is in line with the indication contained in the ICCRTC guidelines, it may be taken as evidence of good practice in the institutions involved in the 16 studies.
In conclusion, the actual risk of malignancy of TIR4 and TIR5 of ICCRTC is 92.5 and 99.7%, respectively. These figures can form the basis for the next updated version of ICCRTC. Any institution using ICCRTC is asked to revise its series of TIR4/TIR5, calculate their cancer rate among operated cases, and, importantly, consider all the modifiers of the risk of malignancy, e.g., clinical management, percentage of TIR4/TIR5 among the overall series of FNACs, and resection rate. Ideally, a cross-check among institutions should be considered.
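The local audit suggested above can be sketched as follows. This is a minimal example, assuming that an institution has four counts available for each category (all FNACs, FNACs in the category, operated cases, and cancers at histology) and that a Wilson score interval is an acceptable 95% CI for the cancer rate. The function names `wilson_ci` and `audit_category` and all counts are hypothetical.

```python
import math


def wilson_ci(events, total, z=1.96):
    """Wilson score interval (95% by default) for the proportion events/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1.0 + z ** 2 / total
    centre = (p + z ** 2 / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z ** 2 / (4.0 * total ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def audit_category(n_fnac_total, n_category, n_operated, n_cancer):
    """Summarise one cytological category (e.g., TIR4) in a local series."""
    return {
        # share of the category among all FNACs
        "frequency": n_category / n_fnac_total,
        # share of cases in the category that underwent surgery
        "resection_rate": n_operated / n_category,
        # cancer rate among operated cases only
        "cancer_rate": n_cancer / n_operated,
        "cancer_rate_ci": wilson_ci(n_cancer, n_operated),
    }


# Hypothetical counts, for illustration only
tir4 = audit_category(n_fnac_total=2000, n_category=100, n_operated=80, n_cancer=74)
print(tir4)
```

Reporting the frequency and the resection rate next to the cancer rate keeps the modifiers of the risk of malignancy visible when series from different institutions are cross-checked.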