Methods
Patients and samples
Patients with thyroid nodules who had both CNB and matched resected specimens and were treated at Peking University First Hospital between January 2015 and December 2020 were reviewed. CNB was used as the first-line preoperative diagnostic procedure in all patients, without prior FNB, according to a published protocol [7]. The Peking University First Hospital Ethics Committee approved the use of all patient samples and clinical data and granted an exemption from informed consent (ethical approval no.: (2018) Research No. 147).
Pathological review
All hematoxylin and eosin (HE)-stained slides were independently reviewed by two pathologists blinded to the original diagnoses. The CNB samples were diagnosed according to the Korean proposal: (I) nondiagnostic or unsatisfactory; (II) benign lesion; (III) indeterminate lesion; (IV) follicular neoplasm; (V) suspicious for malignancy; and (VI) malignant (Table 1) [8]. Cases classified as III–V were retrieved as “uncertain.” The resected samples were diagnosed according to the 2017 WHO classification of tumors of endocrine organs (4th edition): conventional papillary thyroid carcinoma (CPTC), follicular variant papillary thyroid carcinoma (FVPTC), follicular thyroid carcinoma (FTC), follicular adenoma (FA), nodular hyperplasia (NH), and thyroiditis [9]. Cases with discordant diagnoses were re-reviewed, and consensus was reached by discussion. Furthermore, we divided the cohort into two groups, the follicular neoplasm (FN) group and the non-follicular-neoplasm-lesion (non-FN-lesion) group, to determine whether the biomarkers’ efficiency differed between them. The FN group included FTC and FA. The non-FN-lesion group included CPTC, FVPTC, NH, and thyroiditis.
Table 1
Diagnostic categories of thyroid core needle biopsy proposed by the Korean Thyroid Association [5]
I. Nondiagnostic or unsatisfactory
• Non-tumor adjacent thyroid tissue only
• Extrathyroid tissue only (e.g., skeletal muscle, mature adipose tissue)
• Acellular specimen (e.g., acellular fibrotic tissue, acellular hyalinized tissue, cystic fluid only)
• Blood clot only
• Other
II. Benign lesion
• Benign follicular nodule
• Hashimoto’s thyroiditis
• Subacute granulomatous thyroiditis
• Nonthyroidal lesion (e.g., parathyroid lesions, benign neurogenic tumors, benign lymph node)
• Other
III. Indeterminate lesion
IIIa. Indeterminate follicular lesion with nuclear atypia
IIIb. Indeterminate follicular lesion with architectural atypia
IIIc. Indeterminate follicular lesion with nuclear and architectural atypia
IIId. Indeterminate follicular lesion with Hürthle cell changes
IIIe. Indeterminate lesion, not otherwise specified
IV. Follicular neoplasm
IVa. Follicular neoplasm, conventional type
IVb. Follicular neoplasm with nuclear atypia
IVc. Hürthle cell neoplasm
IVd. Follicular neoplasm, not otherwise specified
V. Suspicious for malignancy
• Suspicious for papillary carcinoma, medullary carcinoma, poorly differentiated carcinoma, metastatic carcinoma, lymphoma, etc.
VI. Malignant
• Papillary thyroid carcinoma, poorly differentiated carcinoma, anaplastic thyroid carcinoma, medullary thyroid carcinoma, lymphoma, metastatic carcinoma, etc.
Immunohistochemical staining
The primary antibodies included antibodies against CK19 (Dako, Clone RCK108), galectin-3 (Invitrogen, A3A12), HBME-1 (Dako, Clone HBME-1), and CD56 (Dako, Clone 123C3). Antigen retrieval was performed in EDTA buffer (pH 9.0) at 98 °C for 20 min. We used EnVision FLEX + Mouse LINKER to amplify the signal, the EnVision FLEX Mini Kit to visualize the immunohistochemistry (IHC) reaction, and the Autostainer Link 48 (Agilent Technologies, Santa Clara, CA, USA) to complete the procedure. The normal thyroid follicles around the nodules served as internal controls for CD56 staining and evaluation. For CK19, galectin-3, and HBME-1, known positive samples were placed alongside the target samples on each slide as controls.
Scoring the results of a single IHC biomarker
Tumors with membranous ± cytoplasmic reactivity for CK19 in more than 10% of cells with strong intensity were considered positive. Tumors with cytoplasmic + nuclear reactivity for galectin-3, or membranous reactivity for HBME-1 or CD56, in more than 10% of cells were deemed positive regardless of intensity [10].
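The scoring rule for a single marker can be sketched in a few lines of Python. This is only an illustration of the thresholds stated above, not software used in the study; the names `StainResult` and `is_marker_positive` are hypothetical, and the sketch assumes that the percentage refers to cells showing the expected staining pattern for that marker.

```python
from dataclasses import dataclass


@dataclass
class StainResult:
    """Hypothetical record of one IHC readout (not from the original study)."""
    percent_positive: float   # % of tumor cells with the expected staining pattern
    strong_intensity: bool    # True if the staining intensity is strong


def is_marker_positive(marker: str, stain: StainResult) -> bool:
    """Apply the >10% cutoff; CK19 additionally requires strong intensity."""
    over_cutoff = stain.percent_positive > 10
    if marker == "CK19":
        return over_cutoff and stain.strong_intensity
    if marker in {"galectin-3", "HBME-1", "CD56"}:
        # Intensity is ignored for these three markers.
        return over_cutoff
    raise ValueError(f"unknown marker: {marker}")
```

For example, a CK19 stain covering 30% of cells with weak intensity would be scored negative, whereas the same readout for HBME-1 would be scored positive.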
Integrating IHC markers
A case was considered positive by integrated IHC in either of two situations: (1) CD56 was negative, regardless of the staining results for CK19, galectin-3, and HBME-1; or (2) CD56 was positive and a required number of the other three markers were simultaneously positive. The required number differed among panels: the first panel, named IHC-COMB1, required all three markers to be positive; the second panel, named IHC-COMB2, required at least two; and the third panel, named IHC-COMB3, required at least one.
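The panel logic can be written as a small decision function. The following Python sketch is illustrative only; the function name `integrated_ihc_positive` and the `PANEL_CUTOFFS` table are hypothetical, and each argument is assumed to be the already-scored result of a single marker.

```python
# Minimum number of positive markers among CK19, galectin-3, and HBME-1
# required when CD56 is positive (cutoffs as described in the text).
PANEL_CUTOFFS = {"IHC-COMB1": 3, "IHC-COMB2": 2, "IHC-COMB3": 1}


def integrated_ihc_positive(cd56_positive: bool,
                            ck19_positive: bool,
                            galectin3_positive: bool,
                            hbme1_positive: bool,
                            panel: str) -> bool:
    """Return True if the case is positive under the named integrated panel."""
    if not cd56_positive:
        # CD56-negative cases are positive regardless of the other markers.
        return True
    n_other_positive = sum([ck19_positive, galectin3_positive, hbme1_positive])
    return n_other_positive >= PANEL_CUTOFFS[panel]
```

Under this sketch, a CD56-positive case with CK19 and galectin-3 positive but HBME-1 negative is negative for IHC-COMB1 and positive for IHC-COMB2 and IHC-COMB3.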
Next-generation sequencing
The percentage of tumor components in the CNB samples was recorded. Genomic DNA was extracted from unstained 5-µm-thick paraffin-embedded sections using the QIAamp DNA FFPE Tissue Kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. After extraction, DNA quality was evaluated by 1% agarose gel electrophoresis. The concentration of all samples was quantitated by a NanoDrop system (Invitrogen Life Technologies, Carlsbad, CA, USA) and Qubit Fluorometer (Invitrogen Life Technologies).
Targeted next-generation sequencing (NGS) was conducted using an OncoAim® thyroid cancer multigene assay kit (Singlera Genomics, Inc., Shanghai, China) that detected 26 genes (Table 2). According to the kit protocol, 50 ng of DNA for each sample was used to generate sequencing libraries. DNA was fragmented by 5 × WGS Fragmentation Mix (Qiagen, Beverly, MA, USA). After quality control and quantification, the library product was sequenced using 150 bp paired-end runs on the NextSeq 500 platform (Illumina, Inc., San Diego, CA, USA). Sequencing data were then aligned to the reference human genome (hg19). Read mapping, quality control, variant calling, and genotyping were performed automatically using the tool kit supplied in the OncoAim® Kit (Singlera). The minimum confidence threshold for variant calling was set to 5%. Variant functional annotation was performed with the ENSEMBL Variant Effect Predictor tool.
Table 2
Genes detected by OncoAim® thyroid cancer multigene assay kit
BRAF | NM_004333 | Exon 15 | Introns 7–10 |
RET | NM_020975 | Exons 7–16 | Introns 10–11 |
NRAS | NM_002524 | Exons 2–3 | - |
KRAS | NM_033360 | Exons 2–4 | - |
HRAS | NM_176795 | Exons 2–3 | - |
AKT1 | NM_005163 | Exons 2–7, exons 9–12 | - |
ATM | NM_000051 | All exons | - |
CTNNB1 | NM_001904 | All exons | - |
TSHR | NM_000369 | All exons | - |
APC | NM_000038 | All exons | - |
TTN | NM_001256850 | All exons | - |
TG | NM_003235 | All exons | - |
RB1 | NM_000321 | All exons | - |
MEN1 | NM_000244 | All exons | - |
PDGFRA | NM_006206 | All exons | - |
PIK3CA | NM_006218 | All exons | - |
CDKN2A | NM_000077 | All exons | - |
EIF1AX | NM_001412 | All exons | - |
PTEN | NM_000314 | Exons 5–8 | - |
GNAS | NM_000516 | Exons 8–9 | - |
TP53 | NM_000546 | Exons 5–9 | - |
TERT | NM_198253 | Promoter (chr5:1,295,183–1,295,302) | - |
PPARG | NM_005037 | - | Intron 1 |
NTRK1 | NM_002529 | - | Intron 9, exon 12 |
NTRK3 | NM_002530 | - | Intron 13 |
ALK | NM_004304 | - | Intron 16, intron 19 |
Based on ClinVar (Version 20,280,919), each variant was classified as pathogenic, likely pathogenic, uncertain significance, likely benign, benign, or inconclusive. Samples with a “confirmed pathogenic” or “likely pathogenic” variant were recorded as NGS positive.
Integrating IHC and NGS
A case was considered positive by integrated IHC-NGS in either of two situations: (1) NGS was positive, regardless of the IHC staining results; or (2) NGS was negative and at least one of the four IHC markers was positive.
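The integrated IHC-NGS rule can be sketched as follows. This Python fragment is illustrative only; the names `is_ngs_positive` and `integrated_ihc_ngs_positive` are hypothetical. It assumes that a sample is NGS positive when any of its variants carries a pathogenic or likely pathogenic label, and it takes the per-marker IHC calls as given, because the text does not spell out how CD56 is counted in this panel.

```python
from typing import Iterable, Mapping

# Labels treated as NGS positive (assumed spelling; compared case-insensitively).
PATHOGENIC_LABELS = {"pathogenic", "likely pathogenic"}


def is_ngs_positive(variant_labels: Iterable[str]) -> bool:
    """True if any variant is labelled pathogenic or likely pathogenic."""
    return any(label.strip().lower() in PATHOGENIC_LABELS
               for label in variant_labels)


def integrated_ihc_ngs_positive(ngs_positive: bool,
                                ihc_calls: Mapping[str, bool]) -> bool:
    """NGS positive, or NGS negative with at least one positive IHC marker."""
    if ngs_positive:
        return True
    return any(ihc_calls.values())
```

A sample with only variants of uncertain significance and no positive IHC marker would be negative under this rule; a single positive IHC marker is enough to make it positive.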
Comparison between biomarkers’ results of CNB samples and classification of matched resected specimens
The results of biomarkers detected on CNB samples were compared to the classification of matched resected specimens.
Statistical analysis
For statistical analysis, resected samples classified as CPTC, FVPTC, and FTC were grouped as “malignant,” and those classified as thyroiditis, NH, and FA were grouped as “benign.” Taking the resected specimens’ classification as the gold standard, the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy of each biomarker and of the various integrated panels for discriminating malignancy from benignity were calculated.
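These five measures follow from the standard 2 × 2 table of biomarker result against resected-specimen diagnosis. The Python sketch below uses the usual definitions; the function name `diagnostic_metrics` is hypothetical, and the counts in the usage example are invented for illustration and are not data from this study.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute diagnostic measures from a 2 x 2 table.

    tp: biomarker positive, resected specimen malignant
    fp: biomarker positive, resected specimen benign
    tn: biomarker negative, resected specimen benign
    fn: biomarker negative, resected specimen malignant
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only.
example = diagnostic_metrics(tp=40, fp=5, tn=15, fn=10)
```

With these invented counts, sensitivity is 40/50 and specificity is 15/20; the same function can be applied separately to the FN and non-FN-lesion groups.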
Discussion
Morphological changes, including nuclear score, architecture (papillary or follicular), and growth pattern (infiltrative or encapsulated), are critical for diagnosing thyroid tumors. Based on these criteria, most cases can be diagnosed with confidence. However, some cases are difficult to classify based on histological morphology alone, and biopsies are more challenging to diagnose than resected specimens. The uncertain diagnosis rate is 10–40% for FNB and 5–20% for CNB [5]. Our comparative study between CNB and resected specimens of thyroid nodules showed that 74 of 578 cases could not be ascertained as malignant or benign based on the CNB sample’s morphology alone [6]. The reason is that when a CNB sample shows only follicles with atypical nuclei and no normal tissue as background, FTC, FVPTC, and CPTC with a follicular-predominant growth pattern cannot be differentiated from FA, NH, and thyroiditis. Therefore, it is necessary to study the application of biomarkers in resolving uncertain biopsy samples.
Immunohistochemistry is the most popular ancillary technique used in pathological practice. Studies on resected specimens showed that CK19, galectin-3, HBME-1, and CD56 were very helpful in discriminating malignancy from benignity [10–13]. In Dunderovic et al.’s study, the sensitivity of CK19, galectin-3, HBME-1, and CD56 was 75.41%, 88.52%, 71.31%, and 58.20%, respectively, and the specificity was 70.89%, 64.56%, 84.81%, and 92.41%, respectively [10]. Based on this knowledge, we supposed that IHC might help improve the accuracy of diagnosing uncertain biopsy samples. We searched papers published in English in PubMed and found only one focusing on this topic. In that paper, Song et al. reported that the rate of samples remaining uncertain after IHC was applied was 42.9% for FNB and 11.3% for CNB [14].
Our study showed that, taking the resected specimens’ diagnosis as the gold standard, the biomarkers’ efficiency in classifying uncertain CNB samples as malignant or benign varied. Moreover, even the same marker performed differently in FN and non-FN-lesions. The specificity of CD56 was perfect (100%) for both FN and non-FN-lesions, but its sensitivity was low (66.13% for non-FN-lesions and 26.09% for FN). Therefore, CD56 negativity is particularly useful for “ruling in” malignancy in CNB samples; however, CD56 positivity should not be used as an indicator of benignity. In contrast, galectin-3 showed high sensitivity (95.16%) for non-FN-lesions and moderate sensitivity (73.91%) for FN but low specificity (38.46% for non-FN-lesions and 66.67% for FN). Hence, galectin-3 negativity is highly suggestive of benignity for non-FN-lesions and can be used cautiously to support benignity for FN, whereas galectin-3 positivity should not be used as an indicator of malignancy.
Given the limitations of a single marker, it is judicious to base the diagnosis on integrated results. Because CD56 has perfect specificity but low sensitivity, the combination should precisely recover the malignant cases missed by CD56. Our study showed that keeping the CD56-negative cases in the malignant cohort and adding back the cases that were CD56 positive with all of the other three markers simultaneously positive was a suitable strategy to balance specificity (92.30%) and sensitivity (88.71%) for non-FN-lesions. For FN, however, none of the combined panels had an apparent advantage over a single marker.
In the past 10 years, we have witnessed significant progress in the molecular pathology of thyroid carcinoma. In 2014, The Cancer Genome Atlas (TCGA) reported the comprehensive genomic characteristics of PTC. Ninety-seven percent of PTCs have unique molecular alterations, among which BRAF V600E mutations, RAS mutations, RET fusions, and TERT mutations are frequently detected, whereas EIF1AX mutations, ALK fusions, and NTRK1 or NTRK3 fusions are infrequent [15]. Subsequently, the genotypes of FTC, poorly differentiated thyroid carcinoma (PDTC), and anaplastic thyroid carcinoma (ATC) have also been reported. In FTC, RAS mutations, PPARγ fusions, and TERT mutations are frequently detected, whereas BRAF K601E mutations and EIF1AX mutations are infrequent. In PDTC and ATC, BRAF V600E mutations, RAS mutations, TERT mutations, and TP53 mutations are frequently detected [16–18]. Based on this understanding of the mutational profile of thyroid carcinoma, researchers have used diverse molecular approaches to improve the diagnosis of uncertain biopsy samples and have reported varying results. With FNB, the sensitivity and specificity of gene testing for discriminating malignancy from benignity were 63–94% and 52–99%, respectively [19–21]. Regardless of how sensitive or specific it is, applying gene testing to FNB is inconvenient in clinical practice because specialized sample collection is required at the initial procedure. In addition, the morphology of the FNB material used in the molecular test is unknown. In contrast, CNB samples are routinely stored as paraffin-embedded blocks, from which DNA can readily be extracted and in which morphology can be reviewed at any time. Gene testing is therefore expected to resolve uncertain samples more practically and effectively on CNB than on FNB.
Compared to FNB, the number of publications about CNB is minimal, and only a few single mutations have been reported [22–25]. In our research, uncertain CNB samples were tested by NGS using the commercial panel OncoAim®, which detected 26 genes covering the major molecular alterations of thyroid carcinoma. A sample was recorded as NGS positive when confirmed pathogenic or likely pathogenic mutations were detected. Taking the diagnosis of the resected specimens as the gold standard, NGS was highly specific (92.31%) and sensitive (90.32%) for non-FN-lesions, and highly specific (88.89%) but poorly sensitive (52.17%) for FN. In other words, a positive NGS result strongly suggests malignancy for both non-FN-lesions and FN, whereas a negative result should be used cautiously as an indicator of benignity for non-FN-lesions and should not be used as an indicator of benignity for FN. Taking PPV and NPV together, the efficiency of NGS was high for non-FN-lesions and moderate for FN.
Because NGS is not a standardized technique and different laboratories may use diverse gene panels, platforms, and methods, the performance of NGS depends largely on each laboratory’s technical details. In practice, pathologists should interpret NGS results by integrating reports from the literature with their own laboratory’s data. All of the NGS data and analyses in our research are based on the specific commercial tool OncoAim®.
In our study, 37 cases had negative NGS results on CNB samples. The diagnosis of their matched resected specimens was benign in 20 cases and malignant in 17 cases. The 17 malignant cases included 11 FTC cases, 5 CPTC cases, and 1 FVPTC case. We then performed NGS on the matched resected specimens of these 17 cases and found that the 11 FTC cases were truly mutation negative, whereas the other six were false negatives on CNB, including 5 CPTC cases with BRAF V600E and one FVPTC case with an NRAS mutation. Furthermore, review of the CNB slides of the six false-negative cases showed that they contained very few tumor components (less than 5%). In conclusion, the inherent mutational features of thyroid tumors, especially follicular neoplasms, are considered the main reason for the relatively low efficiency of a negative NGS result as a marker of benignity. False-negative results caused by limited tumor quantity in CNB samples are another factor weakening the power of NGS to identify benign cases, even though this influence is smaller than with FNB.
Fortunately, all six NGS false-negative cases were positive for CK19, galectin-3, and HBME-1 and negative for CD56 on CNB, which gave us the confidence to make a malignant diagnosis. IHC therefore plays an essential role in cases with false-negative NGS results caused by limited tumor quantity in CNB samples.
For non-FN-lesions, either IHC or NGS can work well individually. Therefore, combining them is unnecessary and not cost-effective. In contrast, neither of them is powerful enough for FN when used separately, so an integrated panel that improves the predictive value is a practical need. Considering the treatment of FN recommended by the NCCN guideline [26], patients may benefit more from a safe “rule out” strategy than from a precise “rule in” strategy. Based on this principle, we designed the integrated panel to keep the NGS-positive cases in the malignant cohort and to add back the cases with at least one of the four IHC markers positive. This panel raised sensitivity and NPV to 100% while keeping acceptable specificity (66.67%) and PPV (88.46%), which may be superior to using IHC or NGS separately. FN cases that are negative on this panel are highly likely to be benign, and nodule surveillance may be recommended with only minimal concern.
Finally, although it is acknowledged that presenting the results as a risk of malignancy (ROM) rather than in a binary fashion is more clinically valuable, such ROM estimates are currently unavailable because of the limited number of cases. Hence, further research is required to explore the application of biomarkers in evaluating the ROM of uncertain samples.