The predictive value of sonographic images of follicular lesions – a comparison with nodules unequivocal in FNA – single centre prospective study

The aim of the study was to analyse the efficacy of UMRFs evaluation in patients with FL. Published data on this issue are less concordant than those on the usefulness of UMRFs analysis in thyroid nodules as a whole. However, FL nodules are special: routine cytological examination cannot differentiate between malignant and benign nodules, especially between FTC and adenoma [11, 20]. Such a group of nodules raises expectations of ultrasound imaging. There is a continuous search for features that could aid in making clinical decisions and avoiding unnecessary surgical treatment. This is of particular importance in populations similar to our own, characterized by a low risk of malignancy in FL nodules and a relatively high percentage of FTCs among malignant tumours. In our study, FTCs amounted to 46.2% of malignant neoplasms in the FL group. The percentage of PTCs among malignant nodules in the FL group was nearly 40% lower than in the group with an unequivocal diagnosis of MN. Among histopathologically benign nodules, the incidence of follicular adenomas was 3 times higher in the FL group than in the nodules with an unequivocal cytological diagnosis of BL.

Unfortunately, our data indicate that the evaluation of UMRFs in such FLs is less effective, and that its effectiveness decreases in parallel with the decrease in the percentage of PTCs among malignant neoplasms and with the increase in the percentage of adenomas among benign nodules. In the group of UC nodules, all the examined UMRFs were found more often in cancers than in benign nodules. Similar results were shown in the meta-analysis by Remonti et al. [8]. On the other hand, in the FL group the only features observed more often in cancers than in benign lesions were calcifications of any type and macrocalcifications without accompanying microcalcifications. In the FLUS subgroup, in which PTCs constituted about 70% of all malignant neoplasms, other UMRFs also occurred slightly more often in cancers. In contrast, in the SFN subgroup, where the percentage of PTCs only slightly exceeded 30% and adenomas constituted more than 25% of benign nodules, the relation was reversed: hypoechogenicity, suspected margins, and – to a lesser degree – hypoechogenicity of a solid nodule and pathological vascularisation were observed slightly more often in benign nodules.

Our study suggests that the differences in the effectiveness of UMRFs assessment between the FL and UC groups are the consequence of differences in the ultrasound image of both cancers and benign nodules between these groups. Cancers in the FL group had suspected margins less often than cancers in the UC group. Additionally, cancers in the SFN subgroup were less frequently hypoechoic than cancers in the UC group. On the other hand, benign nodules diagnosed cytologically as FLUS or SFN were more often hypoechoic, solid, and pathologically vascularized, and had suspected margins less often than their counterparts with an unequivocal BL diagnosis. Other authors have also reported differences between the ultrasound images of FTCs and PTCs. Jeh et al. [22] showed that FTCs usually had regular margins, were less frequently hypoechoic and solid than PTCs, less often presented suspicious shape or margins, and were not characterized by microcalcifications. The follicular variant of PTC, which more often corresponded to malignant nodules in the FL group than to those in the MN group, causes diagnostic difficulties, as it shows a higher rate of follicular-like features than the conventional variant [28, 29]. Thus, the obtained results are not surprising if one considers that UMRFs have been established mainly on the basis of the ultrasound image of the most common cancers, PTCs, and are optimized for revealing that type of thyroid cancer [27, 30].

Many reports on the usefulness of UMRFs assessment, both in unselected thyroid nodules and in FL nodules, and particularly in nodules of category III in the Bethesda system, come from countries with a very high iodine supply (e.g. South Korea). In such areas PTCs dominate not only in the group of nodules with an unequivocal cytological diagnosis of MN, but also among FL nodules, and especially in nodules of category III in the Bethesda system [6, 14–16, 19]. The percentage of PTCs in this category is further increased by the frequent classification into it of nodules with a borderline cytological result, when characteristic features of benign and malignant lesions coexist in a smear and the cellularity of the aspirate is scant. Consequently, at some centres the frequency of category III FNA diagnoses reaches up to 20% (instead of the assumed 5–7%). Additionally, the malignancy risk related to this category approaches 50% (instead of the assumed 5–15%) [31–35], and the percentage of PTCs among malignant neoplasms in this category is over 90% [6, 14–16, 19]. In such centres, the effectiveness of UMRFs assessment in category III FNA results is high. In the study by Jeong et al. [14], the diagnostic usefulness of evaluating taller-than-wide shape, ill-defined margins, and microcalcifications or macrocalcifications was shown in Bethesda category III nodules. Yoo et al. [19] showed that malignancy in nodules of that category was associated with taller-than-wide shape, ill-defined margins and marked hypoechogenicity, while Gweon et al. [15] reported that it was related to marked hypoechogenicity, microlobulated or irregular margins, microcalcifications, and taller-than-wide shape. Kim et al. [16] showed that the presence of several (1) UMRFs of the following: marked hypoechogenicity, a spiculated margin, microcalcifications, and a taller-than-wide shape in solid thyroid Bethesda III nodules, is an indication for surgical treatment without a control FNA. In all these studies, the risk of malignancy in nodules with category III in the Bethesda system and a suspicious ultrasound image was significantly higher than that indicated by cytological evaluation alone. However, in none of the above-mentioned studies was the risk of malignancy in Bethesda III nodules assessed with consideration for whether the examined nodules were on the borderline between benign lesions and follicular neoplasms (nodules defined as classic FLUS with atypia of cellular architecture) or on the borderline between benign nodules and cancers (nodules with nuclear atypia – atypia of undetermined significance (AUS)). Many authors use this distinction and indicate that the AUS subgroup is characterized by a higher risk of malignancy than FLUS, a higher percentage of PTCs among malignant neoplasms [9, 36–39] and a specific ultrasound image. Lee et al. [4] found that the AUS group more frequently had non-circumscribed margins and taller-than-wide shape than the FLUS group. The incidence of FTCs was significantly higher in the FLUS group than in the AUS group (33.3 vs. 1.6%) [4]. Similarly, Choi et al. [38] found that a spiculated margin, marked hypoechogenicity, and micro- or macrocalcifications were significantly more common in AUS than in FLUS. Interestingly, in the report from Turkey, where the risk of malignancy in Bethesda III thyroid nodules is lower (22.8%) than in the above-mentioned reports from South Korea, the only predictive features of malignancy were hypoechogenicity in the AUS group and peripheral vascularization in the FLUS group [2]. In the report from Brazil, with a similar risk of malignancy in Bethesda III thyroid nodules (22.6%), Rosario [9] showed that AUS presented a higher frequency of suspicious malignant US findings compared to FLUS, but evaluation of UMRFs allowed prediction of malignancy in both AUS and FLUS nodules. However, in that study PTCs also constituted 91.2% of all malignant tumours.

In our study the nodules of Bethesda category III were dominated by FLUS diagnoses. In only 2.1% of cases was the smear classified into this category because of the presence of nuclear atypia, corresponding to AUS. This can be explained by epidemiological circumstances and a continued high incidence of non-neoplastic thyroid nodules in our patients. But it may also be the consequence of a more conservative approach to formulating the FLUS diagnosis, which was limited to nodules on the boundary between follicular neoplasms and benign lesions. Consequently, the percentage of PTCs among malignant neoplasms was lower in our study (70%), which could decrease the effectiveness of UMRFs evaluation.

In the case of category IV of cytological diagnoses – SFN – the effectiveness of UMRFs analysis was even lower, as mentioned before. Recent reports seldom refer to this particular group of cytological diagnoses. This is the consequence of clinical recommendations that imply surgical treatment in such cases. Moreover, many previous reports showed that the evaluation of UMRFs was not useful in that category of cytological diagnoses [17]. Recently, Iskandar et al. [18] analysed categories III and IV of cytological diagnoses jointly and found that in such a group of nodules the examined UMRFs (microcalcifications, irregular borders, hypervascularity and hypoechogenicity) were not associated with malignancy. The malignancy rate in resected thyroid nodules was 13% for Bethesda III and 28% for Bethesda IV; PTCs constituted 72% of malignant neoplasms. Park et al. [7] found that in category IV the evaluation of UMRFs had the lowest efficiency in comparison with other groups of cytological diagnoses (the risk of malignancy in that group was 5.7%). On the other hand, in a group of category IV nodules with a 24.3% overall malignancy rate, Chng et al. [1] showed the usefulness of assessing irregular margins, hypoechogenicity, and taller-than-wide shape, despite the fact that the percentage of PTCs among malignant neoplasms was below 50% in that study. However, a considerable proportion (40%) of the patients in that group were not treated surgically.

In our study only the presence of macrocalcifications or of any type of calcifications (which obviously included macrocalcifications) increased the risk of malignancy in FL nodules (both FLUS and SFN), in comparison with cytological evaluation alone, to values above 15% (PPV for any type of calcifications – 25.9%, and for macrocalcifications – 20.9%), the threshold commonly assumed in the recommendations above which surgical treatment is advocated. An only slightly lower PPV (14.6%) was found for taller-than-wide shape. Similar PPV values were obtained for the set of UMRFs including 2 features with high SEN (hypoechogenicity and solid echostructure) and 2 features with high SPC (calcifications of any type and suspected shape). That set showed nearly 90% SPC and a SEN higher than the assessment of calcifications alone. Higher SEN was also observed in all the examined sets which included calcifications of any type. The sets of features which included only microcalcifications were less effective in the FL group, and the incidence of such sets was similar in benign and malignant nodules in that group. The lowest values were observed for the set proposed by the ATA and the TIRADS set. But both those sets are tailored to reveal the most common cancers, PTCs. The TIRADS set was specified by Kwak in a group of nodules with an unequivocal cytological result after excluding nodules with indeterminate cytology [27] and verified against a group of cancers with a low percentage of FTCs (1.7%) and a low percentage of the follicular variant of PTC (2.5%) [40]. Papillary cancers amounted to more than 95% of malignant neoplasms in that study. The authors found in such a group of thyroid nodules that the presence of macrocalcifications without accompanying microcalcifications was not useful diagnostically [27], which is in concordance with our observations in the group of nodules with unequivocal cytology.

Recently, several papers have been published on the evaluation of TIRADS in FL nodules. Yoon et al. [10] found significant differences in TIRADS category between benign and malignant nodules in the AUS subgroup, but not in the FLUS subgroup. In the group of nodules with a FLUS cytological result, PTCs amounted to 57% of malignant neoplasms, while in the AUS group they amounted to 100%. Park et al. [6] analysed the usefulness of TIRADS in nodules with two AUS/FLUS results and found that ultrasound features and TIRADS categories did not differ between benign and malignant nodules. Maia et al. [5] evaluated the usefulness of the TIRADS system in Bethesda categories III, IV and V with positive conclusions, but they used a modified version of the system with the addition of vascularity criteria assessed by Doppler analysis. In that study, too, PTCs constituted nearly 90% of all malignant neoplasms. On the other hand, Chng et al. [1] showed the usefulness of TIRADS scores 4C and 5 in predicting malignancy of category IV nodules, but that study had a retrospective design and nearly half of the patients did not have histological results.

In the UC group the analysis of all the sets of UMRFs increased SEN as much as several times in comparison with the evaluation of single features with low SEN, while preserving high SPC. PPV exceeded 60% for all the examined sets of features. But a direct comparison of PPV between the UC and FL groups, as well as between our study and the reports from other centres, is not possible. The PPV of UMRFs depends on the malignancy rate of nodules in a particular diagnostic category. Thus, special attention should be paid when interpreting PPV in the case of cytological category III, where the risk of malignancy ranges widely from 5 to 50% [31–35].
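The dependence of PPV on the malignancy rate follows from Bayes' rule and can be illustrated with a minimal sketch. The SEN and SPC values below are hypothetical, chosen only for illustration and not taken from our results; the function name `ppv` is likewise an assumption. The malignancy rates of 5% and 50% correspond to the limits of the range quoted above for category III, and 15% to the upper limit of the assumed risk.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule.

    prevalence is the malignancy rate in the diagnostic category.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical feature set: SEN and SPC are illustrative, not study data.
SEN, SPC = 0.60, 0.90

for rate in (0.05, 0.15, 0.50):
    print(f"malignancy rate {rate:.0%}: PPV = {ppv(SEN, SPC, rate):.1%}")
```

Under these assumed values, the same feature set yields a PPV of 24% at a 5% malignancy rate and about 86% at a 50% malignancy rate, although SEN and SPC are unchanged.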

Other factors that make comparison of the reported results difficult are independent of the examined population and are related to the expertise of the person performing the examination and the type of ultrasonograph used. The qualitative rather than quantitative character of UMRFs makes them more susceptible to variable interpretation by ultrasonographers, especially when they come from different centres and work with different equipment. Another issue is the variable way of defining the analysed features, e.g. the solid character of a nodule is assumed when the solid part of the nodule is greater than 50% [4] or 90% [40]. In some ultrasonographs additional software is used that facilitates visualisation of some UMRFs, e.g. microcalcifications characteristic of PTCs. It should be stressed that the assessment of microcalcifications is not an easy task. Bright reflections on ultrasound imaging in spongiform nodules may be confused with microcalcifications by less proficient sonographers [11]. For this reason, in our study all examinations were performed by experienced ultrasonographers with the same equipment.

Another important advantage of our study is that UMRFs evaluation was performed directly prior to biopsy. Therefore, the result of FNA did not influence that evaluation. We also limited our analysis to nodules verified by postoperative histopathological examination. Such a design has both advantages (certainty of the correct diagnosis of benign and malignant lesions) and disadvantages (a bias introduced by the additional selection of nodules). But in our material the difference between the risk of malignancy in FLUS nodules as determined by histopathological examination and that determined by cytological follow-up is small (3%) [25]. Our previous study also showed that there were no differences in the ultrasound image of FLUS nodules between the patients treated surgically and those treated conservatively. Patients treated surgically were younger and had larger nodules [25]. A disadvantage of our study is the relatively low number of cancers in the FL group, but it reflects the frequency of cancers in this cytological category in our population. Undoubtedly, extension of the study to a larger group of patients would be indicated. From the methodological point of view, the ideal study design would compare the effectiveness of UMRF analysis between iodine-sufficient and iodine-deficient populations. However, such an assessment would meet practical difficulties in securing uniform UMRF assessment in two geographically remote study groups.