Background: This study investigated the diagnostic performance of biopsy criteria in four society ultrasonography risk stratification systems (RSSs) for thyroid nodules, including the 2021 Korean (K)-Thyroid Imaging Reporting and Data System (TIRADS).

Methods: The Ovid-MEDLINE, Embase, Cochrane, and KoreaMed databases were searched, and a manual search was also conducted, to identify original articles investigating the diagnostic performance of biopsy criteria for thyroid nodules (≥1 cm) in four widely used society RSSs.

Results: Eleven articles were included. The pooled sensitivity and specificity were 82% (95% confidence interval [CI], 74% to 87%) and 60% (95% CI, 52% to 67%) for the American College of Radiology (ACR)-TIRADS, 89% (95% CI, 85% to 93%) and 34% (95% CI, 26% to 42%) for the American Thyroid Association (ATA) system, 88% (95% CI, 81% to 92%) and 42% (95% CI, 22% to 67%) for the European (EU)-TIRADS, and 96% (95% CI, 94% to 97%) and 21% (95% CI, 17% to 25%) for the 2016 K-TIRADS. The sensitivity and specificity were 76% (95% CI, 74% to 79%) and 50% (95% CI, 49% to 52%) for the 2021 K-TIRADS1.5 (the 2021 K-TIRADS applied with a 1.5-cm size cut-off for intermediate-suspicion nodules). The pooled unnecessary biopsy rates of the ACR-TIRADS, ATA system, EU-TIRADS, and 2016 K-TIRADS were 41% (95% CI, 32% to 49%), 65% (95% CI, 56% to 74%), 68% (95% CI, 60% to 75%), and 79% (95% CI, 74% to 83%), respectively. The unnecessary biopsy rate was 50% (95% CI, 47% to 53%) for the 2021 K-TIRADS1.5.

Conclusion: The unnecessary biopsy rate of the 2021 K-TIRADS1.5 was substantially lower than that of the 2016 K-TIRADS and comparable to that of the ACR-TIRADS. The 2021 K-TIRADS may help reduce potential harm due to unnecessary biopsies.
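
As an illustration of how the per-study measures reported above can be derived, the following is a minimal Python sketch that computes sensitivity, specificity, and an unnecessary biopsy rate from a single 2x2 table of biopsy indication against final diagnosis. The class name `BiopsyTable` and the counts are hypothetical and are not taken from any included study. The abstract does not define the unnecessary biopsy rate; the sketch assumes it is the proportion of benign nodules among all nodules for which biopsy is indicated, and other definitions would give different values. The pooled estimates come from meta-analytic pooling across studies, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiopsyTable:
    """2x2 table: biopsy indication under an RSS vs. final diagnosis."""

    tp: int  # malignant nodules with a biopsy indication
    fp: int  # benign nodules with a biopsy indication
    fn: int  # malignant nodules without a biopsy indication
    tn: int  # benign nodules without a biopsy indication

    def sensitivity(self) -> float:
        # Share of malignant nodules for which biopsy is indicated.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of benign nodules for which biopsy is not indicated.
        return self.tn / (self.tn + self.fp)

    def unnecessary_biopsy_rate(self) -> float:
        # ASSUMPTION: benign nodules among all nodules with a biopsy
        # indication. The abstract does not state the definition used.
        return self.fp / (self.tp + self.fp)


# Hypothetical counts for illustration only, not study data.
table = BiopsyTable(tp=80, fp=120, fn=20, tn=180)

print(f"sensitivity: {table.sensitivity():.0%}")
print(f"specificity: {table.specificity():.0%}")
print(f"unnecessary biopsy rate: {table.unnecessary_biopsy_rate():.0%}")
```

With these hypothetical counts, raising the size cut-off for a nodule category would move nodules from the "biopsy indicated" cells to the "not indicated" cells, which is the trade-off between sensitivity and unnecessary biopsies that the results describe.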