Journal of Hebei University (Natural Science Edition) ›› 2024, Vol. 44 ›› Issue (4): 441-448. DOI: 10.3969/j.issn.1000-1565.2024.04.013


Semi-supervised skin cancer diagnosis based on self-feedback threshold learning

HAN Shuo1, YUAN Weicheng1, DU Zeyu2

  1. College of Basic Medicine, Hebei Medical University, Shijiazhuang 050017, China; 2. School of Health Science, University of Manchester, Manchester M139PL, UK
  • Received: 2024-01-08 Online: 2024-07-25 Published: 2024-07-12
  • About the author: HAN Shuo (born 1974), male, lecturer at Hebei Medical University, PhD; mainly engaged in basic research on tumors and neurodegenerative diseases and in medical applications of artificial intelligence. E-mail: hanshuo@hebmu.edu.cn
  • Supported by:
    Natural Science Foundation of Hebei Province (H2019206316)

Abstract: Supervised skin cancer diagnosis models require large amounts of annotated data, and annotation by medical experts is costly, time-consuming, and prone to fatigue. To address these problems, this study proposes a semi-supervised skin cancer diagnosis method based on Self-Feedback Threshold Learning (SFTL). Starting from a ResNet pre-trained on labeled data, a pseudo-label self-feedback threshold learning mechanism, operating at both the global and the local (class-wise) level, dynamically selects the unlabeled samples whose ResNet prediction probabilities exceed the self-feedback threshold. The model is trained with an unsupervised threshold learning loss together with a classification cross-entropy loss, which mines diagnostic information from unlabeled data when labeled samples are scarce and significantly reduces the misclassification rate on unlabeled skin lesion images. Experiments on the publicly available HAM10000 skin lesion dataset achieve an accuracy of 0.8229 and an F1 score of 0.7651 with only 50% of the data labeled. The results show that the proposed SFTL model effectively handles skin cancer diagnosis in semi-supervised scenarios and achieves better classification performance than the other methods compared.

Key words: semi-supervised skin cancer diagnosis, self-feedback threshold learning, convolutional neural network, semi-supervised learning

  • CLC number: U492.2; TP301.6
  • Document code: A
  • Article ID: 1000-1565(2024)04-0441-08
  • Revised: 2024-05-23
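
The abstract does not give the SFTL update rule, so the following is only a minimal sketch of how pseudo-label selection with a global and a class-wise (local) threshold could look. It assumes an exponential-moving-average update in the style of self-adaptive thresholding methods such as FreeMatch, which is a stand-in and not necessarily the authors' formulation. The function names, the `momentum` parameter, and the normalization by the largest class-wise value are all hypothetical.

```python
import math


def update_thresholds(probs, global_t, local_p, momentum=0.999):
    """Assumed EMA update of a global confidence level and class-wise levels.

    probs: list of per-sample softmax outputs on unlabeled images.
    global_t: running mean of the model's top-1 confidence.
    local_p: running mean predicted probability for each class.
    """
    n = len(probs)
    num_classes = len(local_p)
    # Global signal: how confident the model currently is on average.
    mean_conf = sum(max(p) for p in probs) / n
    global_t = momentum * global_t + (1.0 - momentum) * mean_conf
    # Local signal: average probability mass assigned to each class.
    class_mean = [sum(p[c] for p in probs) / n for c in range(num_classes)]
    local_p = [momentum * old + (1.0 - momentum) * new
               for old, new in zip(local_p, class_mean)]
    # Per-class threshold: global level scaled by the normalized local level,
    # so rarely predicted classes get a lower bar (an assumption of this sketch).
    peak = max(local_p)
    class_t = [global_t * lp / peak for lp in local_p]
    return global_t, local_p, class_t


def select_pseudo_labels(probs, class_t):
    """Keep samples whose top-1 probability exceeds their class threshold."""
    pseudo = [max(range(len(p)), key=p.__getitem__) for p in probs]
    mask = [p[k] > class_t[k] for p, k in zip(probs, pseudo)]
    return pseudo, mask


def unlabeled_loss(probs, pseudo, mask):
    """Cross-entropy against pseudo-labels, counting only selected samples."""
    total = sum(-math.log(p[k])
                for p, k, keep in zip(probs, pseudo, mask) if keep)
    return total / len(probs)
```

In a training loop, a loss of this kind would be added to the ordinary cross-entropy on the labeled batch. Both thresholds rise as the network grows more confident, which is one way to read the "self-feedback" in the method's name.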