Journal of Hebei University(Natural Science Edition) ›› 2025, Vol. 45 ›› Issue (2): 192-203.DOI: 10.3969/j.issn.1000-1565.2025.02.010


Multimodal sentiment analysis method based on improved FGM-CM-BERT model

LI Renzheng1, GAO Guandong2, SONG Shengzun3, XIAO Ke1   

  1. College of Information Science, Hebei Agricultural University, Baoding 071001, China; 2. Department of Information Management, The National Police University for Criminal Justice, Baoding 071000, China; 3. Department of Penology, The National Police University for Criminal Justice, Baoding 071000, China
  • Received:2024-10-12 Published:2025-03-26

Abstract: In response to the challenges of weak generalization ability and inefficient feature fusion in multimodal sentiment analysis methods using speech-text data, an improved FGM-CM-BERT model was proposed. The Fast Gradient Method (FGM) was enhanced for adversarial training to improve model generalization, and a multi-head attention mechanism was employed to extract and fuse multimodal features for improved accuracy. Firstly, based on the characteristics of the multimodal data, an adaptive parameter adjustment strategy for the FGM weight function was introduced, applying input-dependent adaptive perturbations at the embedding layer to strengthen the model's generalization ability. Secondly, a multi-head self-attention mechanism was proposed at the cross-modal interaction layer, cross-fusing textual queries with audio key-value pairs to strike a good balance between feature fusion efficiency and model complexity. Finally, experiments on the CMU-MOSI and CMU-MOSEI datasets compared the model against 15 commonly used baseline models. The results demonstrated improved accuracy over the baselines in both seven-class emotion score classification and binary emotion classification, reaching 48.2% and 87.5% respectively, validating the effectiveness of the proposed method.

Received: 2024-10-12; Revised: 2024-12-11. Funding: National Natural Science Foundation of China (31801782); Hebei Provincial Social Science Foundation (HB21ZZ002). First author: LI Renzheng (1998–), male, M.S. candidate at Hebei Agricultural University, research interest: natural language processing. E-mail: 1064149655@qq.com. Corresponding author: GAO Guandong (1979–), male, professor and Ph.D. at The National Police University for Criminal Justice, research interests: machine vision and big data analysis. E-mail: gaoguandong@sina.com
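The FGM step described in the abstract can be sketched in plain Python. The standard FGM perturbation is r_adv = eps * g / ||g||_2 along the gradient direction; the `adaptive_eps` rule shown here (scaling a base epsilon by the RMS magnitude of the input embedding, so stronger inputs receive proportionally larger perturbations) is an illustrative assumption, not the paper's exact weight function.

```python
import math

def fgm_perturbation(grad, eps):
    """Standard FGM step: r_adv = eps * g / ||g||_2 (gradient direction)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [eps * g / norm for g in grad]

def adaptive_eps(embedding, base_eps=1.0):
    """Illustrative input-dependent strategy (assumption): scale base_eps
    by the RMS magnitude of the embedding vector."""
    rms = math.sqrt(sum(x * x for x in embedding) / len(embedding))
    return base_eps * rms

# Training-loop usage (schematic):
# 1. backward() on the clean batch to obtain embedding gradients;
# 2. add the perturbation to the embedding and backward() again;
# 3. restore the original embedding before the optimizer step.
emb = [0.5, -0.5, 0.5, -0.5]
grad = [3.0, 4.0, 0.0, 0.0]
r = fgm_perturbation(grad, adaptive_eps(emb))
```

Note that the perturbation norm equals the chosen epsilon by construction, which is what makes the adaptive scaling meaningful: the strength of the adversarial noise tracks the strength of the input.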
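The cross-modal fusion idea can likewise be sketched: text tokens act as queries and audio frames as key-value pairs, and with several heads each head attends over a slice of the feature dimension. This is a minimal sketch under simplifying assumptions — the learned Q/K/V projection matrices of a real multi-head attention layer are omitted, and all function names are illustrative.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: each text query attends over
    all audio key-value pairs. queries: [Tq][d], keys/values: [Tk][d]."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def multi_head_cross_attention(queries, keys, values, num_heads):
    """Split the feature dimension into heads, attend per head, and
    concatenate the per-head outputs (learned projections omitted)."""
    d = len(queries[0])
    hd = d // num_heads
    def slice_head(rows, h):
        return [row[h * hd:(h + 1) * hd] for row in rows]
    heads = [cross_attention(slice_head(queries, h),
                             slice_head(keys, h),
                             slice_head(values, h))
             for h in range(num_heads)]
    return [sum((heads[h][i] for h in range(num_heads)), [])
            for i in range(len(queries))]

# One text token attending over two audio frames; one-hot values make the
# fused output equal to the attention weight vector.
text_q = [[1.0, 0.0]]
audio_k = [[1.0, 0.0], [0.0, 1.0]]
audio_v = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(text_q, audio_k, audio_v)
```

Because the text query aligns with the first audio frame, the first attention weight dominates — the fused representation is pulled toward the acoustically matching frame, which is the mechanism the model exploits for efficient speech-text fusion.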

Key words: multimodal sentiment analysis, fast gradient method, multi-head attention mechanism, adversarial training, adaptive perturbation, cross-modal feature fusion

CLC Number: TP391.1