Journal of Hebei University (Natural Science Edition) ›› 2020, Vol. 40 ›› Issue (1): 77-86. DOI: 10.3969/j.issn.1000-1565.2020.01.012


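An improved collaborative filtering recommendation algorithm

LI Kunlun (李昆仑), RONG Jingyue (戎静月), SU Huading (苏华仃)

  1. College of Electronic Information Engineering, Hebei University, Baoding 071000, China
  • Received: 2019-07-03 Online: 2020-01-25 Published: 2020-01-25
  • Corresponding author: RONG Jingyue (b. 1993), female, from Shijiazhuang, Hebei; master's student at Hebei University. Research interests: recommendation algorithms, machine learning, and data analysis. E-mail: rongjingyueRRR@163.com
  • About the author: LI Kunlun (b. 1962), male, from Baoding, Hebei; professor at Hebei University, Ph.D. Research interests: pattern recognition, image processing, computer networks, and intelligent information processing. E-mail: likunlun@hbu.edu.cn
  • Funding:
    National Natural Science Foundation of China (61672205)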

Abstract: Collaborative filtering is one of the most widely used algorithms in personalized recommender systems, but it faces data sparsity, cold-start, and scalability problems. To address the inaccurate recommendations caused by data sparsity and cold start, this paper proposes an improved data-filling method and an improved similarity calculation. For data filling, users are first hierarchically clustered according to their rating habits. Next, basic user information such as age is used to compute a preliminary similarity between users, and the proportion of co-rated items is applied as a weight to obtain the user similarity. Finally, the Slope-one algorithm computes fill values from the top K most similar users, and these are weighted by similarity to give the final fill value. When computing similarity to find the nearest-neighbor set, basic user attributes are used as a similarity weight, and a Sigmoid function is introduced to account for the effect of rating timestamps, giving the final similarity calculation. Experimental results show that recommendation accuracy improves significantly and that the data sparsity and cold-start problems are alleviated.

Key words: collaborative filtering, data sparsity, similarity, Sigmoid, scoring scale
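
The abstract does not give the paper's formulas, so the following is only a minimal Python sketch of the two ideas it names: a similarity-weighted Slope-one fill over the top K neighbors, and a Sigmoid-shaped time weight. The function names, the parameters `k`, `half_point`, and `scale`, the exact weighting scheme, and the sample ratings are all illustrative assumptions, not the authors' method.

```python
import math


def weighted_slope_one_fill(ratings, sims, user, item, k=2):
    """Estimate a missing rating of `user` for `item`.

    ratings: {user: {item: rating}}; sims: {neighbor: similarity to `user`}.
    Assumption: deviations come only from the k most similar neighbors that
    rated `item`, and each neighbor's deviation is weighted by its similarity.
    """
    neighbors = sorted(
        (v for v in ratings
         if v != user and item in ratings[v] and sims.get(v, 0.0) > 0.0),
        key=lambda v: sims[v],
        reverse=True,
    )[:k]
    estimates = []
    for other_item, own_rating in ratings[user].items():
        num = den = 0.0
        for v in neighbors:
            if other_item in ratings[v]:
                # Slope-one deviation between the target item and other_item.
                num += sims[v] * (ratings[v][item] - ratings[v][other_item])
                den += sims[v]
        if den > 0.0:
            estimates.append(own_rating + num / den)
    # None signals that no neighbor shares a rated item with the user.
    return sum(estimates) / len(estimates) if estimates else None


def time_weight(age_days, half_point=180.0, scale=60.0):
    """Hypothetical Sigmoid decay: older ratings get a weight closer to 0."""
    return 1.0 / (1.0 + math.exp((age_days - half_point) / scale))


# Toy data (invented for illustration only).
ratings = {
    "u": {"a": 4.0, "b": 2.0},
    "v": {"a": 5.0, "b": 3.0, "c": 4.0},
    "w": {"a": 3.0, "c": 5.0},
}
sims_to_u = {"v": 0.75, "w": 0.25}
filled = weighted_slope_one_fill(ratings, sims_to_u, "u", "c")  # 3.375
```

In this sketch a time weight such as `time_weight(30.0)` could multiply a neighbor's similarity before the fill step; whether the paper combines the two this way is not stated in the abstract.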

CLC number: