Journal of Hebei University (Natural Science Edition) ›› 2017, Vol. 37 ›› Issue (1): 108-112. DOI: 10.3969/j.issn.1000-1565.2017.01.016

Research on document similarity based on synonymous relationships between terms

ZHANG Xizhong¹, XU Jianmin²

  1. Institute of Information Technology, Baoding Education Examinations Authority, Baoding 071000, China; 2. School of Computer Science and Technology, Hebei University, Baoding 071002, China
  • Received: 2016-10-10 Online: 2017-01-25 Published: 2017-01-25

Abstract: Because the vector space model (VSM) assumes that terms are orthogonal to one another, it cannot accurately reflect the similarity between documents that are described by different terms. To address this problem, this paper first defines the similarity between two term sets and gives a method for computing it, and on that basis proposes a quantitative method for calculating the similarity between two documents. Experiments on scientific and technical literature and on news stories compare the classification accuracy of VSM and the new method; the results indicate that the new method improves classification accuracy.
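
The orthogonality problem described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's formulas, so the code below is not the authors' method: it substitutes the well-known soft cosine measure, and the synonym table `SYNONYMS`, its similarity values, and the function names are all hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical synonym table (an assumption, not taken from the paper):
# each unordered pair of terms maps to an assumed similarity in [0, 1].
SYNONYMS = {
    frozenset(["car", "automobile"]): 1.0,
    frozenset(["fast", "quick"]): 0.9,
}


def term_sim(a, b):
    """Similarity between two terms: 1 if identical, table value if listed, else 0."""
    if a == b:
        return 1.0
    return SYNONYMS.get(frozenset([a, b]), 0.0)


def vsm_cosine(doc_a, doc_b):
    """Plain VSM cosine on term counts: different terms contribute nothing."""
    ca, cb = Counter(doc_a), Counter(doc_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def soft_cosine(doc_a, doc_b):
    """Soft cosine: every pair of terms is weighted by term_sim."""
    ca, cb = Counter(doc_a), Counter(doc_b)

    def bilinear(x, y):
        return sum(x[s] * y[t] * term_sim(s, t) for s in x for t in y)

    norm = math.sqrt(bilinear(ca, ca) * bilinear(cb, cb))
    return bilinear(ca, cb) / norm if norm else 0.0


doc_a = ["fast", "car"]
doc_b = ["quick", "automobile"]
print(vsm_cosine(doc_a, doc_b))   # no shared terms, so the plain cosine is 0
print(soft_cosine(doc_a, doc_b))  # synonym pairs raise the score above 0
```

Under these assumed values, two documents that share no terms but use synonymous ones score zero with the plain cosine and close to one with the soft variant, which is the kind of gap the abstract attributes to VSM.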

Key words: synonymy, similarity between two terms, similarity between two documents
