Journal of Hebei University (Natural Science Edition) ›› 2018, Vol. 38 ›› Issue (5): 549-554. DOI: 10.3969/j.issn.1000-1565.2018.05.016


Research on archives text classification based on Naive Bayes

LIU Peixin, YU Hongzhi, XU Tao

  1. Key Laboratory of China's Ethnic Languages and Information Technology, Northwest Minzu University, Lanzhou 730030, China
  • Received: 2017-06-05 Online: 2018-09-25 Published: 2018-09-25

Abstract: This paper applies the Naive Bayesian classification algorithm to the archive data resources of Gansu Province in order to classify archive resources automatically. Based on the characteristics of the archive data, feature attributes suited to the archive text were selected, with the TFIDF algorithm used for feature attribute selection. The experimental results show that the classification model is suitable for archival text resources and that it realizes automatic classification of archives. Compared with the traditional Naive Bayesian classification method, the proposed model improves classification efficiency on archives by 1% to 2%, making it a more effective classification model for archives.
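
The pipeline summarized in the abstract (TFIDF-based feature selection followed by Naive Bayesian classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the smoothed IDF formula, the rule of keeping the k terms with the highest summed TFIDF weight, the multinomial model with Laplace smoothing, and the toy documents and labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenized document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc) or 1
        vectors.append({t: (c / total) * idf[t] for t, c in tf.items()})
    return vectors


def select_features(vectors, k):
    """Keep the k terms with the highest summed tf-idf weight (assumed rule)."""
    score = Counter()
    for vec in vectors:
        score.update(vec)  # adds the weights term by term
    return {term for term, _ in score.most_common(k)}


def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing over the selected vocab."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        term_counts[label].update(t for t in doc if t in vocab)
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        log_like = {
            t: math.log((term_counts[label][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, doc):
    """Return the label with the highest log posterior for a tokenized doc."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in doc if t in log_like)
    return max(model, key=score)


# Hypothetical toy corpus; real archive text would first need word segmentation.
docs = [
    ["land", "deed", "transfer", "record"],
    ["land", "survey", "boundary", "record"],
    ["personnel", "salary", "appointment", "record"],
    ["personnel", "transfer", "appointment", "notice"],
]
labels = ["land", "land", "personnel", "personnel"]

vocab = select_features(tfidf_vectors(docs), k=5)
model = train_nb(docs, labels, vocab)
```

In this sketch, terms outside the selected vocabulary are ignored at both training and prediction time, which is one simple way for feature selection to change the behavior of an otherwise standard Naive Bayesian classifier.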

Key words: archives text resource, file feature, text classification, Naive Bayesian classification
