Journal of Hebei University (Natural Science Edition), 2016, Vol. 36, Issue (6): 650-656. DOI: 10.3969/j.issn.1000-1565.2016.06.013

Experimental comparison of two acceleration approaches for K-nearest neighbors

ZHAI Junhai, WANG Tingting, ZHANG Mingyang, WANG Yaoda, LIU Mingming

  1. College of Mathematics and Information Science, Hebei University, Baoding 071002, China
  • Received: 2016-07-11 Online: 2016-11-25 Published: 2016-11-25

Abstract: K-NN (K-nearest neighbors) is a well-known data mining algorithm with a wide range of applications. Its idea is simple and it is easy to implement. However, both the time complexity and the space complexity of K-NN are O(n), where n is the number of instances in the training set. On large training sets, and especially on big data sets, K-NN becomes very inefficient and may even be impracticable. This paper experimentally compares two acceleration approaches for K-NN, CNN and MapReduce-based K-NN, on 8 data sets. Specifically, K-NN is implemented with MapReduce in a Hadoop environment and compared experimentally with CNN. Some valuable conclusions are obtained, which may be useful for researchers in related fields.
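
To make the two acceleration ideas concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "CNN" refers to the condensed nearest neighbor rule (an instance-selection method that shrinks the training set), and it imitates the MapReduce data flow with plain Python functions rather than the Hadoop API. All function names (`knn_predict`, `condense`, `map_partition`, `reduce_candidates`) are hypothetical and chosen for illustration only.

```python
import heapq
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_predict(train, query, k=3):
    """Brute-force K-NN: one distance computation per training instance, i.e. O(n)."""
    neighbors = heapq.nsmallest(k, train, key=lambda item: dist(item[0], query))
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def condense(train):
    """Condensed nearest neighbor (assumed meaning of "CNN"): keep only the
    instances that the current subset misclassifies under 1-NN, and repeat
    passes over the training set until a full pass adds nothing."""
    kept = {0}                # indices already selected
    subset = [train[0]]       # start from one arbitrary instance
    changed = True
    while changed:
        changed = False
        for i, (x, y) in enumerate(train):
            if i in kept:
                continue
            if knn_predict(subset, x, k=1) != y:
                kept.add(i)
                subset.append((x, y))
                changed = True
    return subset


def map_partition(partition, query, k):
    """Map step (illustrative): a partition emits its k locally nearest
    (distance, label) pairs for the query."""
    pairs = ((dist(x, query), y) for x, y in partition)
    return heapq.nsmallest(k, pairs, key=lambda pair: pair[0])


def reduce_candidates(candidate_lists, k):
    """Reduce step (illustrative): merge the local candidates, keep the
    global k nearest, and take a majority vote."""
    merged = (pair for candidates in candidate_lists for pair in candidates)
    nearest = heapq.nsmallest(k, merged, key=lambda pair: pair[0])
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

The two approaches trade off differently: condensation reduces n once, before any query, at the risk of discarding informative instances, while the map/reduce split keeps every instance and spreads the O(n) scan across partitions. Because each partition emits its own k nearest candidates, the merged result contains the global k nearest neighbors, so the partitioned vote matches the brute-force one up to ties in distance.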

Key words: K-nearest neighbors, data mining, MapReduce, Hadoop
