河北大学学报(自然科学版) ›› 2009, Vol. 29 ›› Issue (6): 653-657.DOI: 10.3969/j.issn.1000-1565.2009.06.022

• • 上一篇    下一篇

一种改进的迭代样本挑选算法

刘春荣,吴博   

  1. 河北大学,数学与计算机学院,河北,保定,071002
  • 出版日期:2009-11-25 发布日期:2009-11-25
  • 基金资助:
    河北省教育厅科学研究计划支持项目

A New Iterative Algorithm Based for Sample Selection

LIU Chun-rong,WU Bo   

  • Online:2009-11-25 Published:2009-11-25

摘要: 针对近邻法分类需要大量计算和存储的缺点,提出了一种改进的样本挑选算法(different iterative case filtering,DICF).该算法首先评价每个样本的分类能力,据此不断删除分类能力弱的样本,迭代执行此过程,直到压缩子集不再变小为止. 经分析得出DICF算法时间复杂度为O(n~2). 在真实数据库上的实验结果表明,通过DICF算法得到的压缩集在压缩比、分类精度上均优于MCS,ICF, ENN等经典算法.

关键词: 近邻法, iterative case filtering, minimal consistent set

Abstract: To overcome the drawbacks that nearest neighbour classification requires huge computation and memory storage, this paper proposes a improved algorithm (different iterative case filtering,DICF). In this algorithm, a set of samples with the poor classification ability is removed from the training set in each iteration until the training set is no longer getting smaller. It can be seen from analysis that time complexity of DICF is O (n~2). The experimental results on some real data sets demonstrate the effectiveness and the feasibility of the proposed algorithm. Compared to traditional methods, such as MCS,ICF and ENN, the condensed sets obtained by DICF is superior in storage and classification accuracy.

Key words: nearest neighbor, iterative case filtering, minimal consistent set

中图分类号: