Journal of Hebei University (Natural Science Edition) ›› 2019, Vol. 39 ›› Issue (5): 536-546. DOI: 10.3969/j.issn.1000-1565.2019.05.015


Weighted zeroth-order stochastic gradient descent algorithm with variance reduction

LU Shuxia1, ZHANG Luohuan1, CAI Lianxiang1, SUN Lili2

  1. Key Laboratory of Machine Learning and Computational Intelligence of Hebei Province, College of Mathematics and Information Science, Hebei University, Baoding 071002, China; 2. Hebei Education Examinations Authority, Shijiazhuang 050091, China

  • Received: 2019-01-06  Online: 2019-09-25  Published: 2019-09-25
  • First author: LU Shuxia (1966—), female, from Baoding, Hebei Province; professor and Ph.D. at Hebei University; her main research interest is machine learning. E-mail: cmclusx@126.com
  • Funding: Natural Science Foundation of Hebei Province (F2015201185)


Abstract: Stochastic gradient descent (SGD) is one of the most efficient methods for solving machine learning problems. However, on imbalanced data the traditional SGD algorithm samples majority-class points with much higher probability than minority-class points during training, which easily leads to unbalanced computation. When the objective function is non-differentiable or difficult to differentiate, the computation is too expensive or impossible. Moreover, approximating the full gradient by a single sample gradient in each iteration inevitably introduces variance, which seriously degrades the classification performance of the algorithm. To address these problems, a weighted zeroth-order stochastic gradient descent algorithm with variance reduction is proposed. Taking the margin distribution of the data into account, a margin-mean term is introduced into the objective function, and smaller weights are assigned to majority-class samples while larger weights are assigned to minority-class samples. In solving the optimization problem, a zeroth-order optimization method is used to estimate the gradient, and a variance-reduction strategy is introduced. Experiments on several imbalanced datasets demonstrate the effectiveness of the proposed algorithm in solving the above problems.

CLC number: TP181  Document code: A  Article ID: 1000-1565(2019)05-0536-11
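The abstract combines three ingredients: class-dependent weights on a hinge loss with a margin-mean term, a two-point zeroth-order gradient estimate, and an SVRG-style variance-reduction correction. The sketch below illustrates how these pieces fit together; it is a minimal illustration, not the paper's exact formulation — the per-sample loss `f_i`, the weighting scheme `c`, and all hyperparameters (`mu`, `eta`, `theta`, `lam`) are assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_dir(d):
    # Random unit direction used by the two-point estimate.
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)

def zo_grad(f, w, u, mu=1e-4):
    # Two-point zeroth-order estimate along u: d * (f(w + mu*u) - f(w)) / mu * u.
    # Only function values of f are needed, never its derivative.
    return w.size * (f(w + mu * u) - f(w)) / mu * u

def make_loss(X, y, c, lam=1e-3, theta=0.1):
    # Hypothetical per-sample objective: L2 regularizer + weighted hinge loss
    # - theta * margin (the margin-mean term, rewarding a large mean margin).
    def f_i(i):
        def f(w):
            m = y[i] * (X[i] @ w)
            return 0.5 * lam * (w @ w) + c[i] * max(0.0, 1.0 - m) - theta * m
        return f
    return f_i

def wzo_svrg(X, y, c, epochs=10, eta=0.05):
    n, d = X.shape
    f_i = make_loss(X, y, c)
    w_snap = np.zeros(d)
    for _ in range(epochs):
        # Full zeroth-order gradient at the snapshot: the variance-reduction anchor.
        g_snap = np.mean([zo_grad(f_i(i), w_snap, rand_dir(d)) for i in range(n)],
                         axis=0)
        w = w_snap.copy()
        for _ in range(n):
            i = rng.integers(n)
            u = rand_dir(d)  # reuse one direction for both evaluations
            g = zo_grad(f_i(i), w, u) - zo_grad(f_i(i), w_snap, u) + g_snap
            w -= eta * g
        w_snap = w
    return w_snap
```

In this sketch the class weights `c` would typically be chosen inversely proportional to class frequency, so that minority-class samples receive the larger weights the abstract describes; reusing the same random direction `u` for both snapshot and current iterate is what lets the correction term cancel estimation noise.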

Key words: zeroth-order optimization, stochastic gradient descent, variance reduction, imbalanced data, support vector machine
