Weighted zeroth-order stochastic gradient descent algorithm with variance reduction
LU Shuxia1, ZHANG Luohuan1, CAI Lianxiang1, SUN Lili2
1. Key Laboratory of Machine Learning and Computational Intelligence of Hebei Province, College of Mathematics and Information Science, Hebei University, Baoding 071002, China; 2. Hebei Education Examinations Authority, Shijiazhuang 050091, China
LU Shuxia, ZHANG Luohuan, CAI Lianxiang, SUN Lili. Weighted zeroth-order stochastic gradient descent algorithm with variance reduction[J]. Journal of Hebei University (Natural Science Edition), 2019, 39(5): 536-546.