Journal of Hebei University (Natural Science Edition) ›› 2017, Vol. 37 ›› Issue (6): 640-651. DOI: 10.3969/j.issn.1000-1565.2017.06.012
ZHAI Junhai1, ZHANG Sufang2, HAO Pu1
Received: 2017-09-09
Published: 2017-11-25
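Corresponding author:
ZHANG Sufang (b. 1966), female, from Li County, Hebei; associate professor at the Hebei Branch of the China Meteorological Administration Training Centre; main research area: machine learning. E-mail: mczsf@126.com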
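About the author:
ZHAI Junhai (b. 1964), male, from Yi County, Hebei; professor at Hebei University, Ph.D.; main research areas: machine learning and data mining. E-mail: mczjh@126.com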
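Abstract: Deep learning is currently the most active research direction in machine learning; AlphaGo, which drew worldwide attention, was trained with deep learning algorithms. The convolutional neural network (CNN) is a model trained with deep learning algorithms. It is widely used in computer vision and has been highly successful there. This paper has two main aims. The first is to help readers understand CNNs in depth, covering network architecture, core concepts, operations, and training. The second is to survey recent research progress on CNNs, focusing on four aspects: activation functions, pooling, training, and applications. The challenges CNNs face and the current hot research directions are also discussed. The paper is intended as a reference for researchers working in related areas.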
Cite this article: ZHAI Junhai, ZHANG Sufang, HAO Pu. Convolutional neural network and its research advances[J]. Journal of Hebei University (Natural Science Edition), 2017, 37(6): 640-651.