Journal of Hebei University (Natural Science Edition) ›› 2019, Vol. 39 ›› Issue (1): 99-105. DOI: 10.3969/j.issn.1000-1565.2019.01.017


Convolutional neural network acceleration system based on FPGA

LI Xiaoyan1, ZHANG Xin1, YAN Xiaobing1, REN Deliang1, LI Yanqing2, FU Changjuan2

  • Received: 2018-09-02  Online: 2019-01-25  Published: 2019-01-25
  • Corresponding author: ZHANG Xin (b. 1966), male, from Chengde, Hebei; professor at Hebei University, working mainly on machine vision and image processing. E-mail: zhangxin@hbu.edu.cn
  • About the first author: LI Xiaoyan (b. 1994), female, from Shiyan, Hubei; M.S. candidate at Hebei University, working mainly on the integration of memristors and other novel electronic devices, and on embedded logic-control circuit design for such integration. E-mail: 18612969742@163.com
  • Supported by: National Natural Science Foundation of China (61674050)


  1. College of Telecommunications and Information Engineering, Hebei University, Baoding 071002, China; 2. Baoding Yonghong Foundry Machinery Factory, Baoding 072150, China

Abstract: Targeting the deployment of convolutional neural networks on a field-programmable gate array (FPGA), this paper proposes a scheme for parallel hardware acceleration of convolutional neural networks. By analyzing the structural characteristics of the network, the storage, reading, and movement of data are organized in a streaming, pipelined fashion, and the convolution units within each layer are unrolled to speed up the multiply-accumulate operations. The FPGA's inherently parallel structure and pipelined processing substantially improve computational efficiency: on object classification over the CIFAR-10 dataset, with no loss of accuracy and the clock running at 800 MHz, the design achieves roughly a 4x speedup over a mid-range Intel processor. Through loop unrolling, parallel processing, and a multi-stage pipeline, the forward propagation of the convolutional neural network is accelerated, meeting the needs of practical engineering tasks.

Key words: field-programmable gate array (FPGA), convolutional neural network, parallelization, pipelining, classification, acceleration

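The loop-unrolling idea described in the abstract can be sketched in C: fully unrolling the per-window multiply-accumulate loop lets a high-level-synthesis tool map the nine products of a 3x3 kernel onto parallel hardware multipliers. This is a minimal illustrative sketch, not the paper's implementation; the function name, kernel size, and integer data type are assumptions.

```c
#include <assert.h>

#define K 3  /* kernel size (illustrative) */

/* One output pixel of a KxK convolution: K*K multiplies and adds.
   In an HLS flow both loops would carry an UNROLL directive, so the
   nine multiply-accumulate operations become parallel hardware;
   plain C here only models the arithmetic. */
int conv3x3(const int win[K][K], const int w[K][K])
{
    int acc = 0;
    for (int i = 0; i < K; i++)      /* would be unrolled in HLS */
        for (int j = 0; j < K; j++)  /* would be unrolled in HLS */
            acc += win[i][j] * w[i][j];
    return acc;
}
```

Because the unrolled multiplies have no dependence on one another, only the final reduction into `acc` imposes ordering, which is what makes the parallel mapping possible.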

CLC number:
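The stream-style data movement mentioned in the abstract is commonly realized with line buffers: each pixel is read from memory once, while a full KxK window remains available every cycle for the convolution units. A minimal C sketch of this mechanism, with all names and sizes assumed for illustration:

```c
#include <assert.h>

#define W 8  /* image row width (illustrative) */
#define K 3  /* window size (illustrative) */

/* Line buffers hold the last K-1 image rows; each incoming pixel shifts
   the KxK window left by one column, so no pixel is fetched twice.
   In HLS the surrounding loop would be pipelined (one pixel per cycle). */
typedef struct {
    int rows[K - 1][W]; /* the two most recently completed rows */
    int win[K][K];      /* current KxK window */
} LineBuf;

void linebuf_push(LineBuf *lb, int col, int pixel)
{
    /* shift the window one column to the left */
    for (int i = 0; i < K; i++)
        for (int j = 0; j < K - 1; j++)
            lb->win[i][j] = lb->win[i][j + 1];

    /* new rightmost column: two buffered rows plus the fresh pixel */
    lb->win[0][K - 1] = lb->rows[0][col];
    lb->win[1][K - 1] = lb->rows[1][col];
    lb->win[2][K - 1] = pixel;

    /* age the line buffers at this column */
    lb->rows[0][col] = lb->rows[1][col];
    lb->rows[1][col] = pixel;
}
```

Feeding the image row by row, the window always covers the K most recent rows at the K most recent columns, which is exactly the operand the unrolled multiply-accumulate stage consumes each cycle.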