Paper

Designing Novel AAD Pooling in Hardware for a Convolutional Neural Network Accelerator

Volume Number: 30
Issue Number: 3
Pages: 303-314
Publication Date: 25 January 2022
Author(s)
Kasem Khalil, Omar Eldash, Ashok Kumar and Magdy Bayoumi


Abstract

Convolutional neural network (CNN) hardware accelerators for specialized Internet of Things (IoT) applications requiring high accuracy are an emerging research topic. The pooling module in a CNN pipeline impacts both the speed and accuracy of a classification task. This work proposes the design and hardware implementation of a novel pooling method, absolute average deviation (AAD), for CNN accelerators. AAD exploits the spatial locality of pixels using vertical and horizontal deviations to achieve higher accuracy, lower area, and lower power consumption than mixed pooling without increasing the computational complexity. AAD is tested on four different datasets: EEG, ImageNet, Common Objects in Context (COCO), and United States Postal Service (USPS), and on multiple CNN structures: CNN, VGG16, VGG19, ResNet, and DenseNet. In hardware, AAD is implemented using Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL) on an Altera Arria10 GX field-programmable gate array (FPGA) and in 45-nm technology using Synopsys Design Compiler. The area and power consumption are found to be 244.46 nm² and 0.31 mW, respectively. AAD achieves 98% accuracy with lower computational and hardware costs compared to mixed pooling, making it an ideal pooling mechanism for an IoT CNN accelerator.
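
The abstract names the pooling method but does not state its formula. As a rough illustration only, the sketch below pools each non-overlapping window by the textbook average absolute deviation from the window mean. This definition, the function name `aad_pool`, and the `size` parameter are assumptions made for this example. The paper's AAD, which combines vertical and horizontal deviations, may be computed differently.

```python
from statistics import fmean


def aad_pool(image, size=2):
    """Pool non-overlapping size x size windows of a 2-D list.

    ASSUMPTION: each output value is the textbook average absolute
    deviation, mean(|x - mean(window)|). This is not taken from the
    paper, whose AAD uses vertical and horizontal deviations.
    """
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        out_row = []
        for c in range(0, cols - size + 1, size):
            # Gather the pixels of the current window in row-major order.
            window = [image[r + i][c + j]
                      for i in range(size) for j in range(size)]
            mean = fmean(window)
            # Average distance of the window's pixels from their mean.
            out_row.append(fmean([abs(x - mean) for x in window]))
        pooled.append(out_row)
    return pooled
```

Under this assumed definition, a flat window pools to zero and a window with large local variation pools to a larger value, so the output reflects local contrast, not peak or mean intensity.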