Performance-Aware Coarse-Grained Reconfigurable Logic Accelerator for Deep Learning Application
Authors
Mercado Rejas, Katherine
Abstract
Deep neural networks (DNNs) are widely deployed in cognitive applications including computer vision, speech recognition, and image processing. The high accuracy and performance of DNNs come at the cost of high computational complexity. Consequently, software implementations of DNNs and convolutional neural networks (CNNs) are often hindered by computational and communication bottlenecks. Numerous hardware accelerators have therefore been introduced in recent years to accelerate DNNs and CNNs. Despite their effectiveness, existing hardware accelerators are often confronted by considerable computational complexity and the need for dedicated hardware units to implement each DNN/CNN operation. To address these challenges, a reconfigurable DNN/CNN accelerator is proposed in this work. The proposed architecture comprises nine processing elements (PEs) that can perform both convolution and arithmetic operations through run-time reconfiguration with minimal overhead. To reduce computational complexity, we employ Mitchell's algorithm, supported through low-overhead coarse-grained reconfigurability. To facilitate efficient data flow across the PEs, we pre-compute the dataflow paths and configure the dataflow at runtime. The proposed design is realized on a field-programmable gate array (FPGA) platform for evaluation.
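The abstract's use of Mitchell's algorithm refers to a classic logarithmic multiplication approximation: each operand is approximated in the log domain as its leading-one position plus a linear mantissa term, so a multiply reduces to an addition. The sketch below is a minimal software illustration of that idea (not the thesis's hardware implementation); the function name `mitchell_mul` and the float-based mantissa handling are illustrative assumptions for clarity.

```python
def mitchell_mul(a: int, b: int) -> float:
    """Approximate a * b (a, b > 0) using Mitchell's logarithm approximation.

    Writes a = 2^k1 * (1 + f1), b = 2^k2 * (1 + f2) with 0 <= f < 1,
    approximates log2(a) ~ k1 + f1 and log2(b) ~ k2 + f2, adds the logs,
    and applies the matching antilog approximation 2^(K+F) ~ 2^K * (1 + F).
    """
    # Characteristic: position of the leading one bit.
    k1 = a.bit_length() - 1
    k2 = b.bit_length() - 1
    # Linear mantissa approximation: the bits below the leading one.
    f1 = (a - (1 << k1)) / (1 << k1)
    f2 = (b - (1 << k2)) / (1 << k2)
    # Multiplication in the log domain is an addition.
    K = k1 + k2
    F = f1 + f2
    if F >= 1.0:          # carry from the fractional part into the exponent
        K += 1
        F -= 1.0
    return (1 << K) * (1.0 + F)

# Exact when either operand is a power of two; worst-case relative
# error of Mitchell's method is about 11.1% otherwise.
print(mitchell_mul(2, 4))   # exact: 8.0
print(mitchell_mul(3, 3))   # approximate (exact product is 9)
```

Because the approximation replaces a hardware multiplier with shifts, subtractions, and an adder, it is a common choice for low-overhead reconfigurable PEs of the kind the abstract describes.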
Keywords
Hardware accelerator, Deep neural networks, VLSI, FPGA, Reconfigurable architecture, Convolutional neural networks