A CNN/MLP Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs
Date
2021
Abstract
The applications of machine learning algorithms are innumerable and cover nearly every domain of modern technology. During the rapid growth of this area, more and more companies have expressed a desire to use machine learning techniques in smaller devices, such as cell phones or smart Internet of Things (IoT) instruments. However, machine learning has so far required a power source with greater capacity and higher efficiency than a conventional battery. Introducing neural network accelerators with low energy demands and low latency for executing machine learning techniques has therefore drawn considerable attention in both academia and industry. In this work, we first propose the design of the Temporal-Carry-deferring MAC (TCD-MAC) and illustrate how it can yield significant energy and performance benefits when used to process a stream of input data. We then propose using the TCD-MAC to build a reconfigurable, high-speed, and low-power Neural Processing Engine (TCD-NPE). Furthermore, we expand the idea of the TCD-MAC to present NESTA, a specialized neural engine that reformats convolutions into $3 \times 3$ batches and uses a hierarchy of Hamming Weight Compressors to process each batch.
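The general idea of deferring carry propagation across a stream of inputs can be illustrated with a textbook carry-save accumulator. The sketch below is not the TCD-MAC design itself; it is a minimal software model under stated assumptions: a hypothetical 32-bit accumulator width, unsigned operands, and an ordinary multiplication standing in for the partial-product generation that a hardware MAC would perform. The names `carry_save_add` and `deferred_carry_mac` are illustrative only.

```python
# Assumption: a 32-bit accumulator; the width is hypothetical and chosen for illustration.
MASK = (1 << 32) - 1


def carry_save_add(s, c, x):
    """Fold x into a (sum, carry) pair without propagating carries.

    This is a bitwise 3:2 compression: s + c + x == new_s + new_c (mod 2**32).
    """
    new_s = s ^ c ^ x                            # per-bit sum, carries ignored
    new_c = ((s & c) | (s & x) | (c & x)) << 1   # per-bit carries, shifted to their weight
    return new_s & MASK, new_c & MASK


def deferred_carry_mac(weights, activations):
    """Accumulate a dot product while keeping the carries unresolved.

    Carries are resolved by one carry-propagating addition after the
    last input of the stream, rather than once per input.
    """
    s, c = 0, 0
    for w, a in zip(weights, activations):
        # Assumption: the product is computed with a normal multiply here.
        s, c = carry_save_add(s, c, (w * a) & MASK)
    return (s + c) & MASK  # the only full carry propagation


result = deferred_carry_mac([1, 2, 3], [4, 5, 6])
print(result)
```

In this model the per-input work is limited to bitwise operations with no carry chain, which is the property that makes deferring the carry attractive when many inputs are accumulated into one output.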