A CNN/MLP Neural Processing Engine, Powered by Novel Temporal-Carry-deferring MACs




The applications of machine learning algorithms are innumerable and cover nearly every domain of modern technology. With the rapid growth of this area, more and more companies have expressed a desire to use machine learning techniques in smaller devices, such as cell phones or smart Internet of Things (IoT) instruments. However, machine learning has so far required a power source with greater capacity and higher efficiency than a conventional battery. Therefore, neural network accelerators that execute machine learning workloads with low energy demands and low latency have drawn considerable attention in both academia and industry. In this work, we first propose the design of the Temporal-Carry-deferring MAC (TCD-MAC) and illustrate how it achieves significant energy and performance benefits when used to process a stream of input data. We then propose using the TCD-MAC to build a reconfigurable, high-speed, and low-power Neural Processing Engine (TCD-NPE). Furthermore, we extend the idea of the TCD-MAC to present NESTA, a specialized neural engine that reformats convolutions into $3 \times 3$ batches and uses a hierarchy of Hamming Weight Compressors to process each batch.
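As a rough software illustration of what deferring carries over a stream of inputs can mean, the sketch below accumulates products in carry-save form: the running value is held as a separate sum word and carry word, and a carry-propagating addition happens only once, after the last input. This is an assumption-laden analogy, not the TCD-MAC circuit itself. The function names `carry_save_step` and `deferred_carry_mac` are hypothetical, operands are assumed to be non-negative integers, and the products are formed with Python's ordinary multiplication, so only the accumulation step is modeled.

```python
def carry_save_step(s, c, x):
    """Fold x into the (sum, carry) pair with a bitwise 3:2 compression.

    No carry ripples across bit positions here: each bit position is
    handled independently, and the generated carries are simply shifted
    left and stored for the next step.
    """
    new_s = s ^ c ^ x                            # per-bit sum
    new_c = ((s & c) | (s & x) | (c & x)) << 1   # per-bit carry (majority), weighted x2
    return new_s, new_c


def deferred_carry_mac(pairs):
    """Multiply-accumulate over a stream of (a, b) pairs of non-negative ints.

    Carry propagation is deferred until the whole stream is consumed.
    """
    s, c = 0, 0
    for a, b in pairs:
        s, c = carry_save_step(s, c, a * b)
    return s + c  # the single carry-propagating addition


print(deferred_carry_mac([(3, 4), (5, 6), (7, 8)]))  # 98
```

The invariant behind the sketch is that `s + c` always equals the true running total, so the one final addition recovers the same result as a conventional MAC that propagates carries on every cycle.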