Interpretable Deep Learning for Efficient Mobile Computing

Date

2020

Abstract

Driven by the evolution of artificial intelligence and deep learning, more and more intelligent applications have emerged on mobile devices. As one of the most representative deep learning technologies, deep neural networks (DNNs) have become a primary tool in computer vision. However, the heavy computation, memory, and energy demands of DNN models restrict their deployment on resource-constrained mobile devices. Therefore, many research works have been proposed to optimize the computational efficiency of DNNs. By quantitatively addressing parameter redundancy, these works have achieved significant success in both accuracy enhancement and computational acceleration. However, lacking a qualitative interpretation of model component functionality, most of them still require repeated parameter pre-analysis and model retraining until the desired trade-off between accuracy and computational efficiency is met. Compared with quantitative analysis, qualitative interpretation can help us identify model redundancy in terms of neuron functionality and better understand the calculation process of neural networks, enabling further computational efficiency optimization and adaptation to different mobile applications. In this dissertation, I focus on research solutions that enable efficient processing of DNNs by qualitatively interpreting their internal working mechanism (i.e., neuron functionality). I proposed a set of computation optimization approaches for DNN execution on mobile devices through better model interpretability. I first proposed a functionality-oriented convolutional filter pruning method that optimizes the DNN algorithm for fast inference: redundant convolutional filters can be precisely removed without compromising the model's functional integrity or accuracy.
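The abstract's pruning method identifies redundant filters by their functionality rather than by weight magnitudes. As a rough illustrative sketch only (the function name and the similarity criterion below are assumptions, not the dissertation's actual visualization-based analysis), one can treat two convolutional filters as functionally redundant when their flattened weights are near-duplicates, and keep a single representative of each group:

```python
import numpy as np

def prune_duplicate_filters(weights, threshold=0.95):
    """Return indices of filters to keep, dropping near-duplicates.

    weights: array of shape (num_filters, in_channels, kH, kW).
    A filter is treated as redundant if its flattened weights have
    absolute cosine similarity above `threshold` with a kept filter.
    (Illustrative proxy only; the dissertation uses a qualitative,
    functionality-oriented criterion, not raw weight similarity.)
    """
    flat = weights.reshape(weights.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)  # unit-normalize each filter
    keep = []
    for i in range(unit.shape[0]):
        if all(abs(unit[i] @ unit[j]) < threshold for j in keep):
            keep.append(i)
    return keep

rng = np.random.default_rng(0)
base = rng.standard_normal((4, 3, 3, 3))
# Append a near-copy of filter 0 to simulate functional redundancy.
filters = np.concatenate([base, base[:1] * 1.01], axis=0)
print(prune_duplicate_filters(filters))  # filter 4 duplicates filter 0
```

In a real pipeline the kept indices would be used to slice the convolution layer's weight tensor (and the next layer's input channels), producing a smaller model that is then fine-tuned.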
Furthermore, I proved that the proposed method maintains consistent filter functionality during retraining, demonstrating reduced retraining effort. To further adapt DNNs to diverse mobile applications, I proposed a class-adaptive DNN reconfiguration framework for mobile applications. This framework can reconfigure a pre-trained full-class DNN model into class-specific small models based on visualization analysis of convolutional filters' functionality exclusive to a single class. These lightweight models can be deployed directly to mobile devices without the retraining cost of traditional pruning-based reconfiguration. Finally, to enable DNN training on mobile systems, I proposed a collective edge learning system, which leverages certain DNN filters' task activation preference to decouple a target DNN model into independently trainable sub-models, each corresponding to a subset of learning tasks. With optimal computation and communication efficiency, the target DNN model parameters for all learning tasks can be harvested from well-trained sub-models with heterogeneous learning tasks and model structures across decentralized edge nodes. As a result, this dissertation provides a novel mobile DNN optimization approach that closely combines neural network interpretation with mobile system features for further performance gains.
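The class-adaptive reconfiguration above rests on the observation that some filters activate almost exclusively for particular classes. A minimal sketch of that selection step, assuming a hypothetical per-class activation profile collected on a validation set (the function and array names are illustrative, not the dissertation's actual interface):

```python
import numpy as np

def select_class_filters(activation, target_classes, keep_ratio=0.5):
    """Pick the filters most activated by the target classes.

    activation: (num_filters, num_classes) mean activation magnitudes,
    e.g. profiled on a validation set. Returns the indices of the top
    `keep_ratio` fraction of filters for the given classes.
    (Hypothetical profiling interface; a sketch, not the actual method.)
    """
    score = activation[:, target_classes].mean(axis=1)
    k = max(1, int(len(score) * keep_ratio))
    return sorted(np.argsort(score)[::-1][:k].tolist())

# Toy profile: 6 filters, 3 classes; filters 0-2 prefer class 0.
act = np.array([[9., 1., 1.],
                [8., 2., 1.],
                [7., 1., 2.],
                [1., 9., 1.],
                [1., 1., 9.],
                [2., 8., 8.]])
print(select_class_filters(act, target_classes=[0]))  # → [0, 1, 2]
```

Slicing a pre-trained model along these indices yields a class-specific sub-model immediately, which is what lets the framework skip the retraining step of pruning-based reconfiguration.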
