Publication

A Novel Pruning Method for Convolutional Neural Networks Based off Identifying Critical Filters...

by Mihaela Dimovska, Jeremy T Johnston
Publication Type
Conference Paper
Journal Name
Proceedings of the Practice and Experience in Advanced Research Computing
Book Title
Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines
Publication Date
Page Number
63
Conference Name
Practice and Experience in Advanced Research Computing (PEARC 2019)
Conference Location
Chicago, Illinois, United States of America
Conference Sponsor
PEARC
Conference Date
-

Convolutional Neural Networks (CNNs) are among the most extensively used tools in machine learning, but they are still not well understood, and in many cases they are over-parameterized, which slows inference and impedes their deployment on low-power devices. In the last few years, many methods have been suggested for decreasing the number of parameters in a network by pruning its output channels, but a very recent work has argued that random pruning of channels performs on par with state-of-the-art pruning methods. While random and other pruning methods can be used effectively to lower the number of parameters in a CNN, none of them can be used to gain any further understanding of the model that the CNN has built. In this work, we propose a novel method for pruning a network that can, at the same time, lead to a better understanding of what the individual filters of the network learn about the data. The proposed method aims to keep only the filters that are "important" for a class. We define a filter as important for a class if its removal has the highest negative impact on the accuracy for that class. We demonstrate that our method outperforms random pruning on two networks trained on the EMNIST and CIFAR10 datasets. By analyzing these filters, we find that the important filters retained in the pruned networks learn features that are more general across classes. We demonstrate the importance and applicability of that observation in two transfer-learning tasks.
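The importance criterion in the abstract (a filter matters for a class if removing it hurts that class's accuracy the most) can be illustrated with a small ablation sketch. This is not the authors' implementation: it assumes, purely for illustration, that each filter is summarized by one pooled activation per sample and that classification is a linear head on those activations. The names `per_class_accuracy`, `important_filters`, `features`, `head`, and the parameter `k` are all hypothetical.

```python
import numpy as np


def per_class_accuracy(scores, labels, num_classes):
    """Return the accuracy of argmax predictions, computed separately per class."""
    preds = scores.argmax(axis=1)
    acc = np.zeros(num_classes)
    for c in range(num_classes):
        mask = labels == c
        acc[c] = (preds[mask] == c).mean() if mask.any() else 0.0
    return acc


def important_filters(features, head, labels, k=1):
    """Rank filters by the per-class accuracy drop caused by ablating them.

    features: (n_samples, n_filters) pooled filter activations (assumed setup)
    head:     (n_filters, n_classes) linear classifier weights (assumed setup)
    Returns the (n_filters, n_classes) drop matrix and the sorted indices of
    the filters that are among the top-k most damaging for at least one class.
    """
    num_filters, num_classes = head.shape
    base = per_class_accuracy(features @ head, labels, num_classes)
    drop = np.zeros((num_filters, num_classes))
    for f in range(num_filters):
        ablated = features.copy()
        ablated[:, f] = 0.0  # "remove" filter f by zeroing its activation
        drop[f] = base - per_class_accuracy(ablated @ head, labels, num_classes)
    top = np.argsort(-drop, axis=0)[:k]  # k most damaging filters per class
    keep = np.unique(top)                # union over classes, sorted
    return drop, keep


# Toy data: filter 0 responds most to class 0, filter 1 to class 1,
# and filter 2 is ignored by the classifier head.
features = np.array([[2.0, 1.0, 0.5]] * 4 + [[1.0, 2.0, 0.5]] * 4)
labels = np.array([0] * 4 + [1] * 4)
head = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])

drop, keep = important_filters(features, head, labels, k=1)
print(keep)  # [0 1]
```

In this toy setting, filter 2 never changes any class accuracy when ablated, so it is the one that would be pruned. In a real CNN the ablation would be applied to a convolutional layer's output channels and the accuracy measured on held-out data, which this sketch does not attempt.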