---
title: Sparsifier
keywords: fastai
sidebar: home_sidebar
summary: "Make your neural network sparse"
description: "Make your neural network sparse"
nb_path: "nbs/01a_sparsifier.ipynb"
---
A sparse vector, as opposed to a dense one, is a vector that contains a lot of zeroes. When we speak about making a neural network sparse, we thus mean that the network's weights are mostly zeroes.
With fasterai, you can do that thanks to the Sparsifier class.
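To make the notion of sparsity concrete, here is a minimal sketch in plain Python that measures the percentage of zeroes in a vector. The `sparsity` helper is a hypothetical name used for illustration only; it is not part of fasterai.

```python
def sparsity(values):
    """Return the percentage of entries in `values` that are exactly zero."""
    if not values:
        raise ValueError("cannot measure the sparsity of an empty vector")
    zeros = sum(1 for v in values if v == 0)
    return 100.0 * zeros / len(values)


dense = [0.3, -1.2, 0.7, 2.1, -0.4]
sparse = [0.0, -1.2, 0.0, 0.0, 2.1]

print(sparsity(dense))   # 0.0
print(sparsity(sparse))  # 60.0
```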
Let's start by creating a model:
model = resnet18()
As you probably know, weights in a convolutional neural network have 4 dimensions ($c_{out} \times c_{in} \times k_h \times k_w$):
model.conv1.weight.ndim
In the case of ResNet18, the dimension of the first layer's weights is $64 \times 3 \times 7 \times 7$. We can thus plot each of the $64$ filters as a $7 \times 7$ color image (because they contain $3$ channels).
plot_kernels(model.conv1)
The Sparsifier class allows us to remove some (parts of) the filters that are considered to be less useful than others. This can be done by first creating an instance of the class, specifying:
- granularity, i.e. the part of the filters that you want to remove. Typically, we remove weights, vectors, kernels or even complete filters.
- method, i.e. whether you want to consider each layer independently (local), or compare the parameters to remove across the whole network (global).
- criteria, i.e. the way to assess the usefulness of a parameter. Common methods compare parameters using their magnitude, the lowest-magnitude ones being considered less useful.

You can pass a single layer to prune by using the Sparsifier.prune_layer method.
model = resnet18()
pruner = Sparsifier(model, 'filter', 'local', large_final)
pruner.prune_layer(model.conv1, 70)
pruner.print_sparsity()
Most of the time, we may want to prune the whole model at once, using the Sparsifier.prune_model method and indicating the percentage of sparsity that you want to apply.
There are several ways in which we can make that first layer sparse. You will find the most important below:
model = resnet18()
pruner = Sparsifier(model, 'weight', 'local', large_final)
pruner.prune_model(70)
pruner.print_sparsity()
You now have a model that is $70\%$ sparse!
As we said earlier, the granularity defines the structure of the parameters that you will remove.
In the example above, we removed individual weights from each convolutional filter, meaning that we now have sparse filters, as can be seen in the image below:
plot_kernels(model.conv1)
Another granularity is, for example, removing column vectors from the filters. To do so, just change the granularity parameter accordingly.
model = resnet18()
pruner = Sparsifier(model, 'column', 'local', large_final)
pruner.prune_layer(model.conv1, 70)
plot_kernels(model.conv1)
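The difference between the two granularities can be sketched on a single $3 \times 3$ kernel in plain Python. This is only an illustration under stated assumptions: the helpers `prune_weights` and `prune_columns` are hypothetical, columns are scored by their L1 norm, and none of this is fasterai's actual implementation.

```python
def prune_weights(kernel, n):
    """Zero the `n` individual entries of a 2D kernel with the smallest magnitude."""
    flat = [(abs(v), r, c) for r, row in enumerate(kernel) for c, v in enumerate(row)]
    drop = {(r, c) for _, r, c in sorted(flat)[:n]}
    return [[0.0 if (r, c) in drop else v for c, v in enumerate(row)]
            for r, row in enumerate(kernel)]


def prune_columns(kernel, n):
    """Zero the `n` whole columns of a 2D kernel with the smallest L1 norm."""
    n_cols = len(kernel[0])
    norms = [(sum(abs(row[c]) for row in kernel), c) for c in range(n_cols)]
    drop = {c for _, c in sorted(norms)[:n]}
    return [[0.0 if c in drop else v for c, v in enumerate(row)] for row in kernel]


kernel = [
    [0.9, 0.1, -0.8],
    [-0.2, 0.05, 0.7],
    [0.6, -0.3, 0.4],
]

# 'weight'-like granularity: zeroes end up scattered across the kernel
print(prune_weights(kernel, 3))
# [[0.9, 0.0, -0.8], [0.0, 0.0, 0.7], [0.6, -0.3, 0.4]]

# 'column'-like granularity: a complete column is removed at once
print(prune_columns(kernel, 1))
# [[0.9, 0.0, -0.8], [-0.2, 0.0, 0.7], [0.6, 0.0, 0.4]]
```

Both calls zero three entries, but only the column variant produces a regular structure.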
For more information and examples about the pruning granularities, I suggest you take a look at the corresponding section.
The method defines where to look in the model, i.e. the scope over which weights are compared. The two basic methods are:
- local, i.e. we compare weights within each layer individually. This will lead to layers with similar levels of sparsity.
- global, i.e. we compare weights across the whole model. This will lead to layers with different levels of sparsity.

model = resnet18()
pruner = Sparsifier(model, 'weight', 'local', large_final)
pruner.prune_model(70)
pruner.print_sparsity()
model = resnet18()
pruner = Sparsifier(model, 'weight', 'global', large_final)
pruner.prune_model(70)
pruner.print_sparsity()
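Why the two methods give different per-layer sparsities can be seen on a toy example. The sketch below uses plain Python lists as stand-ins for layers and ranks weights by magnitude; the functions `prune_local` and `prune_global` and the layer names are hypothetical and do not reflect how fasterai is implemented.

```python
def prune_local(layers, pct):
    """Zero `pct` percent of the weights of each layer, ranked within that layer."""
    pruned = {}
    for name, weights in layers.items():
        n = int(len(weights) * pct / 100)
        order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
        drop = set(order[:n])
        pruned[name] = [0.0 if i in drop else w for i, w in enumerate(weights)]
    return pruned


def prune_global(layers, pct):
    """Zero `pct` percent of all weights, ranked across every layer at once."""
    flat = [(abs(w), name, i) for name, ws in layers.items() for i, w in enumerate(ws)]
    n = int(len(flat) * pct / 100)
    drop = {(name, i) for _, name, i in sorted(flat)[:n]}
    return {name: [0.0 if (name, i) in drop else w for i, w in enumerate(ws)]
            for name, ws in layers.items()}


def sparsity(weights):
    """Percentage of exactly-zero entries."""
    return 100.0 * sum(1 for w in weights if w == 0) / len(weights)


layers = {
    "conv_a": [0.9, -0.8, 0.7, 0.6],       # mostly large weights
    "conv_b": [0.05, -0.04, 0.03, 0.95],   # mostly small weights
}

local = prune_local(layers, 50)
global_ = prune_global(layers, 50)

print({name: sparsity(ws) for name, ws in local.items()})
# {'conv_a': 50.0, 'conv_b': 50.0}
print({name: sparsity(ws) for name, ws in global_.items()})
# {'conv_a': 25.0, 'conv_b': 75.0}
```

The overall sparsity is 50% in both cases, but the global ranking concentrates the pruning in the layer whose weights are smallest.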
The criteria defines how we select the parameters to remove. It is usually given by a scoring method. The most common one is large_final, i.e. keep the parameters with the highest absolute value, as they are supposed to contribute the most to the final results of the model.
model = resnet18()
pruner = Sparsifier(model, 'filter', 'global', large_final)
pruner.prune_model(70)
pruner.print_sparsity()
model = resnet18()
pruner = Sparsifier(model, 'filter', 'global', small_final)
pruner.prune_model(70)
pruner.print_sparsity()
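A criteria can be thought of as a scoring function, where the lowest-scoring parameters are removed first. The sketch below contrasts two such scores in plain Python. It assumes that a "large" criteria scores a weight by its absolute value and a "small" criteria by the negated absolute value; the names `large_score`, `small_score` and `prune_by_score` are hypothetical and are not fasterai's API.

```python
def large_score(w):
    """Higher score for larger magnitude: large weights are kept."""
    return abs(w)


def small_score(w):
    """Higher score for smaller magnitude: small weights are kept."""
    return -abs(w)


def prune_by_score(weights, score, pct):
    """Zero the `pct` percent of weights with the lowest score."""
    n = int(len(weights) * pct / 100)
    order = sorted(range(len(weights)), key=lambda i: score(weights[i]))
    drop = set(order[:n])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


weights = [0.9, -0.1, 0.05, -1.5]

print(prune_by_score(weights, large_score, 50))  # [0.9, 0.0, 0.0, -1.5]
print(prune_by_score(weights, small_score, 50))  # [0.0, -0.1, 0.05, 0.0]
```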
For more information and examples about the pruning criteria, I suggest you take a look at the corresponding section.
In some cases, you may want to constrain the number of remaining parameters to be a multiple of 8. This can be done by passing the round_to parameter.
model = resnet18()
pruner = Sparsifier(model, 'filter', 'local', large_final)
pruner.prune_model(70, round_to=8)
pruner.print_sparsity()
model = resnet18()
pruner = Sparsifier(model, 'filter', 'global', large_final)
pruner.prune_model(70, round_to=8)
pruner.print_sparsity()
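The effect of such rounding on the number of filters kept in a layer can be sketched as follows. This is an assumption-laden illustration: the hypothetical `filters_to_keep` helper rounds the kept count up to the next multiple, which may differ from the rounding rule fasterai actually applies.

```python
import math


def filters_to_keep(n_filters, sparsity_pct, round_to=None):
    """Number of filters left after pruning, optionally rounded up to a multiple.

    Assumes an integer `sparsity_pct`; rounding up is an assumption of this sketch.
    """
    keep = n_filters * (100 - sparsity_pct) // 100
    if round_to:
        keep = math.ceil(keep / round_to) * round_to
        # Keep at least one group of filters, and never more than the layer has
        keep = min(max(keep, round_to), n_filters)
    return keep


print(filters_to_keep(64, 70))               # 19
print(filters_to_keep(64, 70, round_to=8))   # 24
print(filters_to_keep(128, 70, round_to=8))  # 40
```

Under this assumption, a layer with 64 filters keeps 24 of them instead of 19, so its effective sparsity is 62.5% rather than the requested 70%.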
For more information about granularities at which you can operate, please check the related page.