---
title: Knowledge Distillation
keywords: fastai
sidebar: home_sidebar
summary: "How to apply knowledge distillation with fasterai"
description: "How to apply knowledge distillation with fasterai"
nb_path: "nbs/04b_tutorial.knowledge_distillation.ipynb"
---

We'll illustrate how to use Knowledge Distillation to transfer the knowledge of a Resnet34 (the teacher) to a Resnet18 (the student).

Let's grab some data.

{% raw %}
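from fastai.vision.all import *

# Download and extract the Oxford-IIIT Pet dataset
path = untar_data(URLs.PETS)
files = get_image_files(path/"images")

# Cat breeds have capitalized file names, dog breeds are lowercase
def label_func(f): return f[0].isupper()

# Small 64x64 images keep training fast
dls = ImageDataLoaders.from_name_func(path, files, label_func, item_tfms=Resize(64))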
{% endraw %}
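To make the labelling rule concrete, here is a small standalone sketch that applies `label_func` to a few file names. The names are illustrative examples following the dataset's naming pattern (capitalized for cat breeds, lowercase for dog breeds), not a listing of the actual files.

```python
def label_func(f): return f[0].isupper()

# Hypothetical file names following the dataset's naming pattern
examples = ["Abyssinian_1.jpg", "beagle_32.jpg", "Maine_Coon_7.jpg", "pug_104.jpg"]

# True means "cat", False means "dog"
labels = [(name, label_func(name)) for name in examples]
for name, is_cat in labels:
    print(name, "-> cat" if is_cat else "-> dog")
```

This turns the task into a binary cat-versus-dog classification, which is why the student models in this tutorial are built with `num_classes=2`.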

The first step is to train the teacher model. We start from a pretrained model so that the teacher reaches good accuracy on our dataset.

{% raw %}
teacher = cnn_learner(dls, resnet34, metrics=accuracy)
teacher.unfreeze()
teacher.fit_one_cycle(5)
epoch train_loss valid_loss accuracy time
0 0.657599 0.669229 0.807172 00:08
1 0.395813 0.490642 0.873478 00:09
2 0.278736 0.314575 0.884980 00:09
3 0.161812 0.184702 0.934371 00:09
4 0.088248 0.182757 0.939784 00:09
{% endraw %}

Without KD

We'll now train a Resnet18 from scratch, without any help from the teacher model, to serve as a baseline.

{% raw %}
student = Learner(dls, resnet18(num_classes=2), metrics=accuracy)
student.fit_one_cycle(5)
epoch train_loss valid_loss accuracy time
0 0.602132 0.855898 0.655616 00:08
1 0.553751 0.618787 0.698241 00:08
2 0.505094 0.563866 0.740866 00:09
3 0.437325 0.454875 0.784844 00:08
4 0.355464 0.428473 0.805142 00:08
{% endraw %}

With KD

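Now we train the same model, but with the help of the teacher. The `KnowledgeDistillation` callback takes the trained teacher and a distillation loss, here `SoftTarget` with `T=30`.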

{% raw %}
loss = partial(SoftTarget, T=30)
{% endraw %} {% raw %}
student = Learner(dls, resnet18(num_classes=2), metrics=accuracy)
kd = KnowledgeDistillation(teacher, loss)
student.fit_one_cycle(5, cbs=kd)
epoch train_loss valid_loss accuracy time
0 0.612727 1.050914 0.654939 00:09
1 0.556448 0.520859 0.730717 00:09
2 0.503152 0.468746 0.770636 00:09
3 0.429105 0.406935 0.811231 00:07
4 0.342641 0.401033 0.819350 00:08
{% endraw %}
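For intuition about what a soft-target loss with `T=30` does, here is a minimal pure-Python sketch of the textbook formulation: both sets of logits are divided by `T` before the softmax, and the student is penalized by the KL divergence from the teacher's softened distribution, scaled by `T**2`. This is an assumption about the general technique, not a copy of fasterai's `SoftTarget` implementation, and the logits below are made-up values.

```python
import math

def softmax_with_temperature(logits, T=1.0):
    """Softmax of logits divided by temperature T (larger T gives a flatter distribution)."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(student_logits, teacher_logits, T=30.0):
    """KL(teacher || student) on temperature-softened distributions, scaled by T**2.

    Assumption: textbook soft-target formulation, not fasterai's actual code.
    """
    p_teacher = softmax_with_temperature(teacher_logits, T)
    p_student = softmax_with_temperature(student_logits, T)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student))
    return kl * T * T

# Hypothetical logits for one image over the two classes
teacher_logits = [4.0, -2.0]
student_logits = [1.0, 0.5]

print(softmax_with_temperature(teacher_logits, T=1))   # sharp, close to one-hot
print(softmax_with_temperature(teacher_logits, T=30))  # much closer to uniform
print(soft_target_loss(student_logits, teacher_logits, T=30))
```

A high temperature flattens the teacher's output, so the student is trained on the relative confidence between classes rather than on a near one-hot prediction.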

With the teacher's help, the student performs better: it reaches 81.9% validation accuracy after 5 epochs, compared with 80.5% when trained alone.
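To put the two runs side by side, the final-epoch accuracies from the training logs above can be compared with a few lines of arithmetic. The variable names are illustrative, and the values are copied from the logs.

```python
# Final-epoch validation accuracies copied from the two training logs above
baseline_acc = 0.805142   # Resnet18 trained without the teacher
distilled_acc = 0.819350  # Resnet18 trained with knowledge distillation

abs_gain = distilled_acc - baseline_acc
# Relative reduction of the error rate (1 - accuracy)
rel_error_reduction = abs_gain / (1 - baseline_acc)

print(f"absolute gain: {abs_gain:.4f}")
print(f"relative error reduction: {rel_error_reduction:.1%}")
```

Each figure comes from a single training run, so the size of the gap may change with a different random seed.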