---
title: examples.multilabel_classification
keywords: fastai
sidebar: home_sidebar
summary: "This is an example of how to use blurr for multilabel classification tasks"
description: "This is an example of how to use blurr for multilabel classification tasks"
nb_path: "nbs/99a_examples-multilabel.ipynb"
---
{% raw %}
{% endraw %} {% raw %}
 
{% endraw %} {% raw %}
{% endraw %} {% raw %}
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}: {torch.cuda.get_device_name()}')
Using GPU #1: GeForce GTX 1080 Ti
{% endraw %}

Let's start by building our DataBlock

{% raw %}
raw_data = datasets.load_dataset('civil_comments', split='train[:1%]') 
len(raw_data)
Using custom data configuration default
Reusing dataset civil_comments (/home/wgilliam/.cache/huggingface/datasets/civil_comments/default/0.9.0/98bdc73fc77a117cf5d17c9977e278c8023c64177a3ed9e0c49f7a5bdf10a47b)
18049
{% endraw %} {% raw %}
toxic_df = pd.DataFrame(raw_data, columns=list(raw_data.features.keys()))
toxic_df.head()
| | text | toxicity | severe_toxicity | obscene | threat | insult | identity_attack | sexual_explicit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done! | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.00000 | 0.000000 | 0.0 |
| 1 | Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way! | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.00000 | 0.000000 | 0.0 |
| 2 | This is such an urgent design problem; kudos to you for taking it on. Very impressive! | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.00000 | 0.000000 | 0.0 |
| 3 | Is this something I'll be able to install on my site? When will you be releasing it? | 0.000000 | 0.000000 | 0.0 | 0.0 | 0.00000 | 0.000000 | 0.0 |
| 4 | haha you guys are a bunch of losers. | 0.893617 | 0.021277 | 0.0 | 0.0 | 0.87234 | 0.021277 | 0.0 |
{% endraw %} {% raw %}
lbl_cols = list(toxic_df.columns[2:]); lbl_cols
['severe_toxicity',
 'obscene',
 'threat',
 'insult',
 'identity_attack',
 'sexual_explicit']
{% endraw %} {% raw %}
toxic_df = toxic_df.round({col: 0 for col in lbl_cols})
toxic_df = toxic_df.convert_dtypes()

toxic_df.head()
| | text | toxicity | severe_toxicity | obscene | threat | insult | identity_attack | sexual_explicit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done! | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way! | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | This is such an urgent design problem; kudos to you for taking it on. Very impressive! | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | Is this something I'll be able to install on my site? When will you be releasing it? | 0.000000 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | haha you guys are a bunch of losers. | 0.893617 | 0 | 0 | 0 | 1 | 0 | 0 |
{% endraw %}
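
Before moving on, it can be worth checking how imbalanced the labels are after rounding. This quick check isn't part of the original notebook; it's just a pandas sketch over the lbl_cols defined above.

{% raw %}
# hypothetical sanity check: proportion of positive (1) examples per label after rounding
toxic_df[lbl_cols].mean().sort_values(ascending=False)
{% endraw %}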

For our huggingface model, let's use the distilled version of RoBERTa. This should allow us to train on bigger mini-batches without much loss in performance. Even on my 1080Ti, I should be able to train all the parameters (which isn't possible with the roberta-base model).

{% raw %}
task = HF_TASKS_ALL.SequenceClassification

pretrained_model_name = "distilroberta-base"
config = AutoConfig.from_pretrained(pretrained_model_name)
config.num_labels = len(lbl_cols)

hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(pretrained_model_name, 
                                                                               task=task, 
                                                                               config=config)

print(hf_arch)
print(type(hf_config))
print(type(hf_tokenizer))
print(type(hf_model))
roberta
<class 'transformers.models.roberta.configuration_roberta.RobertaConfig'>
<class 'transformers.models.roberta.tokenization_roberta_fast.RobertaTokenizerFast'>
<class 'transformers.models.roberta.modeling_roberta.RobertaForSequenceClassification'>
{% endraw %}

Note how we have to set num_labels in the config to the number of labels we are predicting. Given that our labels are already encoded, we use a MultiCategoryBlock with encoded=True and vocab equal to the columns holding our 1's and 0's.

{% raw %}
blocks = (
    HF_TextBlock(hf_arch, hf_config, hf_tokenizer, hf_model), 
    MultiCategoryBlock(encoded=True, vocab=lbl_cols)
)

dblock = DataBlock(blocks=blocks, 
                   get_x=ColReader('text'), get_y=ColReader(lbl_cols), 
                   splitter=RandomSplitter())
{% endraw %} {% raw %}
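# Optional debugging step (not in the original notebook): DataBlock.summary shows how one
# row flows through the HF_TextBlock and MultiCategoryBlock transforms, which helps when
# tokenization or label encoding is misconfigured.
# dblock.summary(toxic_df)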
dls = dblock.dataloaders(toxic_df, bs=16)
{% endraw %} {% raw %}
b = dls.one_batch()
len(b), b[0]['input_ids'].shape, b[1].shape
(2, torch.Size([16, 391]), torch.Size([16, 6]))
{% endraw %}
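
If you want to see exactly what the model will receive, you can decode the first item of the batch back into text with the tokenizer. This is an optional inspection step, not part of the original notebook.

{% raw %}
# decode the first sequence in the batch (special/padding tokens included) and show its targets
print(hf_tokenizer.decode(b[0]['input_ids'][0]))
print(b[1][0])
{% endraw %}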

With our DataLoaders built, we can now create our Learner and train. We'll use mixed precision so we can train with bigger batches.

{% raw %}
model = HF_BaseModelWrapper(hf_model)

learn = Learner(dls, 
                model,
                opt_func=partial(Adam),
                loss_func=BCEWithLogitsLossFlat(),
                metrics=[partial(accuracy_multi, thresh=0.2)],
                cbs=[HF_BaseModelCallback],
                splitter=hf_splitter).to_fp16()

learn.loss_func.thresh = 0.2
learn.create_opt()             # -> will create your layer groups based on your "splitter" function
learn.freeze()
{% endraw %} {% raw %}
learn.blurr_summary()
HF_BaseModelWrapper (Input shape: 16 x 391)
============================================================================
Layer (type)         Output Shape         Param #    Trainable 
============================================================================
                     16 x 391 x 768      
Embedding                                 38603520   False     
Embedding                                 394752     False     
Embedding                                 768        False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     False     
Linear                                    590592     False     
Linear                                    590592     False     
Dropout                                                        
Linear                                    590592     False     
LayerNorm                                 1536       True      
Dropout                                                        
____________________________________________________________________________
                     16 x 391 x 3072     
Linear                                    2362368    False     
____________________________________________________________________________
                     16 x 391 x 768      
Linear                                    2360064    False     
LayerNorm                                 1536       True      
Dropout                                                        
Linear                                    590592     True      
Dropout                                                        
____________________________________________________________________________
                     16 x 6              
Linear                                    4614       True      
____________________________________________________________________________

Total params: 82,123,014
Total trainable params: 615,174
Total non-trainable params: 81,507,840

Optimizer used: functools.partial(<function Adam at 0x7f9e73c08e60>)
Loss function: FlattenedLoss of BCEWithLogitsLoss()

Model frozen up to parameter group #2

Callbacks:
  - HF_BaseModelCallback
  - ModelToHalf
  - TrainEvalCallback
  - Recorder
  - ProgressCallback
  - MixedPrecision
{% endraw %} {% raw %}
preds = model(b[0])
preds.logits.shape, preds
(torch.Size([16, 6]),
 SequenceClassifierOutput(loss=None, logits=tensor([[0.2640, 0.1656, 0.2258, 0.0466, 0.0508, 0.0676],
         [0.2646, 0.1665, 0.2266, 0.0406, 0.0518, 0.0704],
         [0.2738, 0.1669, 0.2263, 0.0369, 0.0474, 0.0688],
         [0.2806, 0.1605, 0.2370, 0.0597, 0.0587, 0.0639],
         [0.2766, 0.1522, 0.2283, 0.0651, 0.0754, 0.0732],
         [0.2703, 0.1818, 0.2298, 0.0597, 0.0557, 0.0676],
         [0.2722, 0.1635, 0.2339, 0.0422, 0.0743, 0.0506],
         [0.2707, 0.1681, 0.2197, 0.0515, 0.0735, 0.0570],
         [0.2597, 0.1700, 0.2358, 0.0600, 0.0650, 0.0586],
         [0.2655, 0.1729, 0.2356, 0.0567, 0.0823, 0.0645],
         [0.2839, 0.1757, 0.2334, 0.0695, 0.0663, 0.0675],
         [0.2721, 0.1642, 0.2353, 0.0499, 0.0793, 0.0593],
         [0.2750, 0.1693, 0.2341, 0.0582, 0.0523, 0.0861],
         [0.2730, 0.1797, 0.2180, 0.0473, 0.0696, 0.0625],
         [0.2731, 0.1763, 0.2329, 0.0507, 0.0682, 0.0601],
         [0.2701, 0.1777, 0.2358, 0.0455, 0.0687, 0.0497]], device='cuda:1',
        grad_fn=<AddmmBackward>), hidden_states=None, attentions=None))
{% endraw %} {% raw %}
learn.lr_find(suggestions=True)
/home/wgilliam/anaconda3/envs/blurr/lib/python3.7/site-packages/fastai/learner.py:53: UserWarning: Could not load the optimizer state.
  if with_opt: warn("Could not load the optimizer state.")
SuggestedLRs(lr_min=0.012022644281387329, lr_steep=0.0010000000474974513)
{% endraw %} {% raw %}
learn.fit_one_cycle(1, lr_max=1e-2)
| epoch | train_loss | valid_loss | accuracy_multi | time |
| --- | --- | --- | --- | --- |
| 0 | 0.034677 | 0.034504 | 0.993211 | 01:10 |
{% endraw %} {% raw %}
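# With the classification head trained, unfreeze the whole model and search a much smaller
# learning rate range for the pretrained RoBERTa body before continuing training.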
learn.unfreeze()
learn.lr_find(suggestions=True, start_lr=1e-12, end_lr=1e-5)
SuggestedLRs(lr_min=5.011872427490572e-13, lr_steep=6.9183096751412876e-12)
{% endraw %} {% raw %}
learn.fit_one_cycle(2, lr_max=slice(1e-10, 4e-9))
| epoch | train_loss | valid_loss | accuracy_multi | time |
| --- | --- | --- | --- | --- |
| 0 | 0.028343 | 0.034504 | 0.993211 | 01:56 |
| 1 | 0.040310 | 0.034504 | 0.993211 | 01:55 |
{% endraw %} {% raw %}
learn.show_results(learner=learn, max_n=2)
text None target
0 As usual WW plumbing the depths for deeper meaning... that is unless it involves an issue on which they disagree then it is ridicule 24/7. Clever creating the Bundyland series complete with cartoon banner. Set the tone for the level of journalism to expect... journalism? ... fatastisticism. \n\nI did notice you soft pedaling the ridicule of David Fry identifying him as troubled. My guess is that has more to do with sympathy for his pot smoking withdrawl rants than respect for his politics. Respect is never a factor with liberals as evidenced by your series of vapid caricatures. \n\nDid you happen to see the stories actual journalists did on Refuge mis-managment, fires, floods, and the millions of Carp that are harassing the birds away from the Bird Refuge? The stories of arbitrary miss-management that are driving unemployment ever higher in eastern Oregon. Curry County Sheriff turning in his badge in frustration for lack of resources dud to dwindling tax base engineered by arbitrary over reaching Federal Government policies. Or how about the thousands of miles of roads proposed to be removed from Oregon wild lands, cutting off public and fire access? What is the one issue the people of Oregon demand universally? Access to the wild lands? You are not reporting that it is being taken away. More closed roads. More illegal Federal Police, guns drawn, stops... as reported by the Sheriff of Grant County. \n\nI suspect the real problem was real people you couldn't care less about were articulate, were actual victims, and made rational arguments you could not respond to. The exact opposite of the great unbathed OWS movement, apart from the paid organizers, that held downtown Portland hostage and trashed 3 park blocks for 3 weeks. \n\nShame on you. You missed some really great stories and even greater people... people who don't drink $5 cups of coffee. []
1 I think what we need to do is enslave white people, let's say for purposes of their non-white owners' taxation and political representation that white persons count as 3/5ths of a person. Let's keep that institution alive for about 245 years, then whites can fight to end it, and then non-whites can institute laws that segregate white people from the citizens--anybody who has at least one drop of brown or black "blood"--and whites can work for as little as possible for around another hundred years. Then the white people can undertake a civil rights movement so they can fight for "equal" treatment. \nIf white people have the audacity to speak up about what happened to their parents, grandparent and ancestors, everyone scoffs and tells them to stop asking for special treatment because the only non-white people entitled to special treatment in this alternate universe. White people live fearfully as law enforcement uses racial profiling and prison awaits a great many of them due to poverty. []
{% endraw %} {% raw %}
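# Lower the decision threshold so that lower-probability labels get decoded as positive
# by show_results/blurr_predict below.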
learn.loss_func.thresh = 0.02
{% endraw %} {% raw %}
comment = """
Those damned affluent white people should only eat their own food, like cod cakes and boiled potatoes. 
No enchiladas for them!
"""
learn.blurr_predict(comment)
[(((#1) ['insult'],),
  (#1) [tensor([False, False, False,  True, False, False])],
  (#1) [tensor([9.7419e-06, 1.0209e-02, 1.9717e-04, 2.8167e-02, 2.7792e-03, 4.2888e-04])])]
{% endraw %}
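
To evaluate beyond a single comment, you can grab predictions for the whole validation set and look at per-label accuracy at the current threshold. This is a rough sketch, not part of the original notebook; it assumes only the fastai get_preds API and the thresh set above.

{% raw %}
probs, targs = learn.get_preds()                       # sigmoid probabilities and one-hot targets
preds = (probs > learn.loss_func.thresh).float()

# per-label accuracy at the current threshold (hypothetical follow-up analysis)
for lbl, acc in zip(lbl_cols, (preds == targs).float().mean(dim=0)):
    print(f'{lbl}: {acc:.4f}')
{% endraw %}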

Cleanup