--- title: modeling.core keywords: fastai sidebar: home_sidebar summary: "This module contains core custom models, loss functions, and a default layer group splitter for use in applying discriminative learning rates to your huggingface models trained via fastai" description: "This module contains core custom models, loss functions, and a default layer group splitter for use in applying discriminative learning rates to your huggingface models trained via fastai" nb_path: "nbs/02_modeling-core.ipynb" ---
{% raw %}
{% endraw %} {% raw %}
{% endraw %} {% raw %}
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}: {torch.cuda.get_device_name()}')
Using GPU #1: GeForce GTX 1080 Ti
{% endraw %}

Base splitter, model wrapper, and model callback

{% raw %}
{% endraw %} {% raw %}

hf_splitter[source]

hf_splitter(m)

Splits the huggingface model based on various model architecture conventions
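A fastai splitter is simply a function that returns the model's parameters organized into layer groups. As a rough illustration of the idea (an assumption for demonstration only, not blurr's exact grouping), such a splitter might look something like this:

```python
# Illustrative sketch only (assumed attribute names, not blurr's exact grouping):
# split a wrapped huggingface model into embeddings, transformer body, and
# everything else so discriminative learning rates can be applied per group.
from fastai.torch_core import params

def example_hf_splitter(m):
    model = m.hf_model if hasattr(m, 'hf_model') else m
    embed_params = params(model.base_model.embeddings)  # assumes the arch exposes .embeddings
    body_params = params(model.base_model.encoder)      # assumes the arch exposes .encoder
    grouped_ids = {id(p) for p in embed_params + body_params}
    head_params = [p for p in m.parameters() if id(p) not in grouped_ids]
    return [embed_params, body_params, head_params]
```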

{% endraw %} {% raw %}
{% endraw %} {% raw %}

class HF_BaseModelWrapper[source]

HF_BaseModelWrapper(hf_model) :: Module

Same as nn.Module, but no need for subclasses to call super().__init__

{% endraw %}

Note that HF_BaseModelWrapper includes some nifty code for passing in only the inputs your model needs, as not all transformer architectures require/use the same information.
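Conceptually, the wrapper can inspect the wrapped model's forward() signature and pass along only the matching keys from the input dictionary. A minimal sketch of that idea (an assumption for illustration, not the library's exact code):

```python
# Minimal sketch (assumption, not blurr's actual implementation): forward only
# the inputs that the wrapped huggingface model's forward() signature accepts.
import inspect
from torch import nn

class ExampleHFWrapper(nn.Module):
    def __init__(self, hf_model):
        super().__init__()
        self.hf_model = hf_model
        self.valid_args = set(inspect.signature(hf_model.forward).parameters.keys())

    def forward(self, x):
        # x is the dictionary built by the dataloaders (input_ids, attention_mask, etc.)
        return self.hf_model(**{k: v for k, v in x.items() if k in self.valid_args})
```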

{% raw %}
{% endraw %} {% raw %}

class HF_BaseModelCallback[source]

HF_BaseModelCallback(before_fit=None, before_epoch=None, before_train=None, before_batch=None, after_pred=None, after_loss=None, before_backward=None, after_backward=None, after_step=None, after_cancel_batch=None, after_batch=None, after_cancel_train=None, after_train=None, before_validate=None, after_cancel_validate=None, after_validate=None, after_cancel_epoch=None, after_epoch=None, after_cancel_fit=None, after_fit=None) :: Callback

Basic class handling tweaks of the training loop by changing a Learner in various events

{% endraw %}

We use a Callback for handling what is returned from the huggingface model ... "the huggingface model will return a tuple in outputs, with the actual predictions and some additional activations (should we want to use them in some regularization scheme)" - from the fastai Transformers tutorial
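In other words, the callback's job is to pull the actual predictions out of that tuple before the loss is computed. A minimal sketch of the idea (illustrative only, with an assumed attribute name for stashing the extra activations):

```python
# Illustrative sketch (not blurr's actual implementation): keep only the logits
# from the tuple the wrapped huggingface model returns, stashing the rest.
from fastai.callback.core import Callback

class ExampleHFOutputCallback(Callback):
    def after_pred(self):
        if isinstance(self.pred, tuple):
            self.learn.extra_outputs = self.pred[1:]  # hypothetical attribute for the extra activations
            self.learn.pred = self.pred[0]            # the actual predictions (logits)
```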

Sequence classification

Below demonstrates how to set up your blurr pipeline for a sequence classification task (e.g., a model that requires a single text input).

{% raw %}
path = untar_data(URLs.IMDB_SAMPLE)
imdb_df = pd.read_csv(path/'texts.csv')
{% endraw %} {% raw %}
imdb_df.head()
label text is_valid
0 negative Un-bleeping-believable! Meg Ryan doesn't even look her usual pert lovable self in this, which normally makes me forgive her shallow ticky acting schtick. Hard to believe she was the producer on this dog. Plus Kevin Kline: what kind of suicide trip has his career been on? Whoosh... Banzai!!! Finally this was directed by the guy who did Big Chill? Must be a replay of Jonestown - hollywood style. Wooofff! False
1 positive This is a extremely well-made film. The acting, script and camera-work are all first-rate. The music is good, too, though it is mostly early in the film, when things are still relatively cheery. There are no really superstars in the cast, though several faces will be familiar. The entire cast does an excellent job with the script.<br /><br />But it is hard to watch, because there is no good end to a situation like the one presented. It is now fashionable to blame the British for setting Hindus and Muslims against each other, and then cruelly separating them into two countries. There is som... False
2 negative Every once in a long while a movie will come along that will be so awful that I feel compelled to warn people. If I labor all my days and I can save but one soul from watching this movie, how great will be my joy.<br /><br />Where to begin my discussion of pain. For starters, there was a musical montage every five minutes. There was no character development. Every character was a stereotype. We had swearing guy, fat guy who eats donuts, goofy foreign guy, etc. The script felt as if it were being written as the movie was being shot. The production value was so incredibly low that it felt li... False
3 positive Name just says it all. I watched this movie with my dad when it came out and having served in Korea he had great admiration for the man. The disappointing thing about this film is that it only concentrate on a short period of the man's life - interestingly enough the man's entire life would have made such an epic bio-pic that it is staggering to imagine the cost for production.<br /><br />Some posters elude to the flawed characteristics about the man, which are cheap shots. The theme of the movie "Duty, Honor, Country" are not just mere words blathered from the lips of a high-brassed offic... False
4 negative This movie succeeds at being one of the most unique movies you've seen. However this comes from the fact that you can't make heads or tails of this mess. It almost seems as a series of challenges set up to determine whether or not you are willing to walk out of the movie and give up the money you just paid. If you don't want to feel slighted you'll sit through this horrible film and develop a real sense of pity for the actors involved, they've all seen better days, but then you realize they actually got paid quite a bit of money to do this and you'll lose pity for them just like you've alr... False
{% endraw %} {% raw %}
task = HF_TASKS_AUTO.SequenceClassification

pretrained_model_name = "roberta-base" # "distilbert-base-uncased" "bert-base-uncased"
hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(pretrained_model_name, task=task)
{% endraw %} {% raw %}
blocks = (HF_TextBlock(hf_arch=hf_arch, hf_tokenizer=hf_tokenizer, padding='max_length'), CategoryBlock)

dblock = DataBlock(blocks=blocks, 
                   get_x=ColReader('text'), get_y=ColReader('label'), 
                   splitter=ColSplitter(col='is_valid'))
{% endraw %} {% raw %}
dls = dblock.dataloaders(imdb_df, bs=4)
{% endraw %} {% raw %}
dls.show_batch(max_n=2)
text category
0 Un-bleeping-believable! Meg Ryan doesn't even look her usual pert lovable self in this, which normally makes me forgive her shallow ticky acting schtick. Hard to believe she was the producer on this dog. Plus Kevin Kline: what kind of suicide trip has his career been on? Whoosh... Banzai!!! Finally this was directed by the guy who did Big Chill? Must be a replay of Jonestown - hollywood style. Wooofff! negative
1 Being from a small town in Illinois myself, I can instantly relate to this movie. Considering the era it was made in, the townsfolk look uncomfortably like a lot of people I grew up with. Yes the plot is so-so. And yes, the Acting is not going to get nominated for an Oscar anytime soon. But that isn't the point. The point is to suspend reality and just have FUN. And this movie has Fun aplenty. From the greedy,uncaring banker to the well meaning,but dimwitted deputy, this movie was made to poke fun at the SciFi genre and small town living at it's best. Who can't smile at the sight of the Enforcer Drone or the Vern Droid? and I LOVED the FarmZoid. Wish I had one when I was growing up. Overall, considering the technology they had available at the time, this is a pleasant romp into one's childhood, when you could sit back on a Saturday afternoon, Popcorn in hand, and laugh at the foibles of small town living. This is a movie I would watch again and again, if for no other reason than to poke fun at myself and my small town ways. positive
{% endraw %}

Training

We'll also add custom summary methods for blurr learners/models that work with dictionary inputs.

{% raw %}
model = HF_BaseModelWrapper(hf_model)

learn = Learner(dls, 
                model,
                opt_func=partial(Adam),
                loss_func=CrossEntropyLossFlat(),
                metrics=[accuracy],
                cbs=[HF_BaseModelCallback],
                splitter=hf_splitter)

learn.create_opt()             # -> will create your layer groups based on your "splitter" function
learn.freeze()
{% endraw %}

.to_fp16() requires a GPU, so it was removed so that the tests can run on GitHub. Let's check that we can get predictions.
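If you are training on a GPU yourself, you can re-enable mixed precision in the usual fastai way first, for example:

```python
# Re-enable mixed precision when a CUDA device is available (optional)
if torch.cuda.is_available():
    learn = learn.to_fp16()
```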

{% raw %}
b = dls.one_batch()
{% endraw %} {% raw %}
learn.model(b[0])
(tensor([[0.1398, 0.0416],
         [0.1352, 0.0231],
         [0.1426, 0.0384],
         [0.1496, 0.0332]], device='cuda:1', grad_fn=<AddmmBackward>),)
{% endraw %} {% raw %}
{% endraw %} {% raw %}

blurr_module_summary[source]

blurr_module_summary(learn, *xb)

Print a summary of model using xb

{% endraw %} {% raw %}
{% endraw %} {% raw %}

Learner.blurr_summary[source]

Learner.blurr_summary()

Print a summary of the model, optimizer and loss function.

{% endraw %}

We have to create our own summary methods above because fastai's built-in versions only work when the input is represented by a single tensor. In the case of huggingface transformers, a single sequence is represented by multiple tensors (in a dictionary).

The change required to make this work is so minor that the fastai library can hopefully be updated to support this use case.
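Once the Learner is built, usage mirrors fastai's own learn.summary() (output omitted here for brevity):

```python
# Example usage of the dictionary-aware summary (output omitted)
learn.blurr_summary()
```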

{% raw %}
 
{% endraw %} {% raw %}
print(len(learn.opt.param_groups))
4
{% endraw %} {% raw %}
learn.lr_find(suggestions=True)
SuggestedLRs(lr_min=0.00036307806149125097, lr_steep=0.02754228748381138)
{% endraw %} {% raw %}
learn.fit_one_cycle(3, lr_max=1e-3)
| epoch | train_loss | valid_loss | accuracy | time |
|-------|------------|------------|----------|-------|
| 0 | 0.469650 | 0.510889 | 0.885000 | 00:36 |
| 1 | 0.211285 | 0.331401 | 0.900000 | 00:36 |
| 2 | 0.177134 | 0.294831 | 0.905000 | 00:36 |
{% endraw %}

Showing results

And here we create a @typedispatched implementation of Learner.show_results.

{% raw %}
{% endraw %} {% raw %}
learn.show_results(max_n=2)
text category target
0 This very funny British comedy shows what might happen if a section of London, in this case Pimlico, were to declare itself independent from the rest of the UK and its laws, taxes & post-war restrictions. Merry mayhem is what would happen.<br /><br />The explosion of a wartime bomb leads to the discovery of ancient documents which show that Pimlico was ceded to the Duchy of Burgundy centuries ago, a small historical footnote long since forgotten. To the new Burgundians, however, this is an unexpected opportunity to live as they please, free from any interference from Whitehall.<br /><br />Stanley Holloway is excellent as the minor city politician who suddenly finds himself leading one of the world's tiniest nations. Dame Margaret Rutherford is a delight as the history professor who sides with Pimlico. Others in the stand-out cast include Hermione Baddeley, Paul Duplis, Naughton Wayne, Basil Radford & Sir Michael Hordern.<br /><br />Welcome to Burgundy! positive positive
1 Before Lost everything shown on TV was predictable. You could predict who was gonna die or who will find something, but in Lost you could predict NOTHING. Every thing was so surprisingly stunning and it really was a mystery not because it has so many secrets but because there was nothing like it before everything was so great. I literally became addicted to it. LOST is a classic work of art. It gives you something to look forward to every week. It is genius. The surrounding is brilliant it is calm and warm at the beach and so scary in the jungle. The characters are a work of genius every one of them especially the ones already on the island. The castaways are so dramatic yet we can never predict their deaths because they have so much more to do and so much more to say and they have secrets affecting the other castaways that die with them. positive positive
{% endraw %} {% raw %}
{% endraw %} {% raw %}

Learner.blurr_predict[source]

Learner.blurr_predict(item, rm_type_tfms=None, with_input=False)

{% endraw %}

As with summary, we need to replace fastai's Learner.predict method with the one above, which is able to work with inputs represented by multiple tensors in a dictionary.

{% raw %}
learn.blurr_predict('I really liked the movie')
('positive', tensor(1), tensor([0.0686, 0.9314]))
{% endraw %} {% raw %}
learn.unfreeze()
{% endraw %} {% raw %}
learn.fit_one_cycle(3, lr_max=slice(1e-6, 1e-3))
| epoch | train_loss | valid_loss | accuracy | time |
|-------|------------|------------|----------|-------|
| 0 | 0.196894 | 0.506864 | 0.825000 | 00:52 |
| 1 | 0.114689 | 0.319240 | 0.930000 | 00:52 |
| 2 | 0.144111 | 0.386641 | 0.905000 | 00:52 |
{% endraw %} {% raw %}
learn.recorder.plot_loss()
{% endraw %} {% raw %}
learn.show_results(max_n=2)
text category target
0 This very funny British comedy shows what might happen if a section of London, in this case Pimlico, were to declare itself independent from the rest of the UK and its laws, taxes & post-war restrictions. Merry mayhem is what would happen.<br /><br />The explosion of a wartime bomb leads to the discovery of ancient documents which show that Pimlico was ceded to the Duchy of Burgundy centuries ago, a small historical footnote long since forgotten. To the new Burgundians, however, this is an unexpected opportunity to live as they please, free from any interference from Whitehall.<br /><br />Stanley Holloway is excellent as the minor city politician who suddenly finds himself leading one of the world's tiniest nations. Dame Margaret Rutherford is a delight as the history professor who sides with Pimlico. Others in the stand-out cast include Hermione Baddeley, Paul Duplis, Naughton Wayne, Basil Radford & Sir Michael Hordern.<br /><br />Welcome to Burgundy! positive positive
1 One doesn't get to enjoy this gem, the 1936 Invisible Ray, often. But no can forget it. The story is elegant. Karloff, austere and embittered in his Carpathian mountain retreat, is Janos Rukh, genius science who reads ancient beams of light to ascertain events in the great geological pastÂ…particularly the crash of a potent radioactive meteor in Africa. Joining him is the ever-elegant Lugosi (as a rare hero), who studies "astro-chemistry." Frances Drake is the lovely, underused young wife; Frank Lawton the romantic temptation; and the divine Violet Kemble Cooper is Mother Rukh, in a performance worthy of Maria Ospenskya.<br /><br />The story moves swiftly in bold episodes, with special effects that are still handsome. It also contains some wonderful lines. One Rukh restores his mother's sight, he asks, "Mother, can you see, can you see?" "Yes, I can seeÂ…more clearly than ever. And what I see frightens me." Even better when mother Rukh says, "He broke the first law of science." I am not alone among my acquaintance in having puzzled for many many years exactly what this first law of science is.<br /><br />This movie is definitely desert island material. positive positive
{% endraw %} {% raw %}
learn.blurr_predict("This was a really good movie")
('positive', tensor(1), tensor([0.0199, 0.9801]))
{% endraw %} {% raw %}
learn.blurr_predict("Acting was so bad it was almost funny.")
('negative', tensor(0), tensor([0.9955, 0.0045]))
{% endraw %}

Inference

{% raw %}
learn.export(fname='seq_class_learn_export.pkl')
{% endraw %} {% raw %}
inf_learn = load_learner(fname='seq_class_learn_export.pkl')
inf_learn.blurr_predict("This movie should not be seen by anyone!!!!")
('negative', tensor(0), tensor([0.9915, 0.0085]))
{% endraw %}

Tests

The tests below ensure that the core training code above works for all pretrained sequence classification models available in huggingface. These tests are excluded from the CI workflow because of how long they would take to run and the amount of data they would need to download.

Note: Feel free to modify the code below to test whatever pretrained classification models you are working with ... and if any of your pretrained sequence classification models fail, please submit a GitHub issue (or a PR if you'd like to fix it yourself).

{% raw %}
try: del learn; torch.cuda.empty_cache()
except: pass
{% endraw %} {% raw %}
BLURR_MODEL_HELPER.get_models(task='SequenceClassification')
[transformers.modeling_albert.AlbertForSequenceClassification,
 transformers.modeling_auto.AutoModelForSequenceClassification,
 transformers.modeling_bart.BartForSequenceClassification,
 transformers.modeling_bert.BertForSequenceClassification,
 transformers.modeling_camembert.CamembertForSequenceClassification,
 transformers.modeling_distilbert.DistilBertForSequenceClassification,
 transformers.modeling_electra.ElectraForSequenceClassification,
 transformers.modeling_flaubert.FlaubertForSequenceClassification,
 transformers.modeling_longformer.LongformerForSequenceClassification,
 transformers.modeling_mobilebert.MobileBertForSequenceClassification,
 transformers.modeling_roberta.RobertaForSequenceClassification,
 transformers.modeling_xlm.XLMForSequenceClassification,
 transformers.modeling_xlm_roberta.XLMRobertaForSequenceClassification,
 transformers.modeling_xlnet.XLNetForSequenceClassification]
{% endraw %} {% raw %}
pretrained_model_names = [
    'albert-base-v1',
    'facebook/bart-base',
    'bert-base-uncased',
    'camembert-base',
    'distilbert-base-uncased',
    'monologg/electra-small-finetuned-imdb',
    'flaubert/flaubert_small_cased', 
    'allenai/longformer-base-4096',
    'google/mobilebert-uncased',
    'roberta-base',
    'xlm-mlm-en-2048',
    'xlm-roberta-base',
    'xlnet-base-cased'
]
{% endraw %} {% raw %}
path = untar_data(URLs.IMDB_SAMPLE)

model_path = Path('models')
imdb_df = pd.read_csv(path/'texts.csv')
{% endraw %} {% raw %}
#hide_output
task = HF_TASKS_AUTO.SequenceClassification
bsz = 2

test_results = []
for model_name in pretrained_model_names:
    error=None
    
    print(f'=== {model_name} ===\n')
    
    hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(model_name, 
                                                                                   task=task, 
                                                                                   config_kwargs={'num_labels': 2})
    
    print(f'architecture:\t{hf_arch}\ntokenizer:\t{type(hf_tokenizer).__name__}\nmodel:\t\t{type(hf_model).__name__}\n')

    blocks = (HF_TextBlock(hf_arch=hf_arch, hf_tokenizer=hf_tokenizer, max_length=128, padding='max_length'), 
              CategoryBlock)

    dblock = DataBlock(blocks=blocks, 
                       get_x=ColReader('text'), 
                       get_y=ColReader('label'), 
                       splitter=ColSplitter(col='is_valid'))
    
    dls = dblock.dataloaders(imdb_df, bs=bsz)
    
    model = HF_BaseModelWrapper(hf_model)
    learn = Learner(dls, 
                    model,
                    opt_func=partial(Adam),
                    loss_func=CrossEntropyLossFlat(),
                    metrics=[accuracy],
                    cbs=[HF_BaseModelCallback],
                    splitter=hf_splitter)

    learn.create_opt()             # -> will create your layer groups based on your "splitter" function
    learn.freeze()
    
    b = dls.one_batch()
    
    try:
        print('*** TESTING DataLoaders ***')
        test_eq(len(b), bsz)
        test_eq(len(b[0]['input_ids']), bsz)
        test_eq(b[0]['input_ids'].shape, torch.Size([bsz, 128]))
        test_eq(len(b[1]), bsz)

        print('*** TESTING One pass through the model ***')
        preds = learn.model(b[0])
        test_eq(len(preds[0]), bsz)
        test_eq(preds[0].shape, torch.Size([bsz, 2]))

        print('*** TESTING Training/Results ***')
        learn.fit_one_cycle(1, lr_max=1e-3)

        test_results.append((hf_arch, type(hf_tokenizer).__name__, type(hf_model).__name__, 'PASSED', ''))
        learn.show_results(max_n=2)
    except Exception as err:
        test_results.append((hf_arch, type(hf_tokenizer).__name__, type(hf_model).__name__, 'FAILED', err))
    finally:
        # cleanup
        del learn; torch.cuda.empty_cache()
{% endraw %} {% raw %}
| | arch | tokenizer | model | result | error |
|---|------|-----------|-------|--------|-------|
| 0 | albert | AlbertTokenizer | AlbertForSequenceClassification | PASSED | |
| 1 | bart | BartTokenizer | BartForSequenceClassification | PASSED | |
| 2 | bert | BertTokenizer | BertForSequenceClassification | PASSED | |
| 3 | camembert | CamembertTokenizer | CamembertForSequenceClassification | PASSED | |
| 4 | distilbert | DistilBertTokenizer | DistilBertForSequenceClassification | PASSED | |
| 5 | electra | ElectraTokenizer | ElectraForSequenceClassification | PASSED | |
| 6 | flaubert | FlaubertTokenizer | FlaubertForSequenceClassification | PASSED | |
| 7 | longformer | LongformerTokenizer | LongformerForSequenceClassification | PASSED | |
| 8 | mobilebert | MobileBertTokenizer | MobileBertForSequenceClassification | PASSED | |
| 9 | roberta | RobertaTokenizer | RobertaForSequenceClassification | PASSED | |
| 10 | xlm | XLMTokenizer | XLMForSequenceClassification | PASSED | |
| 11 | xlm_roberta | XLMRobertaTokenizer | XLMRobertaForSequenceClassification | PASSED | |
| 12 | xlnet | XLNetTokenizer | XLNetForSequenceClassification | PASSED | |
{% endraw %}

Example: Multi-label classification

Below demonstrates how to set up your blurr pipeline for a multi-label classification task.

{% raw %}
raw_data = nlp.load_dataset('civil_comments', split='train[:1%]') 
len(raw_data)
Using custom data configuration default
18049
{% endraw %} {% raw %}
toxic_df = pd.DataFrame(raw_data, columns=list(raw_data.features.keys()))
toxic_df.head()
text toxicity severe_toxicity obscene threat insult identity_attack sexual_explicit
0 This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done! 0.000000 0.000000 0.0 0.0 0.00000 0.000000 0.0
1 Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way! 0.000000 0.000000 0.0 0.0 0.00000 0.000000 0.0
2 This is such an urgent design problem; kudos to you for taking it on. Very impressive! 0.000000 0.000000 0.0 0.0 0.00000 0.000000 0.0
3 Is this something I'll be able to install on my site? When will you be releasing it? 0.000000 0.000000 0.0 0.0 0.00000 0.000000 0.0
4 haha you guys are a bunch of losers. 0.893617 0.021277 0.0 0.0 0.87234 0.021277 0.0
{% endraw %} {% raw %}
lbl_cols = list(toxic_df.columns[2:]); lbl_cols
['severe_toxicity',
 'obscene',
 'threat',
 'insult',
 'identity_attack',
 'sexual_explicit']
{% endraw %} {% raw %}
toxic_df = toxic_df.round({col: 0 for col in lbl_cols})
toxic_df = toxic_df.convert_dtypes()

toxic_df.head()
text toxicity severe_toxicity obscene threat insult identity_attack sexual_explicit
0 This is so cool. It's like, 'would you want your mother to read this??' Really great idea, well done! 0.000000 0 0 0 0 0 0
1 Thank you!! This would make my life a lot less anxiety-inducing. Keep it up, and don't let anyone get in your way! 0.000000 0 0 0 0 0 0
2 This is such an urgent design problem; kudos to you for taking it on. Very impressive! 0.000000 0 0 0 0 0 0
3 Is this something I'll be able to install on my site? When will you be releasing it? 0.000000 0 0 0 0 0 0
4 haha you guys are a bunch of losers. 0.893617 0 0 0 1 0 0
{% endraw %} {% raw %}
task = HF_TASKS_AUTO.SequenceClassification

pretrained_model_name = "roberta-base" # "distilbert-base-uncased" "bert-base-uncased"
config = AutoConfig.from_pretrained(pretrained_model_name)
config.num_labels = len(lbl_cols)

hf_arch, hf_config, hf_tokenizer, hf_model = BLURR_MODEL_HELPER.get_hf_objects(pretrained_model_name, 
                                                                               task=task, 
                                                                               config=config)
{% endraw %}

Note how we have to set num_labels to the number of labels we are predicting. Given that our labels are already encoded, we use a MultiCategoryBlock with encoded=True and vocab equal to the columns containing our 1's and 0's.

{% raw %}
blocks = (
    HF_TextBlock(hf_arch=hf_arch, hf_tokenizer=hf_tokenizer), 
    MultiCategoryBlock(encoded=True, vocab=lbl_cols)
)

dblock = DataBlock(blocks=blocks, 
                   get_x=ColReader('text'), get_y=ColReader(lbl_cols), 
                   splitter=RandomSplitter())
{% endraw %} {% raw %}
dls = dblock.dataloaders(toxic_df, bs=4)
{% endraw %} {% raw %}
b = dls.one_batch()
len(b), b[0]['input_ids'].shape, b[1].shape
(2, torch.Size([4, 184]), torch.Size([4, 6]))
{% endraw %} {% raw %}
dls.show_batch(max_n=2)
text None
0 An excellent position on this bill from a person who clearly has serious credentials on energy issues. Thank you Mr. Hamilton for weighing in with your experienced view. I'm more convinced than ever that this is a huge win for Oregonians. I sure hope the legislature moves this bill through!
1 Quit funding the Ambler Mine road first.
{% endraw %} {% raw %}
model = HF_BaseModelWrapper(hf_model)

learn = Learner(dls, 
                model,
                opt_func=partial(Adam),
                loss_func=BCEWithLogitsLossFlat(),
                metrics=[partial(accuracy_multi, thresh=0.2)],
                cbs=[HF_BaseModelCallback],
                splitter=hf_splitter)

learn.loss_func.thresh = 0.2
learn.create_opt()             # -> will create your layer groups based on your "splitter" function
learn.freeze()
{% endraw %}

Since we're doing multi-label classification, we adjust our loss function to use binary cross-entropy and our metrics to use the multi-label friendly version of accuracy.
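To make the thresholding concrete, here is a small illustration (with made-up logits and targets) of how accuracy_multi turns raw outputs into per-label predictions before comparing them to the targets:

```python
# Made-up example values: sigmoid is applied to the logits, then each label is
# predicted True when its probability exceeds the threshold.
import torch
from fastai.metrics import accuracy_multi

dummy_logits  = torch.tensor([[ 2.0, -1.0,  0.5], [-3.0,  1.2, -0.2]])
dummy_targets = torch.tensor([[ 1.0,  0.0,  1.0], [ 0.0,  1.0,  0.0]])
accuracy_multi(dummy_logits, dummy_targets, thresh=0.2, sigmoid=True)
```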

{% raw %}
preds = model(b[0])
preds[0].shape
torch.Size([4, 6])
{% endraw %} {% raw %}
learn.lr_find(suggestions=True)
SuggestedLRs(lr_min=0.19054607152938843, lr_steep=0.0014454397605732083)
{% endraw %} {% raw %}
learn.fit_one_cycle(3, lr_max=3e-3)
| epoch | train_loss | valid_loss | accuracy_multi | time |
|-------|------------|------------|----------------|-------|
| 0 | 0.030272 | 0.036244 | 0.992657 | 03:21 |
| 1 | 0.035246 | 0.035990 | 0.992657 | 03:22 |
| 2 | 0.028292 | 0.036048 | 0.992657 | 03:23 |
{% endraw %} {% raw %}
learn.show_results(max_n=2)
text None target
0 LOL at #10. I've had Uber drivers get lost so many times in inner SE. Buckman. It's a GRID, y'all, and I don't feel like paying for the time you wasted driving around in circles. []
1 Hmm. New comment service! Hey this article was great. The fire was almost certainly karma []
{% endraw %} {% raw %}
learn.loss_func.thresh = 0.02
{% endraw %} {% raw %}
comment = """
Those damned affluent white people should only eat their own food, like cod cakes and boiled potatoes. 
No enchiladas for them!
"""
learn.blurr_predict(comment)
((#1) ['severe_toxicity'],
 tensor([False, False, False,  True, False, False]),
 tensor([1.4656e-06, 4.0468e-03, 4.2132e-04, 2.4139e-02, 1.1726e-03, 1.2957e-03]))
{% endraw %}

Cleanup