---
title: utils
keywords: fastai
sidebar: home_sidebar
summary: "Various utility functions used by the blurr package."
description: "Various utility functions used by the blurr package."
nb_path: "nbs/00_utils.ipynb"
---
```python
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}: {torch.cuda.get_device_name()}')
```
```python
@Singleton
class TestSingleton: pass

a = TestSingleton()
b = TestSingleton()
test_eq(a, b)
```
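The `Singleton` decorator's definition isn't shown on this page; a minimal sketch of how such a class decorator can be implemented (the `Counter` class below is purely illustrative):

```python
class Singleton:
    """Class decorator: every instantiation returns the same shared instance."""
    def __init__(self, cls):
        self._cls, self._instance = cls, None

    def __call__(self, *args, **kwargs):
        # Build the instance on first call, then keep handing back the same one.
        if self._instance is None:
            self._instance = self._cls(*args, **kwargs)
        return self._instance

@Singleton
class Counter:
    def __init__(self): self.n = 0

a, b = Counter(), Counter()
a.n += 1
print(a is b, b.n)  # -> True 1
```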
```python
mh = BlurrUtil()
mh2 = BlurrUtil()
test_eq(mh, mh2)

display_df(mh._df.head(20))
```
Here's how you can get at the core Hugging Face objects you need to work with ...
... the task
```python
print(BLURR.get_tasks())
print('')
print(BLURR.get_tasks('bart'))
```
... the architecture
```python
print(BLURR.get_architectures())
print(BLURR.get_model_architecture('RobertaForSequenceClassification'))
```
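The idea behind this lookup can be sketched with a naming-convention heuristic; note this is a hypothetical illustration, not blurr's actual implementation:

```python
import re

def arch_from_model_name(model_name):
    # Take the leading CamelCase token (e.g. 'Roberta' from
    # 'RobertaForSequenceClassification') and lowercase it.
    m = re.match(r'[A-Z][a-z0-9]*', model_name)
    return m.group(0).lower() if m else None

print(arch_from_model_name('RobertaForSequenceClassification'))  # -> roberta
```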
... the config for that particular task and architecture
```python
print(BLURR.get_config('bert'))
```
... the available tokenizers for that architecture
```python
print(BLURR.get_tokenizers('electra'))
```
... and lastly the models (optionally for a given task and/or architecture)
```python
print(L(BLURR.get_models())[:5])
print(BLURR.get_models(arch='bert')[:5])
print(BLURR.get_models(task='TokenClassification')[:5])
print(BLURR.get_models(arch='bert', task='TokenClassification'))
```
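Filtering a model registry by architecture and/or task can be sketched as below; the model list and `filter_models` helper are illustrative assumptions, not blurr internals:

```python
MODELS = [
    'BartForConditionalGeneration',
    'BertForSequenceClassification',
    'BertForTokenClassification',
    'RobertaForTokenClassification',
]

def filter_models(arch=None, task=None):
    # Match the architecture as a case-insensitive prefix and the task as a suffix.
    res = MODELS
    if arch: res = [m for m in res if m.lower().startswith(arch.lower())]
    if task: res = [m for m in res if m.endswith(task)]
    return res

print(filter_models(arch='bert', task='TokenClassification'))
# -> ['BertForTokenClassification']
```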
Here we define some helpful enums to make it easier to get at the architecture and task you're looking for.
```python
print(L(HF_ARCHITECTURES)[:5])
print('--- all tasks ---')
print(L(HF_TASKS))

HF_TASKS.Classification
```
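These enums can be built from a plain list of names with Python's functional `Enum` API; the `Tasks` enum below is a small illustrative stand-in, not the real `HF_TASKS`:

```python
from enum import Enum

# Build an enum from a list of task names (illustrative subset).
Tasks = Enum('Tasks', ['Classification', 'QuestionAnswering', 'TokenClassification'])

print(Tasks.Classification)       # -> Tasks.Classification
print(Tasks.Classification.name)  # -> Classification
```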
`BLURR.get_classes_for_model` can be used to get the config, tokenizer, and model classes you want.
```python
config, tokenizers, model = BLURR.get_classes_for_model('RobertaForSequenceClassification')

print(config)
print(tokenizers[0])
print(model)
```

```python
config, tokenizers, model = BLURR.get_classes_for_model(DistilBertModel)

print(config)
print(tokenizers[0])
print(model)
```
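The mapping from a model class name to its companion class names follows the Hugging Face naming convention; a hypothetical string-level sketch (the helper name and the suffix-stripping rule are assumptions, not blurr's code):

```python
def class_names_for_model(model_name):
    # Strip the 'For<Task>' suffix to recover the architecture prefix,
    # then derive the config and tokenizer class names from it.
    arch = model_name.split('For')[0]
    return (f'{arch}Config', f'{arch}Tokenizer', model_name)

print(class_names_for_model('RobertaForSequenceClassification'))
# -> ('RobertaConfig', 'RobertaTokenizer', 'RobertaForSequenceClassification')
```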
```python
arch, config, tokenizer, model = BLURR.get_hf_objects("bert-base-cased-finetuned-mrpc",
                                                      model_cls=AutoModelForMaskedLM)

print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```
```python
arch, config, tokenizer, model = BLURR.get_hf_objects("fmikaelian/flaubert-base-uncased-squad",
                                                      model_cls=AutoModelForQuestionAnswering)

print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```
```python
arch, config, tokenizer, model = BLURR.get_hf_objects("bert-base-cased-finetuned-mrpc",
                                                      config=None,
                                                      tokenizer_cls=BertTokenizer,
                                                      model_cls=BertForNextSentencePrediction)

print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```