---
title: utils
keywords: fastai
sidebar: home_sidebar
summary: "Various utility functions used by the blurr package."
description: "Various utility functions used by the blurr package."
nb_path: "nbs/00_utils.ipynb"
---
```python
torch.cuda.set_device(1)
print(f'Using GPU #{torch.cuda.current_device()}: {torch.cuda.get_device_name()}')
```
```python
@Singleton
class TestSingleton: pass

a = TestSingleton()
b = TestSingleton()
test_eq(a, b)
```
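The `@Singleton` used above is a class decorator; blurr's exact implementation isn't shown on this page, but a minimal sketch of the pattern looks like the following (the `Config` class is purely illustrative):

```python
# Minimal sketch of a class-decorator Singleton (hypothetical; blurr's actual
# implementation may differ): the decorator caches the first instance of the
# wrapped class and returns that same instance on every subsequent call.
class Singleton:
    def __init__(self, cls):
        self._cls, self._instance = cls, None

    def __call__(self, *args, **kwargs):
        # Instantiate the wrapped class only once
        if self._instance is None:
            self._instance = self._cls(*args, **kwargs)
        return self._instance

@Singleton
class Config:
    pass

a, b = Config(), Config()
print(a is b)  # both "instances" are the same object
```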
`ModelHelper` is a Singleton (there exists only one instance, and the same instance is returned upon subsequent instantiation requests). You can get at it via the `BLURR_MODEL_HELPER` constant below.
```python
mh = ModelHelper()
mh2 = ModelHelper()
test_eq(mh, mh2)

display_df(mh._df.head(20))
```
Users of this library can simply use `BLURR_MODEL_HELPER` to access all the `ModelHelper` capabilities without having to fetch an instance themselves.
Here's how you can get at the core huggingface objects you need to work with ...

... the tasks

```python
print(BLURR_MODEL_HELPER.get_tasks())
print('')
print(BLURR_MODEL_HELPER.get_tasks('bart'))
```
... the architectures

```python
print(BLURR_MODEL_HELPER.get_architectures())
print(BLURR_MODEL_HELPER.get_model_architecture('RobertaForSequenceClassification'))
```
... the config for a given architecture

```python
print(BLURR_MODEL_HELPER.get_config('bert'))
```
... the available tokenizers for that architecture

```python
print(BLURR_MODEL_HELPER.get_tokenizers('electra'))
```
... and lastly the models (optionally for a given task and/or architecture)

```python
print(L(BLURR_MODEL_HELPER.get_models())[:5])

print(BLURR_MODEL_HELPER.get_models(arch='bert')[:5])
print(BLURR_MODEL_HELPER.get_models(task='TokenClassification')[:5])
print(BLURR_MODEL_HELPER.get_models(arch='bert', task='TokenClassification'))
```
Here we define some helpful enums to make it easier to get at the architecture and task you're looking for.
```python
print(L(HF_ARCHITECTURES)[:5])

print('--- all tasks ---')
print(L(HF_TASKS_ALL))
print('\n--- auto only ---')
print(L(HF_TASKS_AUTO))

HF_TASKS_ALL.Classification
```
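These enums behave like standard Python `Enum`s, so members can be referenced by attribute, looked up by name, or iterated. A minimal sketch with hypothetical members (the real `HF_TASKS_ALL` contains many more, discovered from the huggingface classes at import time):

```python
from enum import Enum, auto

# Hypothetical stand-in for an HF_TASKS-style enum, just to show the access
# patterns used throughout this page.
class Tasks(Enum):
    Classification = auto()
    TokenClassification = auto()
    QuestionAnswering = auto()

print(Tasks.Classification)          # access by attribute -> Tasks.Classification
print(Tasks['QuestionAnswering'])    # lookup by name
print([t.name for t in Tasks])      # iterate all members
```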
`BLURR_MODEL_HELPER.get_classes_for_model` can be used to get the config, tokenizer, and model classes you want.
```python
config, tokenizers, model = BLURR_MODEL_HELPER.get_classes_for_model('RobertaForSequenceClassification')

print(config)
print(tokenizers[0])
print(model)
```
```python
config, tokenizers, model = BLURR_MODEL_HELPER.get_classes_for_model(DistilBertModel)

print(config)
print(tokenizers[0])
print(model)
```
```python
arch, config, tokenizer, model = BLURR_MODEL_HELPER.get_hf_objects("bert-base-cased-finetuned-mrpc",
                                                                   task=HF_TASKS_AUTO.MaskedLM)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```
```python
arch, config, tokenizer, model = BLURR_MODEL_HELPER.get_hf_objects("fmikaelian/flaubert-base-uncased-squad",
                                                                   task=HF_TASKS_AUTO.QuestionAnswering)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```
```python
arch, config, tokenizer, model = BLURR_MODEL_HELPER.get_hf_objects("bert-base-cased-finetuned-mrpc",
                                                                   config=None,
                                                                   tokenizer_cls=BertTokenizer,
                                                                   model_cls=BertForNextSentencePrediction)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
```