---
title: text.utils
keywords: fastai
sidebar: home_sidebar
summary: "Various text specific utility classes/functions"
description: "Various text specific utility classes/functions"
nb_path: "nbs/01_text-utils.ipynb"
---
mh = BlurrText()
mh2 = BlurrText()
test_eq(mh, mh2)
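The two calls above return the same instance because `BlurrText` behaves as a singleton. A minimal sketch of one common way to get that behavior in plain Python (this is an illustration of the pattern, not blurr's actual implementation):

```python
class Singleton:
    """Return the same instance on every instantiation."""

    _instance = None

    def __new__(cls, *args, **kwargs):
        # Create the instance only once; every later call reuses it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


a = Singleton()
b = Singleton()
assert a is b  # both names point at the one shared instance
```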
Here's how you can get at the core Hugging Face objects you need to work with ...
... the task
print(NLP.get_tasks())
print("")
print(NLP.get_tasks("bart"))
... the architecture
print(NLP.get_architectures())
print(NLP.get_model_architecture("RobertaForSequenceClassification"))
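A lookup like `get_model_architecture` can be approximated by parsing the class name, since Hugging Face model classes follow a `<Arch>For<Task>` naming convention. A rough sketch of the idea (the helper name and regex here are mine, not blurr's internals):

```python
import re


def guess_model_architecture(class_name: str) -> str:
    """Guess the base architecture from a Hugging Face model class name."""
    # Split off the task suffix, e.g. "ForSequenceClassification" or "Model"
    base = re.split(r"(?:For|Model|LMHead)", class_name)[0]
    return base.lower()


print(guess_model_architecture("RobertaForSequenceClassification"))  # roberta
print(guess_model_architecture("BertForNextSentencePrediction"))     # bert
```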
... and lastly the models (optionally for a given task and/or architecture)
print(L(NLP.get_models())[:5])
print(NLP.get_models(arch="bert")[:5])
print(NLP.get_models(task="TokenClassification")[:5])
print(NLP.get_models(arch="bert", task="TokenClassification"))
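Filtering like this amounts to matching the architecture and task substrings against the model class names. A self-contained sketch of that idea (the helper and the name list are illustrative, not blurr's API):

```python
def filter_models(model_names, arch=None, task=None):
    """Keep model class names matching the given architecture and/or task."""
    results = []
    for name in model_names:
        # Architecture is the leading part of the class name, e.g. "Bert..."
        if arch is not None and not name.lower().startswith(arch.lower()):
            continue
        # Task shows up as a suffix, e.g. "...ForTokenClassification"
        if task is not None and task not in name:
            continue
        results.append(name)
    return results


names = [
    "BertForTokenClassification",
    "BertForSequenceClassification",
    "RobertaForTokenClassification",
]
print(filter_models(names, arch="bert"))
print(filter_models(names, arch="bert", task="TokenClassification"))
```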
Here we define some helpful enums to make it easier to get at the task and architecture you're looking for.
print("--- all tasks ---")
print(L(HF_TASKS))
HF_TASKS.Classification
print(L(HF_ARCHITECTURES)[:5])
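Enums like these can be built with Python's standard `enum` module; a minimal sketch of the pattern (the member names below are examples, not the full `HF_TASKS` listing):

```python
from enum import Enum


class Task(Enum):
    """Illustrative task enum in the spirit of HF_TASKS."""

    Classification = "Classification"
    TokenClassification = "TokenClassification"
    QuestionAnswering = "QuestionAnswering"


print(list(Task))                 # all tasks
print(Task.Classification.value)  # the underlying task string
```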
How to use:
from transformers import AutoModelForMaskedLM, logging

# silence Hugging Face warnings so the example output stays readable
logging.set_verbosity_error()
arch, config, tokenizer, model = NLP.get_hf_objects("bert-base-cased-finetuned-mrpc", model_cls=AutoModelForMaskedLM)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
from transformers import AutoModelForQuestionAnswering
arch, config, tokenizer, model = NLP.get_hf_objects("fmikaelian/flaubert-base-uncased-squad", model_cls=AutoModelForQuestionAnswering)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))
from transformers import BertTokenizer, BertForNextSentencePrediction
arch, config, tokenizer, model = NLP.get_hf_objects(
"bert-base-cased-finetuned-mrpc", config=None, tokenizer_cls=BertTokenizer, model_cls=BertForNextSentencePrediction
)
print(arch)
print(type(config))
print(type(tokenizer))
print(type(model))