Named Entity Recognition
Modeling classes for named entity recognition
class pororo.tasks.named_entity_recognition.PororoNerFactory(task: str, lang: str, model: Optional[str])
Bases: pororo.tasks.utils.base.PororoFactoryBase
Conducts named entity recognition. Supported languages, with the model, training dataset, and reported score for each:
English (roberta.base.en.ner)
dataset: OntoNotes 5.0
metric: F1 (91.63)
Korean (charbert.base.ko.ner)
dataset: Named Entity Analysis Corpus (개체명 분석 말뭉치) from https://corpus.korean.go.kr/
metric: F1 (89.63)
Japanese (jaberta.base.ja.ner)
dataset: Kyoto University Web Document Leads Corpus
metric: F1 (76.74)
Chinese (zhberta.base.zh.ner)
dataset: OntoNotes 5.0
metric: F1 (79.06)
- Parameters
sent – (str) sentence to be sequence labeled
- Returns
list of (token, predicted tag) tuples
- Return type
List[Tuple[str, str]]
Examples
>>> ner = Pororo(task="ner")
>>> ner("It was in midfield where Arsenal took control of the game, and that was mainly down to Thomas Partey and Mohamed Elneny.")
[('It', 'O'), ('was', 'O'), ('in', 'O'), ('midfield', 'O'), ('where', 'O'), ('Arsenal', 'ORG'), ('took', 'O'), ('control', 'O'), ('of', 'O'), ('the', 'O'), ('game', 'O'), (',', 'O'), ('and', 'O'), ('that', 'O'), ('was', 'O'), ('mainly', 'O'), ('down', 'O'), ('to', 'O'), ('Thomas Partey', 'PERSON'), ('and', 'O'), ('Mohamed Elneny', 'PERSON'), ('.', 'O')]
>>> ner = Pororo(task="ner", lang="ko")
>>> ner("안녕하세요. 제 이름은 카터입니다.")
[("안녕하세요.", "O"), (" ", "O"), ("제", "O"), ("이름은", "O"), ("카터", "PS"), ("입니다.", "O")]
>>> ner = Pororo(task="ner", lang="zh")
>>> ner("毛泽东(1893年12月26日-1976年9月9日),字润之,湖南湘潭人。中华民国大陆时期、中国共产党和中华人民共和国的重要政治家、经济家、军事家、战略家、外交家和诗人。")
[('毛泽东', 'PERSON'), ('(', 'O'), ('1893年12月26日-1976年9月9日', 'DATE'), (')', 'O'), (',', 'O'), ('字润之', 'O'), (',', 'O'), ('湖南', 'GPE'), ('湘潭', 'GPE'), ('人', 'O'), ('。', 'O'), ('中华民国大陆时期', 'GPE'), ('、', 'O'), ('中国共产党', 'ORG'), ('和', 'O'), ('中华人民共和国', 'GPE'), ('的', 'O'), ('重', 'O'), ('要', 'O'), ('政', 'O'), ('治', 'O'), ('家', 'O'), ('、', 'O'), ('经', 'O'), ('济', 'O'), ('家', 'O'), ('、', 'O'), ('军', 'O'), ('事', 'O'), ('家', 'O'), ('、', 'O'), ('战', 'O'), ('略', 'O'), ('家', 'O'), ('、', 'O'), ('外', 'O'), ('交', 'O'), ('家', 'O'), ('和', 'O'), ('诗', 'O'), ('人', 'O'), ('。', 'O')]
>>> ner = Pororo(task="ner", lang="ja")
>>> ner("豊臣 秀吉、または羽柴 秀吉は、戦国時代から安土桃山時代にかけての武将、大名。天下人、武家関白、太閤。三英傑の一人。")
[('豊臣秀吉', 'PERSON'), ('、', 'O'), ('または', 'O'), ('羽柴秀吉', 'PERSON'), ('は', 'O'), ('、', 'O'), ('戦国時代', 'DATE'), ('から', 'O'), ('安土桃山時代', 'DATE'), ('にかけて', 'O'), ('の', 'O'), ('武将', 'O'), ('、', 'O'), ('大名', 'O'), ('。', 'O'), ('天下', 'O'), ('人', 'O'), ('、', 'O'), ('武家', 'O'), ('関白', 'O'), ('、太閤', 'O'), ('。', 'O'), ('三', 'O'), ('英', 'O'), ('傑', 'O'), ('の', 'O'), ('一', 'O'), ('人', 'O'), ('。', 'O')]
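Because the result is a flat list of (token, tag) tuples in which non-entity tokens carry the tag "O", callers will often want only the entities. The following is a minimal sketch of one way to post-process such a list. The helper name group_entities is hypothetical and not part of pororo, and the sample data is a hand-copied excerpt of the English example output rather than a live model call.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_entities(tagged: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group tokens by predicted tag, skipping the non-entity tag "O".

    Hypothetical helper for illustration; not provided by pororo.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for token, tag in tagged:
        if tag != "O":
            grouped[tag].append(token)
    return dict(grouped)


# Excerpt of the English example output, copied by hand (no model is run here).
sample = [
    ("Arsenal", "ORG"),
    ("took", "O"),
    ("Thomas Partey", "PERSON"),
    ("and", "O"),
    ("Mohamed Elneny", "PERSON"),
]

entities = group_entities(sample)
print(entities)
# {'ORG': ['Arsenal'], 'PERSON': ['Thomas Partey', 'Mohamed Elneny']}
```

Tag names differ between models (for example PERSON in the English output and PS in the Korean output), so any downstream filtering by tag should be written per language.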
class pororo.tasks.named_entity_recognition.PororoBertNerEn(model, config)
Bases: pororo.tasks.utils.base.PororoSimpleBase
class pororo.tasks.named_entity_recognition.PororoBertCharNer(model, sent_tokenizer, device, config)
Bases: pororo.tasks.utils.base.PororoSimpleBase
class pororo.tasks.named_entity_recognition.PororoBertNerZh(model, config)
Bases: pororo.tasks.utils.base.PororoSimpleBase