| dc.contributor.author |
Guðjónsson, Ásmundur Alma |
| dc.contributor.author |
Loftsson, Hrafn |
| dc.contributor.author |
Daðason, Jón Friðrik |
| dc.date.accessioned |
2021-11-23T02:26:11Z |
| dc.date.available |
2021-11-23T02:26:11Z |
| dc.date.issued |
2021 |
| dc.identifier.uri |
http://hdl.handle.net/20.500.12537/159 |
| dc.description |
A dockerized Named Entity Recognition (NER) API for Icelandic. It uses the IceBERT language model from Miðeind as its primary model, and can optionally combine it with three other transformer language models (ELECTRA-base, convbert-small, and multilingual-BERT) using CombiTagger. All four models were fine-tuned for NER on MIM-GOLD-NER. IceBERT was the best individual model, achieving an F1-score of ~92.73 on the MIM-GOLD-NER test set, while the combination of the four, in the form of CombiTagger, achieved an F1-score of 93.21.
The code for the API is available at https://github.com/icelandic-lt/Icelandic-NER-API and the files for the fine-tuned models are available in this submission.
A dockerized API for named entity recognition (NER) in Icelandic. It uses the IceBERT language model from Miðeind as its main model, but also offers the option of having IceBERT work together with three other models (ELECTRA-base, convbert-small, and multilingual-BERT). All of them have been fine-tuned for NER on the named entity recognition corpus MIM-GOLD-NER. Looking at each model individually, IceBERT is the best, reaching an F1-score of 92.73, while CombiTagger reaches an F1-score of 93.21.
The source code for the API is available here: https://github.com/icelandic-lt/Icelandic-NER-API and the files for the fine-tuned models can be found in this submission. |
| dc.language.iso |
isl |
| dc.publisher |
Reykjavík University |
| dc.rights |
Apache License 2.0 |
| dc.rights.uri |
https://opensource.org/license/apache2-0-php/ |
| dc.rights.label |
PUB |
| dc.source.uri |
https://github.com/icelandic-lt/Icelandic-NER-API |
| dc.subject |
named entity recognition |
| dc.subject |
transformer |
| dc.subject |
webservice |
| dc.subject |
api |
| dc.title |
Icelandic NER API - Ensemble model (21.09) |
| dc.type |
toolService |
| metashare.ResourceInfo#ContentInfo.detailedType |
tool |
| metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
| has.files |
yes |
| branding |
Clarin IS Repository |
| demo.uri |
https://electra-ner-icelandic-gwafmrdfha-ez.a.run.app |
| contact.person |
Ásmundur Alma Guðjónsson; asmundurg@gmail.com; Reykjavík University |
| sponsor |
Ministry of Education, Science and Culture; Support tools: Named Entity Recognition (I7); Language Technology for Icelandic 2019-2023; nationalFunds |
| files.size |
1602264849 |
| files.count |
1 |