dc.contributor.author |
Jónsson, Haukur |
dc.contributor.author |
Loftsson, Hrafn |
dc.date.accessioned |
2021-09-28T11:28:47Z |
dc.date.available |
2021-09-28T11:28:47Z |
dc.date.issued |
2021-10-01 |
dc.identifier.uri |
http://hdl.handle.net/20.500.12537/134 |
dc.description |
A neural lemmatizer for Icelandic.
This submission contains a pretrained lemmatizer model for ABLTagger v3.1.0. It is a small lemmatizer that takes as input tokens and their tags from the revised tagset. The lemmatizer achieves an accuracy of 98.3% on MIM-Gold (21.05, cross-validation), which is lower than the accuracy of Nefnir.
For installation, usage, and other instructions, see https://github.com/cadia-lvl/POS/releases/tag/m6
You should also check there whether a newer version has been released (see README.md - versions).
On CLARIN:
- Model files |
dc.language.iso |
isl |
dc.publisher |
Reykjavik University |
dc.rights |
The MIT License (MIT) |
dc.rights.uri |
https://opensource.org/licenses/mit-license.php |
dc.rights.label |
PUB |
dc.source.uri |
https://github.com/cadia-lvl/POS |
dc.subject |
lemmatizer |
dc.title |
ABLTagger (Lemmatizer) - 3.1.0 |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
has.files |
yes |
branding |
Clarin IS Repository |
contact.person |
Haukur Jónsson, haukurpalljonsson@gmail.com, Reykjavik University |
sponsor |
Ministry of Education, Science and Culture; Support tools: Part-of-speech tagger (I4); Language Technology for Icelandic 2019-2023; nationalFunds |
files.size |
66548124 |
files.count |
1 |
Files in this item
This item is Publicly Available and licensed under: The MIT License (MIT)
- Name: lemmatizer.tar.gz
- Size: 63.47 MB
- Format: application/gzip
- Description: The lemmatizer model files.
- MD5: b6d93ec4665b184d460eac45f5e565c0
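The MD5 value listed for lemmatizer.tar.gz can be used to check that a download is intact. The following is a minimal sketch, not part of the record or of ABLTagger itself: the helper names `md5_of_file` and `matches_record` are hypothetical, and the local path `lemmatizer.tar.gz` is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record for lemmatizer.tar.gz.
EXPECTED_MD5 = "b6d93ec4665b184d460eac45f5e565c0"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("lemmatizer.tar.gz")  # assumed download location
    if archive.exists():
        print("checksum OK" if matches_record(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

MD5 is adequate for detecting a corrupted or truncated download, but it is not a defense against deliberate tampering.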
Preview
- tokenizer_config.json (73 B)
- model.pt (67 MB)
- config.json (379 B)
- vocab.txt (254 kB)
- known_lemmas.txt (567 kB)
- dictionaries.pickle (14 kB)
- special_tokens_map.json (112 B)
- hyperparamters.json (1 kB)
- known_toks.txt (1 MB)
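The preview above can also serve as a checklist after download. The sketch below reports which of the listed files are absent from a local copy of the archive. It is an illustration only: the function name `missing_files` is hypothetical, the local path is an assumption, and file names are compared by base name because the record does not say whether the archive wraps its files in a directory.

```python
import tarfile
from pathlib import Path, PurePosixPath

# File names as shown in this record's preview (spelling kept exactly as listed).
EXPECTED_FILES = {
    "tokenizer_config.json",
    "model.pt",
    "config.json",
    "vocab.txt",
    "known_lemmas.txt",
    "dictionaries.pickle",
    "special_tokens_map.json",
    "hyperparamters.json",
    "known_toks.txt",
}


def missing_files(archive_path, expected=EXPECTED_FILES):
    """Return the expected file names not found among the archive's regular files."""
    with tarfile.open(archive_path, "r:gz") as tar:
        # Compare base names only, so a top-level directory inside the archive is ignored.
        present = {
            PurePosixPath(member.name).name
            for member in tar.getmembers()
            if member.isfile()
        }
    return set(expected) - present


if __name__ == "__main__":
    archive = Path("lemmatizer.tar.gz")  # assumed download location
    if archive.exists():
        missing = missing_files(archive)
        print("all listed files present" if not missing else f"missing: {sorted(missing)}")
    else:
        print(f"{archive} not found; download it first")
```

Listing members this way does not unpack anything, so it is safe to run before deciding where to extract the model files.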