
 
dc.contributor.author Jónsson, Haukur
dc.contributor.author Loftsson, Hrafn
dc.date.accessioned 2021-09-28T11:28:47Z
dc.date.available 2021-09-28T11:28:47Z
dc.date.issued 2021-10-01
dc.identifier.uri http://hdl.handle.net/20.500.12537/134
dc.description A neural lemmatizer for Icelandic. This submission contains a pretrained lemmatizer model for ABLTagger v3.1.0: a small lemmatizer that takes as input the tokens and their tags from the revised tagset. The lemmatizer achieves an accuracy of 98.3% on MIM-Gold (21.05, cross-validation), which is lower than the accuracy of Nefnir. For installation, usage, and other instructions, see https://github.com/cadia-lvl/POS/releases/tag/m6. It is also worth checking there whether a newer version has been released (see README.md - versions). On CLARIN: - Model files
dc.language.iso isl
dc.publisher Reykjavik University
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri https://github.com/cadia-lvl/POS
dc.subject lemmatizer
dc.title ABLTagger (Lemmatizer) - 3.1.0
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
contact.person Haukur Jónsson haukurpalljonsson@gmail.com Reykjavik University
sponsor Ministry of Education, Science and Culture; Support tools: Part-of-speech tagger (I4); Language Technology for Icelandic 2019-2023; nationalFunds
files.size 66548124
files.count 1


 Files in this item

This item is Publicly Available and licensed under: The MIT License (MIT)
Name lemmatizer.tar.gz
Size 63.47 MB
Format application/gzip
Description The lemmatizer model files.
MD5 b6d93ec4665b184d460eac45f5e565c0
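
The MD5 checksum and byte size listed for lemmatizer.tar.gz can be used to check a downloaded copy before unpacking it. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `verify_download` and the local file path are hypothetical, and the sketch assumes that `files.size` (66548124 bytes) refers to this single archive, since `files.count` is 1.

```python
import hashlib
from pathlib import Path

# Values copied from this record; the pairing of files.size with this
# archive is an assumption based on files.count being 1.
EXPECTED_MD5 = "b6d93ec4665b184d460eac45f5e565c0"
EXPECTED_SIZE = 66548124  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file at `path` has the expected size and MD5 digest."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("lemmatizer.tar.gz")` should return `True` for an intact download saved under that (assumed) local name. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.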
 File Preview  
    • tokenizer_config.json (73 B)
    • model.pt (67 MB)
    • config.json (379 B)
    • vocab.txt (254 kB)
    • known_lemmas.txt (567 kB)
    • dictionaries.pickle (14 kB)
    • special_tokens_map.json (112 B)
    • hyperparamters.json (1 kB)
    • known_toks.txt (1 MB)
