
 
dc.contributor.author Ingólfsdóttir, Svanhvít Lilja
dc.contributor.author Snæbjarnarson, Vésteinn
dc.date.accessioned 2022-06-01T09:11:58Z
dc.date.available 2022-06-01T09:11:58Z
dc.date.issued 2022-05-31
dc.identifier.uri http://hdl.handle.net/20.500.12537/217
dc.description The Icelandic Error Corpus (http://hdl.handle.net/20.500.12537/73) was used to fine-tune the Icelandic language model IceBERT-xlmr-ic3 for token classification. The objective was to train a grammatical error detection model that classifies whether a token range contains a particular error type. The model can mark tokens as belonging to one of the following issue categories: coherence, grammar, orthography, other, style and vocabulary. The overall F1 score is 71; the per-category scores are: coherence: 0; grammar: 63; orthography: 86; other: 0; vocabulary: 15.2.
dc.description The Icelandic Error Corpus (http://hdl.handle.net/20.500.12537/73) was used to fine-tune the Icelandic language model IceBERT-xlmr-ic3 for classification of tokens/words. The goal was to train a model that can detect whether a word contains a particular error type. The model can tag words with one of the following labels: coherence, grammar, orthography, other, style and vocabulary. The overall F1 is 71.
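
The description above says the checkpoint labels tokens with error categories. The following is a minimal sketch of how such a checkpoint might be queried. It assumes the extracted directory loads with the generic `transformers` Auto classes and that one label is chosen per token by argmax; neither is confirmed by this record. The function names, the `model_dir` path and the `"O"` label for error-free tokens are hypothetical. The actual label names should be read from the checkpoint's `config.json`.

```python
from typing import List, Tuple


def merge_label_runs(labels: List[str], outside: str = "O") -> List[Tuple[int, int, str]]:
    """Merge consecutive tokens that share a label into (start, end, label) ranges.

    `end` is exclusive. Tokens carrying the `outside` label are skipped.
    The name of the "no error" label is an assumption, not taken from the record.
    """
    spans = []
    start = None
    for i, label in enumerate(labels):
        # Close the open run when the label changes.
        if start is not None and label != labels[start]:
            spans.append((start, i, labels[start]))
            start = None
        # Open a new run on any label other than the outside label.
        if start is None and label != outside:
            start = i
    if start is not None:
        spans.append((start, len(labels), labels[start]))
    return spans


def predict_labels(model_dir: str, text: str) -> List[Tuple[str, str]]:
    """Return (subword piece, predicted label) pairs for `text`.

    Assumes `model_dir` is the extracted checkpoint directory and that
    `torch` and `transformers` are installed.
    """
    # Imported lazily so merge_label_runs works without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForTokenClassification.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    label_ids = logits.argmax(dim=-1)[0].tolist()
    pieces = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return [(piece, model.config.id2label[i]) for piece, i in zip(pieces, label_ids)]
```

A call such as `predict_labels("icebert-xlmr-ic3-iec", "...")` would return one pair per subword piece, including special tokens. Passing the label column to `merge_label_runs` then gives contiguous ranges, which is closer to the "token range" wording of the description.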
dc.language.iso isl
dc.publisher Miðeind ehf
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.subject ged
dc.subject grammatical error detection
dc.title Error Classifier (Icelandic Error Corpus categories) for Tokens (22.05)
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
contact.person Svanhvít Lilja Ingólfsdóttir, svanhvit@mideind.is, Miðeind ehf
sponsor Ministry of Education, Science and Culture; Spell and grammar checking with neural networks (L14); Language Technology for Icelandic 2019-2023; nationalFunds
files.size 853250188
files.count 1


 Files in this item

This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name icebert-xlmr-ic3-iec.tar.gz
Size 813.72 MB
Format application/gzip
Description transformers model checkpoint
MD5 626e94e7d34f65f67b6dd44160c71bd0
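
The MD5 value listed for the archive can be used to check a download. The following is a small sketch using only the Python standard library. The function names and the local file path are hypothetical; the expected digest is the one given in this record.

```python
import hashlib

# MD5 digest listed in this record for icebert-xlmr-ic3-iec.tar.gz.
EXPECTED_MD5 = "626e94e7d34f65f67b6dd44160c71bd0"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the ~800 MB archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify_download("icebert-xlmr-ic3-iec.tar.gz")` should return `True` for an intact copy of the file. MD5 detects accidental corruption only and is not a safeguard against deliberate tampering.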
 File Preview  
  • icebert-xlmr-ic3-iec
    • config.json (1 kB)
    • training_args.bin (2 kB)
    • unigram.json (14 MB)
    • special_tokens_map.json (239 B)
    • tokenizer_config.json (398 B)
    • tokenizer.json (8 MB)
    • pytorch_model.bin (1 GB)
    • trainer_state.json (6 kB)
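
The file listing above can be compared with a downloaded archive before extracting it. The following is a rough sketch using the standard `tarfile` module. The function name is hypothetical, and the check looks only at file names, not at sizes or contents.

```python
import tarfile

# File names taken from the preview listing in this record.
EXPECTED_FILES = {
    "config.json",
    "training_args.bin",
    "unigram.json",
    "special_tokens_map.json",
    "tokenizer_config.json",
    "tokenizer.json",
    "pytorch_model.bin",
    "trainer_state.json",
}


def missing_checkpoint_files(archive_path: str) -> set:
    """Return the expected file names that are absent from the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Keep only the base name so the top-level directory does not matter.
        present = {
            member.name.rsplit("/", 1)[-1]
            for member in archive.getmembers()
            if member.isfile()
        }
    return EXPECTED_FILES - present
```

An empty result from `missing_checkpoint_files("icebert-xlmr-ic3-iec.tar.gz")` would suggest the archive contains every file named in the listing.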
