
 
dc.contributor.author Ingólfsdóttir, Svanhvít Lilja
dc.contributor.author Ragnarsson, Pétur Orri
dc.contributor.author Snæbjarnarson, Vésteinn
dc.date.accessioned 2022-01-26T10:18:35Z
dc.date.available 2022-01-26T10:18:35Z
dc.date.issued 2022-01-25
dc.identifier.uri http://hdl.handle.net/20.500.12537/183
dc.description The Icelandic Error Corpus (IEC) was used to fine-tune the Icelandic language model IceBERT for sentence classification. The objective was to train a grammatical error detection model that classifies whether a sentence contains particular error types. The model can tag a sentence with one or more of the following categories: coherence, grammar, orthography, other, style, and vocabulary. The overall F1 score is a modest 64%.
dc.language.iso isl
dc.publisher Miðeind ehf
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.rights.label PUB
dc.subject iec
dc.subject ged
dc.subject grammatical error detection
dc.subject icelandic error corpus
dc.title Multilabel Error Classifier (Icelandic Error Corpus categories) for Sentences (22.01)
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
contact.person Vésteinn Snæbjarnarson vesteinn@mideind.is Miðeind ehf
sponsor Ministry of Education, Science and Culture; Spell and grammar checking with neural networks (L14); Language Technology for Icelandic 2019-2023; nationalFunds
files.size 459882463
files.count 1


 Files in this item

This item is publicly available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name IceBERT-ged-sentence-supercats-multilabel.zip
Size 438.58 MB
Format application/zip
Description Unknown
MD5 d3dabf9a8285862fa3fd295f5f551996
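The size and MD5 listed for the archive can be used to check a download before unpacking it. The following is a minimal sketch, assuming the zip has been saved locally under its listed name and that `files.size` is given in bytes; the function names `md5_of_file` and `verify_download` are illustrative and not part of the distributed package.

```python
import hashlib
from pathlib import Path

# Values copied from this record. Treating files.size as a byte count is an assumption.
EXPECTED_MD5 = "d3dabf9a8285862fa3fd295f5f551996"
EXPECTED_SIZE = 459882463


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Check the file size first (cheap), then compare the MD5 digest."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

A call such as `verify_download("IceBERT-ged-sentence-supercats-multilabel.zip")` returns `True` only when both the size and the digest match the values in this record. MD5 is adequate for detecting a corrupted transfer, but it is not a safeguard against deliberate tampering.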
 File Preview  
  • IceBERT-ged-sentence-supercats-multilabel
    • README.md (378 B)
    • pytorch_model.bin (474 MB)
    • tokenizer_config.json (1 kB)
    • merges.txt (581 kB)
    • vocab.json (912 kB)
    • config.json (1 kB)
    • run_inference.py (679 B)
    • tokenizer.json (1 MB)
    • special_tokens_map.json (772 B)
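
The description says the model can tag a sentence with one or more of six categories. In a multilabel setup each category is typically scored independently, so a sentence may receive several labels or none. The sketch below illustrates that decoding step only. It assumes one raw score (logit) per category, a sigmoid activation, and a 0.5 threshold. None of these details, nor the label order, is confirmed by this record. The bundled `run_inference.py` and `config.json` are the authoritative sources.

```python
import math

# Label names come from the record's description. Their ORDER here is an assumption;
# check the packaged config.json for the real index-to-label mapping.
LABELS = ["coherence", "grammar", "orthography", "other", "style", "vocabulary"]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(logits, labels=LABELS, threshold=0.5):
    """Return every label whose independent probability reaches the threshold.

    The 0.5 threshold is an assumed default, not a value taken from the package.
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    return [label for label, logit in zip(labels, logits) if sigmoid(logit) >= threshold]
```

With made-up scores, `decode_multilabel([2.0, -1.0, 0.3, -3.0, -0.2, 1.1])` yields `["coherence", "orthography", "vocabulary"]`, while a sentence whose scores are all negative gets an empty list. This differs from single-label classification, where a softmax would force exactly one category per sentence.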
