dc.contributor.author | Arnardóttir, Þórunn |
dc.contributor.author | Ingason, Anton Karl |
dc.date.accessioned | 2020-09-14T11:19:04Z |
dc.date.available | 2020-09-14T11:19:04Z |
dc.date.issued | 2020-09-01 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/50 |
dc.description | A list of automatically generated word forms containing systematic errors, along with their corrections. All erroneous forms are nonwords, i.e. they do not exist in Icelandic. Each form is created by making a substitution commonly made by Icelandic speakers, in which a letter is replaced by another that sounds the same, or an accent is dropped or added. The list covers the following systematic errors: (1) words in which a letter lacks an accent; (2) words in which an accent is added to a letter; (3) words which should contain ‘y’ but instead contain ‘i’; (4) words which should contain ‘i’ but instead contain ‘y’; (5) words which should contain ‘ý’ but instead contain ‘í’; (6) words which should contain ‘í’ but instead contain ‘ý’; (7) words which should contain ‘hv’ but instead contain ‘kv’; (8) words which should contain ‘kv’ but instead contain ‘hv’. The list was prepared using a word list from the DMII (the Database of Modern Icelandic Inflection), which contains Icelandic words and their inflections. The list of systematic errors consists of two tab-separated columns: the first shows the erroneous word form and the second its correction. |
dc.description | A list of word forms containing systematic errors, along with their corrections. All of the word forms were created automatically and are not accepted Icelandic words. Each form is created by replacing a letter with another, an error that is common among Icelandic speakers: the letter is replaced by one that sounds the same, or by its accented or unaccented counterpart. The list consists of the following systematic errors: (1) words containing a letter that lacks an accent; (2) words containing a letter with an extra accent; (3) words that should contain 'y' but contain 'i' instead; (4) words that should contain 'i' but contain 'y' instead; (5) words that should contain 'ý' but contain 'í' instead; (6) words that should contain 'í' but contain 'ý' instead; (7) words that should contain 'hv' but contain 'kv' instead; (8) words that should contain 'kv' but contain 'hv' instead. The list was made with the help of a word list from BÍN (Beygingarlýsing íslensks nútímamáls), which contains Icelandic words and their inflections. The list of systematic errors consists of two columns, the first showing the erroneous word form and the second the correct one. |
dc.language.iso | isl |
dc.publisher | University of Iceland |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.subject | word list |
dc.subject | errors |
dc.subject | spelling errors |
dc.subject | systematic nonwords |
dc.title | nonwords |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Þórunn Arnardóttir; thar@hi.is; University of Iceland |
sponsor | Ministry of Education, Science and Culture; Word lists and language models (L4); Language Technology for Icelandic 2019-2023; nationalFunds |
size.info | 16155163 words |
files.size | 79209321 |
files.count | 1 |
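
The description above lists eight substitution types. As a rough illustration only, they could be applied to a correct word form as sketched below. This is not the authors' script: the function name `systematic_errors`, the accent table, the choice to substitute one occurrence at a time, and the lowercase-only handling are all assumptions, and `lexicon` stands in for a set of real word forms (for example, drawn from the DMII) used to discard variants that happen to be existing words.

```python
# Hypothetical accent table: (accented vowel, unaccented vowel).
ACCENT_PAIRS = [("á", "a"), ("é", "e"), ("í", "i"),
                ("ó", "o"), ("ú", "u"), ("ý", "y")]

# Each entry is (substring in the correct word, erroneous replacement).
SUBSTITUTIONS = (
    ACCENT_PAIRS                                      # accent missing
    + [(plain, acc) for acc, plain in ACCENT_PAIRS]   # accent added
    + [("y", "i"), ("i", "y"),
       ("ý", "í"), ("í", "ý"),
       ("hv", "kv"), ("kv", "hv")]
)


def systematic_errors(word, lexicon):
    """Return nonword variants of `word`, each with exactly one substitution.

    Assumption: one occurrence is replaced per variant; variants that
    are themselves in `lexicon` are dropped so only nonwords remain.
    """
    variants = set()
    for correct, wrong in SUBSTITUTIONS:
        start = word.find(correct)
        while start != -1:
            candidate = word[:start] + wrong + word[start + len(correct):]
            if candidate not in lexicon:
                variants.add(candidate)
            start = word.find(correct, start + 1)
    return variants
```

Each returned variant would be paired with the original `word` to form one row of the list (erroneous form first, correction second).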
Files in this item
Name | systematic_nonwords.zip |
Size | 75.54 MB |
Format | application/zip |
Description | A zip file containing the word list |
MD5 | 9dafd78c36a8fff0666f47e5310dd21c |
- systematic_nonwords
  - cc-by-4-0.txt
  - systematic_nonwords.tsv
- __MACOSX
  - systematic_nonwords
    - ._systematic_nonwords.tsv
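
A minimal sketch of how the download might be checked and read, assuming the archive has been saved locally and `systematic_nonwords.tsv` has been extracted from it. The function names `md5_matches` and `iter_pairs` are hypothetical, and UTF-8 encoding and the absence of a header row are assumptions that the record does not confirm. The expected checksum is the MD5 value listed for the zip file above.

```python
import csv
import hashlib

# MD5 listed in this record for systematic_nonwords.zip.
EXPECTED_MD5 = "9dafd78c36a8fff0666f47e5310dd21c"


def md5_matches(path, expected=EXPECTED_MD5, chunk_size=1 << 20):
    """Return True if the file at `path` has the expected MD5 digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a large archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()


def iter_pairs(path):
    """Yield (erroneous_form, correction) tuples from the two-column TSV.

    Assumptions: the file is UTF-8 encoded and has no header row.
    Rows with fewer than two columns (e.g. blank lines) are skipped.
    """
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) >= 2:
                yield row[0], row[1]
```

Because the record reports about 16 million entries, `iter_pairs` is written as a generator so that rows can be streamed rather than loaded into memory all at once.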