dc.contributor.author | Arnardóttir, Þórunn |
dc.contributor.author | Ingason, Anton Karl |
dc.date.accessioned | 2020-09-28T13:09:47Z |
dc.date.available | 2020-09-28T13:09:47Z |
dc.date.issued | 2020-09-28 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/63 |
dc.description | The Icelandic Error Corpus Nonwords is a list of Icelandic nonwords and their corrections, drawn from the Icelandic Error Corpus (IceErrorCorpus). The word forms were manually reviewed and annotated with the error category 'nonword', and proofreaders supplied their corrections. Each line of the corpus holds a word form annotated as 'nonword' and its correction in two tab-separated columns. If a word form has two or more corrections, each correction appears on its own line together with the nonword. |
dc.language.iso | isl |
dc.publisher | University of Iceland |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.subject | nonwords |
dc.subject | error corpus |
dc.title | Icelandic Error Corpus Nonwords 20.09 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Þórunn Arnardóttir; thar@hi.is; University of Iceland |
sponsor | Ministry of Education, Science and Culture; Word lists and language models (L4); Language Technology for Icelandic 2019-2023; nationalFunds |
size.info | 1323 words |
files.size | 33278 |
files.count | 1 |
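The layout given in dc.description (one nonword and one correction per line, separated by a tab, with repeated lines for multiple corrections) can be read into a mapping from each nonword to its corrections. The following is a minimal sketch, not part of the record: the function name `load_nonwords` is hypothetical, and the code assumes the file is UTF-8 encoded and has no header row, neither of which the record states.

```python
import csv
from collections import defaultdict


def load_nonwords(path):
    """Map each nonword to the list of corrections given for it.

    Assumptions (not confirmed by the record): UTF-8 encoding, no header
    row, nonword in column 1 and correction in column 2.
    """
    corrections = defaultdict(list)
    with open(path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps any quote characters in the word forms literal.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            nonword, correction = row[0], row[1]
            # A nonword with several corrections appears on several lines,
            # so the corrections are collected in order of appearance.
            corrections[nonword].append(correction)
    return dict(corrections)
```

Calling `load_nonwords("IEC_nonwords.tsv")` on a downloaded copy would then return a dictionary whose values are lists, with more than one entry wherever a nonword has several corrections.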
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
- Name: IEC_nonwords.tsv
- Size: 32.5 KB
- Format: Unknown
- Description: A tab-separated file containing a 'nonword' word form and its correction
- MD5: d66bf494e06d6084ba2c81df9cd460e3