| dc.contributor.author |
Loftsson, Hrafn |
| dc.contributor.author |
Yngvason, Jökull H. |
| dc.contributor.author |
Helgadóttir, Sigrún |
| dc.contributor.author |
Rögnvaldsson, Eiríkur |
| dc.contributor.author |
Barkarson, Starkaður |
| dc.contributor.author |
Valbjörnsdóttir, Steinunn |
| dc.contributor.author |
Sigurðsson, Kristján Friðbjörn |
| dc.contributor.author |
Stefánsdóttir, Brynhildur |
| dc.contributor.author |
Daðason, Jón Friðrik |
| dc.date.accessioned |
2020-06-15T11:57:55Z |
| dc.date.available |
2020-06-15T11:57:55Z |
| dc.date.issued |
2018-09-06 |
| dc.identifier.uri |
http://hdl.handle.net/20.500.12537/45 |
| dc.description |
This package contains 10 pairs of training and test sets built from version 1.0 of MIM-GOLD (Icelandic: Gullstaðallinn; http://hdl.handle.net/20.500.12537/44). Each training set contains about 90% of each of the 13 files of MIM-GOLD; the remaining 10% of each file is in the corresponding test set. The test sets therefore do not overlap with one another, while any two training sets share about 80% of the corpus text. |
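The split described above has the shape of a per-file 10-fold partition. The following is a minimal sketch of that idea, not the procedure actually used to build the package: the function name `make_folds`, the round-robin assignment rule, and the toy file names and sentences are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def make_folds(
    files: Dict[str, Sequence[str]], k: int = 10
) -> List[Tuple[List[str], List[str]]]:
    """Return k (train, test) pairs; each test set takes about 1/k of every file."""
    pairs = []
    for fold in range(k):
        train: List[str] = []
        test: List[str] = []
        for name in sorted(files):
            for i, sentence in enumerate(files[name]):
                # Hypothetical assignment rule: every k-th sentence of a file
                # goes to the test set of this fold, the rest to training.
                if i % k == fold:
                    test.append(sentence)
                else:
                    train.append(sentence)
        pairs.append((train, test))
    return pairs


# Toy stand-in for the 13 source files (names and contents are invented).
toy_files = {
    f"file{n:02d}": [f"file{n:02d}-s{i}" for i in range(100)]
    for n in range(1, 14)
}
pairs = make_folds(toy_files)
```

Under this sketch each test set draws from all 13 files, no sentence appears in two test sets, and any two training sets each leave out a different tenth of the corpus, so they share about 80% of it.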
| dc.language.iso |
isl |
| dc.publisher |
The Árni Magnússon Institute for Icelandic Studies |
| dc.relation.isreferencedby |
http://www.ru.is/~hrafn/papers/corpusTagging.final.pdf |
| dc.relation.isreplacedby |
http://hdl.handle.net/20.500.12537/40 |
| dc.rights |
Icelandic Mim Gold Standard for PoS Tagging |
| dc.rights.uri |
https://repository.clarin.is/repository/xmlui/page/license-mim-gold |
| dc.rights.label |
PUB |
| dc.source.uri |
http://www.malfong.is/index.php?lang=en&pg=gull |
| dc.subject |
text corpus |
| dc.subject |
gold standard |
| dc.subject |
pos-tagging |
| dc.subject |
morphosyntactic tagging |
| dc.subject |
lemmatization |
| dc.subject |
training sets |
| dc.subject |
test sets |
| dc.title |
MIM-GOLD 1.0 - training and testing sets |
| dc.type |
corpus |
| metashare.ResourceInfo#ContentInfo.mediaType |
text |
| has.files |
yes |
| branding |
Clarin IS Repository |
| contact.person |
Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies |
| sponsor |
The Icelandic Student Innovation Fund 903171091 Mörkun og leiðrétting nýrrar málheildar nationalFunds |
| sponsor |
Icelandic Research Fund (RANNÍS) 090662011 Viable Language Technology beyond English – Icelandic as a test case nationalFunds |
| sponsor |
The Icelandic Student Innovation Fund 104540000 Íslensk staðalmálheild nationalFunds |
| size.info |
1005688 tokens |
| size.info |
901198 words |
| size.info |
58765 sentences |
| size.info |
20 files |
| files.size |
31585532 |
| files.count |
1 |