dc.contributor.author Loftsson, Hrafn
dc.contributor.author Yngvason, Jökull H.
dc.contributor.author Helgadóttir, Sigrún
dc.contributor.author Rögnvaldsson, Eiríkur
dc.date.accessioned 2020-06-15T10:04:31Z
dc.date.available 2020-06-15T10:04:31Z
dc.date.issued 2013
dc.identifier.uri http://hdl.handle.net/20.500.12537/43
dc.description Version 0.9 of the MIM-GOLD corpus consists of 13 files of tagged Icelandic text sampled from 13 text domains of the 25-million-word Tagged Icelandic Corpus (MIM, http://malfong.is/?pg=mim). The texts were cleaned extensively and then tagged automatically. The tags were then corrected by semi-manual and manual methods. The corpus is intended to serve as a gold standard for training data-driven taggers for Icelandic.
dc.language.iso isl
dc.publisher The Árni Magnússon Institute for Icelandic Studies
dc.relation.isreferencedby http://www.ru.is/~hrafn/papers/corpusTagging.final.pdf
dc.relation.isreplacedby http://hdl.handle.net/20.500.12537/44
dc.rights Icelandic Mim Gold Standard for PoS Tagging
dc.rights.uri https://repository.clarin.is/repository/xmlui/page/license-mim-gold
dc.rights.label PUB
dc.source.uri http://www.malfong.is/index.php?lang=en&pg=gull
dc.subject text corpus
dc.subject gold standard
dc.subject pos-tagging
dc.subject morphosyntactic tagging
dc.subject lemmatization
dc.title MIM-GOLD 0.9
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
contact.person Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies
sponsor RANNÍS 090662011 Viable Language Technology beyond English – Icelandic as a test case nationalFunds
sponsor The Icelandic Student Innovation Fund 903171091 Mörkun og leiðrétting nýrrar málheildar nationalFunds
size.info 972351 tokens
size.info 870632 words
size.info 57596 sentences
size.info 13 files
files.size 3661811
files.count 1


 Files in this item

This item is Publicly Available and licensed under: Icelandic Mim Gold Standard for PoS Tagging
Name MIM-GOLD-0_9.zip
Size 3.49 MB
Format application/zip
Description MIM-GOLD-0_9
MD5 8af617493d0bcd7c3380de767097f7ba
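
The MD5 value listed for the archive can be used to check that a download is intact. The following is a minimal sketch, not part of the repository record: the function names and the local file path are assumptions chosen for illustration, and only the expected checksum is taken from this record.

```python
import hashlib

# Checksum copied from this record's MD5 field.
EXPECTED_MD5 = "8af617493d0bcd7c3380de767097f7ba"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


# Hypothetical usage, assuming the archive was saved in the current directory:
# print(matches_expected("MIM-GOLD-0_9.zip"))
```

MD5 is adequate for detecting accidental corruption during transfer, which is the purpose it appears to serve here; it is not a defence against deliberate tampering.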
 File Preview  
  • MIM-GOLD-0_9
    • skrar (305 B)
    • laws.txt (471 kB)
    • radio_tv_news.txt (129 kB)
    • mbl.txt (2 MB)
    • websites.txt (370 kB)
    • fbl.txt (1 MB)
    • school_essays.txt (389 kB)
    • blog.txt (1 MB)
    • webmedia.txt (99 kB)
    • scienceweb.txt (1022 kB)
    • written-to-be-spoken.txt (212 kB)
    • emails.txt (60 kB)
    • books.txt (2 MB)
    • adjucations.txt (140 kB)
    • userlicense_mim_gold_download_en.pdf (526 kB)
    • userlicense_mim_gold_download_is.pdf (101 kB)
    • readme_mim-gold-0_9 (3 kB)
