dc.contributor.author Helgadóttir, Sigrún
dc.date.accessioned 2020-06-11T13:50:45Z
dc.date.available 2020-06-11T13:50:45Z
dc.date.issued 2018-10-24
dc.identifier.uri http://hdl.handle.net/20.500.12537/37
dc.description Training and test sets for POS tagging, derived from IFD 2018.10 (Icelandic Frequency Dictionary), which contains fragments of 100 texts published between 1980 and 1989. The training/test pairs were created by dividing the 100 texts that constitute the corpus into ten roughly equal parts; the Icelandic-language version of this description specifies that each file was split into ten roughly equal parts. Each of the ten parts forms one test set, and the corresponding training set contains the other nine parts.
dc.language.iso isl
dc.publisher The Árni Magnússon Institute for Icelandic Studies
dc.relation.replaces http://hdl.handle.net/20.500.12537/34
dc.relation.isreplacedby http://hdl.handle.net/20.500.12537/38
dc.rights Icelandic Frequency Dictionary
dc.rights.uri https://repository.clarin.is/repository/xmlui/page/license-frequency-dictionary
dc.rights.label PUB
dc.source.uri http://www.malfong.is/index.php?lang=en&pg=ordtidnibok
dc.subject test sets
dc.subject training sets
dc.subject lemmatized
dc.subject pos-tagged
dc.title Icelandic Frequency Dictionary 2018.10 - training/testing sets
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
contact.person Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies
size.info 20 files
size.info 590299 tokens
size.info 519180 words
size.info 36912 sentences
files.size 17877421
files.count 1


 Files in this item

This item is Publicly Available and licensed under: Icelandic Frequency Dictionary
Name IFD2_SETS.zip
Size 17.05 MB
Format application/zip
Description IFD2_SETS
MD5 f02f24da7204b54ad67352d4351333db
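
The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch using only the Python standard library; the local path `IFD2_SETS.zip` and the helper names `md5_of_file` and `archive_matches` are illustrative assumptions, not part of the repository.

```python
import hashlib

# MD5 checksum as listed in this record for IFD2_SETS.zip.
EXPECTED_MD5 = "f02f24da7204b54ad67352d4351333db"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that large archives do not have to
    fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected`."""
    return md5_of_file(path) == expected.lower()
```

Calling `archive_matches("IFD2_SETS.zip")` on a local download (hypothetical path) returns `True` only when the digest agrees with the listed value. MD5 is adequate here for detecting accidental corruption; it is not a defence against deliberate tampering.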
 File Preview  
  • IFD2_SETS
    • 07TM.txt (5 MB)
    • 06TM.txt (5 MB)
    • 05TM.txt (5 MB)
    • 05PM.txt (620 kB)
    • 04PM.txt (621 kB)
    • 03PM.txt (620 kB)
    • 02TM.txt (5 MB)
    • 01TM.txt (5 MB)
    • 09TM.txt (5 MB)
    • 08TM.txt (5 MB)
    • 10TM.txt (5 MB)
    • 07PM.txt (621 kB)
    • 06PM.txt (622 kB)
    • README.txt (149 B)
    • 04TM.txt (5 MB)
    • 03TM.txt (5 MB)
    • 02PM.txt (619 kB)
    • 01PM.txt (630 kB)
    • 09PM.txt (621 kB)
    • 08PM.txt (622 kB)
    • 10PM.txt (613 kB)
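
The listing above can be grouped into the ten training/test pairs the description mentions. The sketch below assumes that `NNTM.txt` is the training set and `NNPM.txt` the test set for fold `NN`; the record does not state this, and the assumption rests only on the file sizes (about 5 MB for nine parts versus about 620 kB for one part). The helper name `pair_folds` is hypothetical.

```python
import re

# File names as shown in the archive preview of this record.
LISTED_FILES = [
    "07TM.txt", "06TM.txt", "05TM.txt", "05PM.txt", "04PM.txt", "03PM.txt",
    "02TM.txt", "01TM.txt", "09TM.txt", "08TM.txt", "10TM.txt", "07PM.txt",
    "06PM.txt", "README.txt", "04TM.txt", "03TM.txt", "02PM.txt", "01PM.txt",
    "09PM.txt", "08PM.txt", "10PM.txt",
]

# Two-digit fold number followed by TM or PM.
# Assumption: TM marks a training file and PM a test file.
FOLD_FILE = re.compile(r"^(\d{2})(TM|PM)\.txt$")


def pair_folds(filenames):
    """Return a sorted list of (fold, training_file, test_file) tuples."""
    folds = {}
    for name in filenames:
        match = FOLD_FILE.match(name)
        if match is None:
            continue  # skip files that are not fold data, e.g. README.txt
        fold, kind = match.groups()
        role = "train" if kind == "TM" else "test"
        folds.setdefault(fold, {})[role] = name

    incomplete = sorted(f for f, pair in folds.items() if len(pair) != 2)
    if incomplete:
        raise ValueError(f"folds missing a training or test file: {incomplete}")

    return [(f, folds[f]["train"], folds[f]["test"]) for f in sorted(folds)]


for fold, train_file, test_file in pair_folds(LISTED_FILES):
    print(fold, train_file, test_file)
```

Iterating over the returned pairs is the natural loop for ten-fold cross-validation: train a tagger on each training file and evaluate it on the matching test file.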
