| dc.contributor.author | 
Helgadóttir, Sigrún | 
| dc.contributor.author | 
Barkarson, Starkaður | 
| dc.contributor.author | 
Hafsteinsdóttir, Hildur | 
| dc.contributor.author | 
Andrésdóttir, Þórdís Dröfn | 
| dc.date.accessioned | 
2020-06-11T13:56:48Z | 
| dc.date.available | 
2020-06-11T13:56:48Z | 
| dc.date.issued | 
2020-05-31 | 
| dc.identifier.uri | 
http://hdl.handle.net/20.500.12537/38 | 
| dc.description | 
Training and test sets for PoS tagging, derived from IFD 2020.05 (the Icelandic Frequency Dictionary), which contains fragments of 100 texts published between 1980 and 1989.
The training/test pairs were created by dividing the 100 texts that constitute the corpus into ten roughly equal parts. Each of the ten parts serves as one test set, and the corresponding training set contains the other nine parts.
The PoS tags were mapped to the MIM-GOLD 2.0 tagset (see the discussion at http://hdl.handle.net/20.500.12537/26).
----------------
Training and test sets for PoS tagging, built from the Icelandic Frequency Dictionary (Orðtíðnibók, 2020.05), which contains excerpts from 100 texts published between 1980 and 1989.
The pairs were created by splitting each file into ten roughly equal parts. Each of the ten parts forms one test set, and the corresponding training set holds the other nine parts in each case.
The tags were mapped to a new tagset, MIM-GOLD 2.0 (see the discussion at http://hdl.handle.net/20.500.12537/26). | 
| dc.language.iso | 
isl | 
| dc.publisher | 
The Árni Magnússon Institute for Icelandic Studies | 
| dc.rights | 
Icelandic Frequency Dictionary | 
| dc.rights.uri | 
https://repository.clarin.is/repository/xmlui/page/license-frequency-dictionary | 
| dc.rights.label | 
PUB | 
| dc.source.uri | 
http://www.malfong.is/index.php?lang=is&pg=ordtidnibok | 
| dc.subject | 
test sets | 
| dc.subject | 
training sets | 
| dc.subject | 
lemmatized | 
| dc.subject | 
pos-tagged | 
| dc.title | 
Icelandic Frequency Dictionary 2020.05 - training/testing sets | 
| dc.type | 
corpus | 
| metashare.ResourceInfo#ContentInfo.mediaType | 
text | 
| has.files | 
yes | 
| branding | 
Clarin IS Repository | 
| contact.person | 
Steinþór Steingrímsson, steinthor.steingrimsson@arnastofnun.is, The Árni Magnússon Institute for Icelandic Studies | 
| sponsor | 
Ministry of Education, Science and Culture; A Gold Standard for PoS Tagging (G10); Language Technology for Icelandic 2019-2023; nationalFunds | 
| size.info | 
589771 tokens | 
| size.info | 
518652 words | 
| size.info | 
37181 sentences | 
| files.size | 
17820840 | 
| files.count | 
1 | 
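
The description in this record says the corpus was divided into ten roughly equal parts, with each part serving once as a test set while the other nine form the matching training set. A minimal Python sketch of that construction is given here. The function names, the use of a generic list of units (for example sentences), and the contiguous split are assumptions made for illustration; the record does not specify the exact procedure used to build IFD3_SETS.

```python
from typing import List, Sequence, Tuple


def split_into_parts(units: Sequence[str], n_parts: int = 10) -> List[List[str]]:
    """Split `units` into `n_parts` contiguous parts whose sizes differ by at most one."""
    base, extra = divmod(len(units), n_parts)
    parts: List[List[str]] = []
    start = 0
    for i in range(n_parts):
        # The first `extra` parts each take one additional unit.
        size = base + (1 if i < extra else 0)
        parts.append(list(units[start:start + size]))
        start += size
    return parts


def make_train_test_pairs(parts: List[List[str]]) -> List[Tuple[List[str], List[str]]]:
    """Return one (train, test) pair per part: the part is the test set,
    and the concatenation of all other parts is the training set."""
    pairs = []
    for i, test in enumerate(parts):
        train = [unit for j, part in enumerate(parts) if j != i for unit in part]
        pairs.append((train, test))
    return pairs
```

With ten parts this yields ten pairs in which every unit appears in exactly one test set and in the nine training sets that do not contain that test set.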
			 
		
 Files in this item
This item is Publicly Available and licensed under: Icelandic Frequency Dictionary
 
- Name: IFD3_SETS.zip
- Size: 17 MB
- Format: application/zip
- Description: IFD3_SETS
- MD5: 3b61859bbff1ac16d163f14624563d58
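
The MD5 value listed for IFD3_SETS.zip can be used to check that a download is intact. The following is a small sketch using Python's standard `hashlib` module; the helper name `md5_of_file` and the local path in the usage note are hypothetical, and only the expected digest comes from this record.

```python
import hashlib

# Digest listed in this record for IFD3_SETS.zip.
EXPECTED_MD5 = "3b61859bbff1ac16d163f14624563d58"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`,
    reading it in chunks so large archives need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Assuming the archive was saved as `IFD3_SETS.zip` in the working directory, comparing `md5_of_file("IFD3_SETS.zip")` with `EXPECTED_MD5` indicates whether the file matches the published checksum.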
 
 Preview of IFD3_SETS.zip contents:
- README.txt (508 B)
- 01PM.plain (635 kB)
- 01TM.plain (5 MB)
- 02PM.plain (625 kB)
- 02TM.plain (5 MB)
- 03PM.plain (625 kB)
- 03TM.plain (5 MB)
- 04PM.plain (626 kB)
- 04TM.plain (5 MB)
- 05PM.plain (626 kB)
- 05TM.plain (5 MB)
- 06PM.plain (627 kB)
- 06TM.plain (5 MB)
- 07PM.plain (627 kB)
- 07TM.plain (5 MB)
- 08PM.plain (627 kB)
- 08TM.plain (5 MB)
- 09PM.plain (626 kB)
- 09TM.plain (5 MB)
- 10PM.plain (619 kB)
- 10TM.plain (5 MB)
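
The archive holds two files per fold number, one of about 5 MB and one of about 0.6 MB. The sketch below groups the file names by their two-digit prefix. Reading `TM` as the training set and `PM` as the test set is an assumption based only on the relative file sizes (roughly nine to one); the record itself does not define the suffixes, and the function name `pair_by_fold` is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Two digits, then TM or PM, then the ".plain" extension.
_FOLD_FILE = re.compile(r"^(\d{2})(TM|PM)\.plain$")


def pair_by_fold(filenames: Iterable[str]) -> Dict[str, Tuple[str, str]]:
    """Map each fold number to (TM file, PM file), assumed here to be
    (training set, test set). Folds missing either file are skipped."""
    folds: Dict[str, Dict[str, str]] = defaultdict(dict)
    for name in filenames:
        match = _FOLD_FILE.match(name)
        if match:  # ignores README.txt and any other non-matching name
            folds[match.group(1)][match.group(2)] = name
    return {
        fold: (files["TM"], files["PM"])
        for fold, files in sorted(folds.items())
        if "TM" in files and "PM" in files
    }
```

For the file list above this would produce ten entries, keyed `"01"` through `"10"`.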
 
 
 
 
 
 
 