dc.contributor.author | Ingason, Anton Karl |
dc.contributor.author | Rögnvaldsson, Eiríkur |
dc.contributor.author | Sigurðsson, Einar Freyr |
dc.contributor.author | Wallenberg, Joel C. |
dc.date.accessioned | 2020-12-04T14:06:16Z |
dc.date.available | 2020-12-04T14:06:16Z |
dc.date.issued | 2012-08-03 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/92 |
dc.description | The Faroese Parsed Historical Corpus (FarPaHC) is a manually corrected treebank, parsed according to the annotation guidelines of the Penn Parsed Corpora of Historical English (PPCHE) and the Icelandic Parsed Historical Corpus (IcePaHC), with minor modifications that are specific to Faroese. It consists of 53,000 words in three texts from the 19th and 20th centuries, all of them religious texts from the Bible. The file format is labeled bracketing, as in the Penn Treebank, with UTF-8 encoding. The corpus is released under a CC BY 4.0 license. It was originally released in 2012 and was uploaded to the CLARIN-IS repository in 2020. |
dc.description | The Faroese Parsed Historical Corpus (FarPaHC) is a manually corrected treebank annotated according to the parsing scheme of the Penn Parsed Corpora of Historical English (PPCHE) and the Icelandic Parsed Historical Corpus (IcePaHC), with some modifications to accommodate Faroese grammar. The treebank contains 53,000 words in three texts from the 19th and 20th centuries, all of them religious Bible texts. The file format is labeled bracketing, as in the Penn Treebank, and the text is in the UTF-8 character set. The corpus is distributed under a CC BY 4.0 license. It was originally published in 2012 and was deposited in the CLARIN-IS repository in 2020. |
dc.language.iso | fao |
dc.publisher | University of Iceland |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://github.com/einarfs/farpahc |
dc.subject | treebank |
dc.subject | parsing |
dc.subject | manual parsing |
dc.subject | phrase structure grammar |
dc.subject | parsed corpus |
dc.subject | historical corpus |
dc.subject | icepahc |
dc.title | Faroese Parsed Historical Corpus (FarPaHC) 0.1 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Einar Freyr Sigurðsson, einarfs@gmail.com, The Árni Magnússon Institute for Icelandic Studies |
size.info | 53000 words |
files.size | 519318 |
files.count | 1 |
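The description above says the files use labeled bracketing as in the Penn Treebank. A minimal Python sketch of reading such bracketing into nested lists is given here. The sample string, its tags, and the function names are invented for illustration and are not taken from the corpus; real FarPaHC trees may carry extra material (lemmas, ID nodes) that this sketch does not treat specially.

```python
from typing import List, Union

Tree = Union[str, List["Tree"]]


def tokenize(text: str) -> List[str]:
    # Pad parentheses with spaces so that split() isolates them as tokens.
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse_trees(text: str) -> List[Tree]:
    """Parse every top-level bracketed tree in `text` into nested lists."""
    stack: List[List[Tree]] = [[]]
    for tok in tokenize(text):
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            node = stack.pop()
            stack[-1].append(node)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def leaves(tree: Tree) -> List[str]:
    """Collect terminal words, i.e. the second item of each [TAG, word] node."""
    if isinstance(tree, str):
        return []
    if len(tree) == 2 and all(isinstance(item, str) for item in tree):
        return [tree[1]]
    words: List[str] = []
    for child in tree:
        words.extend(leaves(child))
    return words


# Hypothetical example tree, not quoted from FarPaHC.
SAMPLE = "( (IP-MAT (NP-SBJ (PRO hann)) (VBDI kom) (. .)))"

if __name__ == "__main__":
    for tree in parse_trees(SAMPLE):
        print(leaves(tree))
```

To apply this to a file from the archive, the text would be read with `encoding="utf-8"`, as stated in the description, and passed to `parse_trees`.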
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name | farpahc-v0.1.zip |
Size | 507.15 KB |
Format | application/zip |
Description | a zip file containing both the parsed and the original files |
MD5 | 5dc722bdf60c20b997df9035846ffabb |
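The MD5 value listed above can be used to check a downloaded copy of the archive. The following is a minimal sketch using Python's standard `hashlib`; the function names and the local path `farpahc-v0.1.zip` are assumptions for illustration, and the expected digest is the one given in this record.

```python
import hashlib

# Digest as listed in this record for farpahc-v0.1.zip.
EXPECTED_MD5 = "5dc722bdf60c20b997df9035846ffabb"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local filename; adjust to wherever the archive was saved.
    print(verify("farpahc-v0.1.zip"))
```

MD5 is suitable here only as a check against accidental corruption during download, not as protection against deliberate tampering.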
- farpahc-v0.1
  - tagged
    - 1823.ntmatt.rel-bib.tagged (1 B)
    - 1936.ntjohn.rel-bib.tagged (1 B)
    - 1928.ntacts.rel-bib.tagged (1 B)
  - info
    - 1928.ntacts.rel-bib.info (1 B)
    - 1936.ntjohn.rel-bib.info (1 B)
    - 1823.ntmatt.rel-bib.info (1 B)
  - README (1 B)
  - gpl.txt (1 B)
  - README~ (1 B)
  - txt
    - 1823.ntmatt.rel-bib.txt (1 B)
    - 1936.ntjohn.rel-bib.txt (1 B)
    - 1928.ntacts.rel-bib.txt (1 B)
  - .DS_Store (1 B)
  - cc-by-4.0.txt (1 B)
  - lgpl.txt (1 B)
  - psd
    - 1823.ntmatt.rel-bib.psd (1 B)
    - 1936.ntjohn.rel-bib.psd (1 B)
    - 1928.ntacts.rel-bib.psd (1 B)
- __MACOSX
  - farpahc-v0.1
    - ._gpl.txt (1 B)
    - tagged
      - ._1928.ntacts.rel-bib.tagged (1 B)
      - ._1823.ntmatt.rel-bib.tagged (1 B)
      - ._1936.ntjohn.rel-bib.tagged (1 B)
    - info
      - ._1928.ntacts.rel-bib.info (1 B)
      - ._1936.ntjohn.rel-bib.info (1 B)
      - ._1823.ntmatt.rel-bib.info (1 B)
    - ._.DS_Store (1 B)
    - ._README~ (1 B)
    - ._README (1 B)
    - txt
      - ._1928.ntacts.rel-bib.txt (1 B)
      - ._1823.ntmatt.rel-bib.txt (1 B)
      - ._1936.ntjohn.rel-bib.txt (1 B)
    - ._cc-by-4.0.txt (1 B)
    - ._lgpl.txt (1 B)
    - psd
      - ._1928.ntacts.rel-bib.psd (1 B)
      - ._1823.ntmatt.rel-bib.psd (1 B)
      - ._1936.ntjohn.rel-bib.psd (1 B)
  - ._farpahc-v0.1 (1 B)
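The listing above shows that the archive contains macOS metadata entries (`__MACOSX`, `._*`, `.DS_Store`) alongside the corpus files. A minimal sketch of selecting only the corpus files from the zip with Python's standard `zipfile` follows; the helper names are hypothetical, and the `.psd` suffix is taken from the listing.

```python
import zipfile
from pathlib import PurePosixPath


def is_macos_metadata(name):
    """True for __MACOSX entries, AppleDouble '._' files and .DS_Store."""
    parts = PurePosixPath(name).parts
    return any(
        part == "__MACOSX" or part == ".DS_Store" or part.startswith("._")
        for part in parts
    )


def corpus_members(zip_path, suffix=".psd"):
    """Return sorted member names ending in `suffix`, skipping macOS metadata."""
    with zipfile.ZipFile(zip_path) as zf:
        return sorted(
            info.filename
            for info in zf.infolist()
            if not info.is_dir()
            and not is_macos_metadata(info.filename)
            and info.filename.endswith(suffix)
        )


if __name__ == "__main__":
    # Assumed local filename; adjust to wherever the archive was saved.
    for name in corpus_members("farpahc-v0.1.zip"):
        print(name)
```

Passing `suffix=".txt"`, `".tagged"` or `".info"` would select the other file types shown in the listing in the same way.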