dc.contributor.author | Barkarson, Starkaður |
dc.contributor.author | Steingrímsson, Steinþór |
dc.date.accessioned | 2021-06-02T12:14:59Z |
dc.date.available | 2021-06-02T12:14:59Z |
dc.date.issued | 2021-05-31 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/111 |
dc.description | IGC-Parla is part of the IGC project (Íslenska risamálheildin, the Icelandic Gigaword Corpus), which aims to collect as much Icelandic text as possible that can be published under an open or restricted license. IGC-Parla contains parliamentary speeches that were delivered in Alþingi and published on the website www.althingi.is, encoded according to the Parla-CLARIN recommendations. The corpus comes in two formats: one contains the texts untokenized and untagged, while the other has been tokenized, POS-tagged and lemmatized. |
dc.language.iso | isl |
dc.publisher | The Árni Magnússon Institute for Icelandic Studies |
dc.relation.isreferencedby | https://www.aclweb.org/anthology/L18-1690.pdf |
dc.relation.isreplacedby | http://hdl.handle.net/20.500.12537/179 |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.source.uri | http://igc.arnastofnun.is |
dc.subject | corpora |
dc.subject | parliament speeches |
dc.subject | pos-tagged |
dc.subject | lemmatized |
dc.subject | tei |
dc.title | IGC-Parla-21.05 (The Icelandic Gigaword Corpus: Parliamentary speeches) |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
demo.uri | https://malheildir.arnastofnun.is |
contact.person | Steinþór Steingrímsson, steinthor.steingrimsson@arnastofnun.is, The Árni Magnússon Institute for Icelandic Studies |
sponsor | Ministry of Education, Science and Culture (Mennta- og menningarmálaráðuneytið); Language Technology for Icelandic 2019-2023; The Icelandic Gigaword Corpus (G1); nationalFunds |
size.info | 468297 utterances |
size.info | 234062485 tokens |
size.info | 212873555 words |
size.info | 10283804 sentences |
files.size | 3142104934 |
files.count | 1 |
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
- Name: IGC-Parla-21.05
- Size: 2.93 GB
- Format: Unknown
- Description: IGC-Parla-21.05
- MD5: 4890e9f0188ef72e36bcdac2efc0a9ac
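The size and MD5 values listed for the file can be used to check a downloaded copy. The following is a minimal sketch, not part of the repository record: the function names and the local file path are hypothetical, and only the two constants (`files.size` and the MD5 digest) are taken from the record above.

```python
import hashlib
import os
import sys

# Values copied from this record; everything else here is illustrative.
EXPECTED_SIZE = 3142104934  # files.size, in bytes
EXPECTED_MD5 = "4890e9f0188ef72e36bcdac2efc0a9ac"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Check the cheap size comparison first, then the MD5 digest."""
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Usage (hypothetical path): python verify_download.py /path/to/downloaded/file
    ok = matches_record(sys.argv[1])
    print("checksum OK" if ok else "checksum MISMATCH")
    sys.exit(0 if ok else 1)
```

Comparing the byte size first avoids hashing a roughly 3 GB file when the download is obviously truncated. MD5 is adequate for detecting transfer corruption, but it is not a safeguard against deliberate tampering.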