Show simple item record

 
dc.contributor.author Barkarson, Starkaður
dc.contributor.author Steingrímsson, Steinþór
dc.date.accessioned 2021-06-02T12:14:59Z
dc.date.available 2021-06-02T12:14:59Z
dc.date.issued 2021-05-31
dc.identifier.uri http://hdl.handle.net/20.500.12537/111
dc.description IGC-Parla is part of the IGC project (Icelandic Gigaword Corpus), which aims to collect as much Icelandic text as possible that can be published under an open or restricted license. IGC-Parla contains parliamentary speeches delivered at Alþingi and published on the website www.althingi.is, encoded according to the Parla-CLARIN recommendations. The corpus comes in two formats: one contains the texts untokenized and untagged, while the other has been tokenized, POS-tagged, and lemmatized.
dc.language.iso isl
dc.publisher The Árni Magnússon Institute for Icelandic Studies
dc.relation.isreferencedby https://www.aclweb.org/anthology/L18-1690.pdf
dc.relation.isreplacedby http://hdl.handle.net/20.500.12537/179
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.source.uri http://igc.arnastofnun.is
dc.subject corpora
dc.subject parliament speeches
dc.subject pos-tagged
dc.subject lemmatized
dc.subject tei
dc.title IGC-Parla-21.05 (The Icelandic Gigaword Corpus: Parliamentary speeches)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
demo.uri https://malheildir.arnastofnun.is
contact.person Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies
sponsor Ministry of Education, Science and Culture (Mennta- og menningarmálaráðuneytið) Language Technology for Icelandic 2019-2023 The Icelandic Gigaword Corpus (G1) nationalFunds
size.info 468297 utterances
size.info 234062485 tokens
size.info 212873555 words
size.info 10283804 sentences
files.size 3142104934
files.count 1


 Files in this item

This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name IGC-Parla-21.05
Size 2.93 GB
Format Unknown
Description IGC-Parla-21.05
MD5 4890e9f0188ef72e36bcdac2efc0a9ac
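
The record lists an MD5 checksum and a byte count (files.size 3142104934, which matches the displayed 2.93 GB) for the single file. A minimal sketch of how a downloaded copy might be checked against those two values follows. The function names and the local path are hypothetical; only the checksum and size are taken from this record.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 and files.size).
EXPECTED_MD5 = "4890e9f0188ef72e36bcdac2efc0a9ac"
EXPECTED_SIZE = 3142104934  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Check the cheap size comparison first, then the MD5 digest."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Assuming the file was saved locally as IGC-Parla-21.05 (the record does not state a file extension), `verify_download("IGC-Parla-21.05")` would return True only for a complete copy whose size and digest both match. Reading in chunks avoids loading the roughly 3 GB file into memory at once.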
