What's New

corpus
Description:
The Icelandic Gigaword corpus (IGC) is a tagged and lemmatized corpus. Version 1 (2017) consists of approximately 1,250 million running words of text. Each running word is accompanied by a morphosyntactic tag and lemma and ...
 This item contains 1 file (5.63 GB).
 
Publicly Available
corpus
Description:
The Icelandic Gigaword corpus (IGC) is a tagged and lemmatized corpus. Version 2 (2018) consists of approximately 1,400 million running words of text (punctuation marks not included). Each running word is accompanied by a ...
 This item contains 1 file (6.31 GB).
 
Publicly Available

Most Viewed Items

Top Last Week
corpus
Description:
The Icelandic Parsed Historical Corpus (IcePaHC) is a manually corrected treebank, parsed according to the annotation guidelines of The Penn Parsed Corpora of Historical English (PPCHE), with minor modifications that are ...
 This item contains 1 file (11.91 MB).
 
Publicly Available
corpus
Description:
The Icelandic Contemporary Corpus (IceConTree) is a machine-parsed treebank parsed according to the IcePaHC annotation scheme. It consists of texts from the Icelandic Gigaword Corpus, parsed using the IceNeuralParsingPipeline. ...
 This item contains 1 file (3.92 GB).
 
Publicly Available
corpus
Description:
ParIce is an English-Icelandic parallel corpus. This is the first parallel corpus built for the purposes of language technology development and research for Icelandic. It includes 3.5 million translation segment pairs from ...
 This item contains 1 file (696.19 MB).
 
Publicly Available