What's New

toolService
Description:
Annotald is a program for annotating parsed corpora in the Penn Treebank format. For more information on the format (as instantiated by the Penn Parsed Corpora of Historical English), see the documentation by Beatrice ...
 This item contains 2 files (2.89 MB).
 
Publicly Available
toolService
Description:
Yfirlestur.is is a public website where you can enter or submit your Icelandic text and have it checked for spelling and grammar errors. The tool also gives hints on words and structures that might not be appropriate, ...
 This item contains 2 files (1.27 MB).
 
Publicly Available
toolService
Description:
This is a pipeline for creating GreynirSeq domain-aware translation models. A valid checkpoint of a base translation model based on mBART25 can be fine-tuned as a domain translation model. The resulting model can be queried ...
 This item contains 2 files (4.54 MB).
 
Publicly Available

Most Viewed Items

Top Last Week
corpus
Description:
Talrómur is a public domain speech corpus for text-to-speech research and development. The corpus consists of 122,417 short audio clips of eight different speakers reading short sentences. The audio was recorded in 2020 ...
 This item contains 11 files (19.99 GB).
 
Publicly Available
corpus
Description:
ParIce is an English-Icelandic parallel corpus. This is the first parallel corpus built for the purposes of language technology development and research for Icelandic. It includes 3.5 million translation segment pairs from ...
 This item contains 1 file (696.19 MB).
 
Publicly Available
corpus
Description:
The Icelandic Gigaword corpus (IGC) is a tagged and lemmatized corpus. The 20.05 version consists of approximately 1530 million running words of text. Each running word is accompanied by a morphosyntactic tag and lemma and ...
 This item contains 1 file (7.55 GB).
 
Publicly Available