What's New

 lexicalConceptualResource 
Description:
The total list of stop words includes 59,664 words or non-words that were handpicked from the Icelandic Gigaword Corpus. The sublists are as follows: - 6,576 abbreviations. - 27,144 foreign words (especially proper ...
 This item contains 1 file (1000.77 KB).
 
Publicly Available
 toolService 
Description:
ALEXIA is a command-line corpus tool for comparing a given vocabulary to that of a larger corpus or corpora. In order to maintain lexicons, dictionaries and terminologies, it is necessary to be able to ...
 This item contains 1 file (775.63 KB).
 
Publicly Available
 toolService 
Description:
IceParser is a shallow parser for Icelandic. The parser comprises a sequence of finite-state transducers, which incrementally add syntactic information to the input text. The input to IceParser is part-of-speech ...
 This item contains 1 file (62.68 MB).
 
Publicly Available

Most Viewed Items

Top Last Week
 toolService 
Description:
This archive contains files generated from the recipe in kaldi-speaker-diarization/v5/. Its contents should be placed in a similar directory type, with symbolic links to diarization/, sid/, steps/, etc. It was ...
 This item contains 1 file (25.03 MB).
 
Publicly Available
 corpus 
Description:
Talrómur is a public domain speech corpus for text-to-speech research and development. The corpus consists of 122,417 short audio clips of eight different speakers reading short sentences. The audio was recorded in 2020 ...
 This item contains 11 files (19.99 GB).
 
Publicly Available
 corpus 
Description:
This Icelandic named entity (NE) corpus, MIM-GOLD-NER, is a version of the MIM-GOLD corpus tagged for NEs. Over 48 thousand NEs are tagged in this corpus of one million tokens, which can be used for training named entity ...
 This item contains 13 files (8.53 MB).
 
Publicly Available