What's New

toolService
Description:
A Part-of-Speech (PoS) tagger for Icelandic. In this submission, you will find ABLTagger v1.0.0. This is a PoS tagger that works with the revised tagset and achieves an accuracy of 95.59% on MIM-Gold (cross-validation). ...
 This item contains 5 files (8.58 GB).
 
Publicly Available
toolService
Description:
A Python package that punctuates Icelandic text. It takes unpunctuated text as input and returns punctuated text. The user can choose between two punctuation models, a BERT-based Transformer and a bidirectional RNN ...
 This item contains 8 files (159.03 MB).
 
Publicly Available
corpus
Description:
The evaluation set contains 101,261 tokens and is divided into nine subcorpora: adjudications, books, educational websites, legal tests, news, opinions, parliamentary speeches, sport news, and radio and TV news scripts. ...
 This item contains 1 file (321.86 KB).
 
Publicly Available

Most Viewed Items

Top Last Week
corpus
Description:
This Icelandic named entity (NE) corpus, MIM-GOLD-NER, is a version of the MIM-GOLD corpus tagged for NEs. Over 48 thousand NEs are tagged in this corpus of one million tokens, which can be used for training named entity ...
 This item contains 13 files (8.53 MB).
 
Publicly Available
toolService
Description:
A Part-of-Speech (PoS) tagger for Icelandic. In this submission, you will find ABLTagger v1.0.0. This is a PoS tagger that works with the revised tagset and achieves an accuracy of 95.59% on MIM-Gold (cross-validation). ...
 This item contains 5 files (8.58 GB).
 
Publicly Available
corpus
Description:
The pronunciation of each entry in the ISLEX dictionary is given as a sound file. There are nearly 49,000 sound files plus 700 phrases. The recordings were made in September 2012 at the studio Upptekið in Reykjavík. The ...
 This item contains 3 files (9.31 GB).