What's New
corpus
Description:
This release of data from the Samrómur collection focuses on utterances from speakers whose native language is not Icelandic. It contains 36,891 automatically verified speech recordings in Icelandic (50.1 hours). The ...
This item contains 3 files (2.46 GB).
Publicly Available
corpus
Description:
The Icelandic Standardization Benchmark Set: Spelling and Punctuation (IceStaBS:SP) consists of examples of written text that deviate from the standard with respect to spelling and punctuation, along with explained corrections ...
This item contains 1 file (62.23 KB).
Publicly Available
corpus
Description:
This package contains a benchmark data set for evaluating the grammatical knowledge and linguistic ability of Large Language Models (LLMs) in Icelandic. It is meant to help LLM developers improve their models' Icelandic ...
This item contains 1 file (29.99 KB).
Publicly Available