dc.contributor.author |
Barkarson, Starkaður |
dc.date.accessioned |
2025-03-05T14:19:40Z |
dc.date.available |
2025-03-05T14:19:40Z |
dc.date.issued |
2025-03-05 |
dc.identifier.uri |
http://hdl.handle.net/20.500.12537/363 |
dc.description |
ENGLISH:
This package contains questions and answers from the corpus 'Texts from the Icelandic Web of Science and the European Web' (http://hdl.handle.net/20.500.12537/361) in JSONL format, which is suitable for LLM training. The dataset is also available on Hugging Face: https://huggingface.co/datasets/arnastofnun/VV_EV. |
dc.description |
ÍSLENSKA:
Pakkinn inniheldur spurningar og svör úr málheildinni 'Textar af Vísindavefnum og Evrópuvefnum' (http://hdl.handle.net/20.500.12537/361) á jsonl-sniðmáti sem hentar m.a. við þjálfun mállíkana. Gagnasettið er einnig aðgengilegt á Hugging Face: https://huggingface.co/datasets/arnastofnun/VV_EV. |
dc.language.iso |
isl |
dc.publisher |
The Árni Magnússon Institute for Icelandic Studies |
dc.rights |
Icelandic Gigaword Corpus |
dc.rights.uri |
https://repository.clarin.is/repository/xmlui/page/license-gigaword-corpus |
dc.rights.label |
PUB |
dc.subject |
json |
dc.subject |
jsonl |
dc.subject |
scientific |
dc.subject |
Q&A |
dc.subject |
questions |
dc.subject |
answers |
dc.title |
Texts from the Icelandic Web of Science and the European Web - unannotated version 25.01 - JSONL format |
dc.type |
corpus |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
has.files |
yes |
branding |
Clarin IS Repository |
contact.person |
Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies |
size.info |
11431 entries |
size.info |
257128 sentences |
files.size |
14926424 |
files.count |
2 |
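
The record above describes the files as JSONL, that is, one JSON object per line. A minimal sketch of how such a file might be inspected before use is given below. The record does not list the file names or the field names inside each entry, so the path is taken from the command line and the helper names `iter_jsonl` and `summarize` are hypothetical; the code only reports whichever top-level keys it finds rather than assuming any.

```python
import json
import sys


def iter_jsonl(path):
    """Yield one parsed JSON value per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(
                    f"{path}: invalid JSON on line {line_number}"
                ) from exc


def summarize(path):
    """Return (record count, sorted union of top-level keys) for a JSONL file."""
    count = 0
    keys = set()
    for record in iter_jsonl(path):
        count += 1
        if isinstance(record, dict):
            keys.update(record)  # collect field names without assuming them
    return count, sorted(keys)


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path is supplied by the user; no file name from the package is assumed.
    total, fields = summarize(sys.argv[1])
    print(f"{total} records; top-level fields: {fields}")
```

If the package files follow the one-object-per-line convention, the record count reported this way can be compared against the entry count stated in size.info.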