dc.contributor.author | Snæbjarnarson, Vésteinn |
dc.contributor.author | Einarsson, Bergur Tareq Tamimi |
dc.contributor.author | Auðunardóttir, Ingibjörg Iða |
dc.contributor.author | Sæmundsson, Unnar Ingi |
dc.contributor.author | Bjarnadóttir, Hildur |
dc.contributor.author | Gunnarsson, Helgi Valur |
dc.contributor.author | Einarsson, Hafsteinn |
dc.date.accessioned | 2021-09-30T09:42:35Z |
dc.date.available | 2021-09-30T09:42:35Z |
dc.date.issued | 2021-09-16 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/143 |
dc.description | NQiI - SQuAD format v. 1.0. The Natural Questions in Icelandic (NQiI) is a question answering dataset suitable for both reading-comprehension-style and open-domain question answering in Icelandic. The answer contexts are taken from the Icelandic Wikipedia. Included here is a subset of the NQiI dataset in the same format as the Stanford Question Answering Dataset (SQuAD). The data is provided in both tokenized and untokenized form, as JSON and as JSONL (one JSON object per line). The dataset is further described in Vésteinn Snæbjarnarson, 2021, Automated methods for Question-Answering in Icelandic, M.Sc. thesis, Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland. |
dc.language.iso | isl |
dc.publisher | Háskóli Íslands |
dc.relation.isreferencedby | http://hdl.handle.net/1946/39966 |
dc.relation.isreplacedby | http://hdl.handle.net/20.500.12537/188 |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://vesteinn.is/qa |
dc.subject | question answering |
dc.subject | reading comprehension |
dc.title | NQiI - Natural Questions In Icelandic - v1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Vésteinn Snæbjarnarson vesteinnsnaebjarnarson@gmail.com Háskóli Íslands |
sponsor | Rannís Nýsköpunarsjóður námsmanna 2020 Málheild fyrir almenna spurningasvörun á íslensku Other |
size.info | 5681 entries |
files.size | 21817891 |
files.count | 13 |
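The description states that the files follow the SQuAD format, as JSON and as JSONL. The sketch below shows one way to read them. It assumes the JSON files use the standard nested SQuAD v1.1 layout (`data` > `paragraphs` > `qas` > `answers`); this record does not spell out the field names, so treat them as an assumption and check them against the README. The function names are illustrative, and the line-format reader makes no assumption about the fields inside each object.

```python
import json
from typing import Any, Dict, Iterator, List


def iter_squad_examples(squad: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Flatten a SQuAD-style document into one record per question.

    Assumes the standard SQuAD v1.1 nesting; adjust the keys if the
    NQiI files differ.
    """
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                yield {
                    "id": qa.get("id"),
                    "title": article.get("title"),
                    "question": qa["question"],
                    "context": context,
                    # Each answer is (text, character offset into context).
                    "answers": [
                        (a["text"], a["answer_start"])
                        for a in qa.get("answers", [])
                    ],
                }


def load_squad_json(path: str) -> List[Dict[str, Any]]:
    """Read a nested SQuAD-format JSON file and flatten it."""
    with open(path, encoding="utf-8") as f:
        return list(iter_squad_examples(json.load(f)))


def load_json_lines(path: str) -> List[Any]:
    """Read a line-format file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

For example, `load_squad_json("nqii_dev_squad_format.json")` would return one dictionary per question if the assumed layout holds. Opening the files with `encoding="utf-8"` matters because Icelandic text contains non-ASCII characters.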
Files in this item
Name | Size | Format | Description | MD5 |
nqii_dev_squad_format.json | 483.66 KB | Unknown | Unknown | 5094d96e336aaf5a64643e4ba45ed0b1 |
nqii_dev_squad_format_tok.json | 566.62 KB | Unknown | Unknown | cab77356251e71c2b82b06c799e39a30 |
nqii_dev_squad_line_format.json | 439.23 KB | Unknown | Unknown | b54d0f295c2e953c9f927f0be7c994cb |
nqii_dev_squad_line_format_tok.json | 497.79 KB | Unknown | Unknown | fc34b20f7f71e9de28bad9804f9e49ab |
README | 586 bytes | Unknown | Unknown | ba26897ab65525226ab4f5d1166f73b4 |
nqii_test_squad_format_tok.json | 524.81 KB | Unknown | Unknown | da4540e691d8909035a78b695dff2c6a |
nqii_test_squad_line_format.json | 549.65 KB | Unknown | Unknown | 2aac926995b5db47b9d9e33081990ea8 |
nqii_test_squad_format.json | 623.21 KB | Unknown | Unknown | 6042f391cfbfec45fca783ee9fe1cab3 |
nqii_test_squad_line_format_tok.json | 554.61 KB | Unknown | Unknown | 037e1281db1cb66bfc10e7c5e2750fba |
nqii_train_squad_line_format_tok.json | 4.23 MB | Unknown | Unknown | 962670e806bbb3a3915adda2efeaf62d |
nqii_train_squad_line_format.json | 4.15 MB | Unknown | Unknown | 9a628ac5ec2e90f795ad55be4cd7eee1 |
nqii_train_squad_format.json | 4.12 MB | Unknown | Unknown | edd2fb5784eec1b6c42cbd1ba4ded1db |
nqii_train_squad_format_tok.json | 4.18 MB | Unknown | Unknown | 1590dfb4f9555b9c56eb50f9fd0d78c4 |
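The MD5 values listed for each file can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names and the `EXPECTED_MD5` dictionary are illustrative and not part of the repository; only three of the listed checksums are copied in, and the rest can be added the same way.

```python
import hashlib
import os
from typing import Dict, Optional

# A few checksums copied from the file listing; extend as needed.
EXPECTED_MD5: Dict[str, str] = {
    "nqii_dev_squad_format.json": "5094d96e336aaf5a64643e4ba45ed0b1",
    "nqii_test_squad_format.json": "6042f391cfbfec45fca783ee9fe1cab3",
    "nqii_train_squad_format.json": "edd2fb5784eec1b6c42cbd1ba4ded1db",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_files(directory: str,
                 expected: Dict[str, str]) -> Dict[str, Optional[bool]]:
    """Compare each file against its expected checksum.

    Returns True or False per file name, or None if the file is missing.
    """
    results: Dict[str, Optional[bool]] = {}
    for name, checksum in expected.items():
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            results[name] = None
        else:
            results[name] = md5_of_file(path) == checksum.lower()
    return results
```

Calling `verify_files("path/to/download", EXPECTED_MD5)` on the unpacked directory reports any file whose contents differ from the listing. MD5 is adequate here for detecting corrupted downloads, though it is not a safeguard against deliberate tampering.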