dc.contributor.author | Snæbjarnarson, Vésteinn |
dc.contributor.author | Einarsson, Bergur Tareq Tamimi |
dc.contributor.author | Auðunardóttir, Ingibjörg Iða |
dc.contributor.author | Sæmundsson, Unnar Ingi |
dc.contributor.author | Bjarnadóttir, Hildur |
dc.contributor.author | Gunnarsson, Helgi Valur |
dc.contributor.author | Einarsson, Hafsteinn |
dc.date.accessioned | 2022-02-01T08:34:33Z |
dc.date.available | 2022-02-01T08:34:33Z |
dc.date.issued | 2022-02-01 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/188 |
dc.description | NQiI - v. 1.0. The Natural Questions in Icelandic (NQiI) is a question answering dataset suitable for both reading-comprehension-style and open-domain question answering in Icelandic. The answer contexts come from the Icelandic Wikipedia, in which the answers are annotated. Included here is the full NQiI v. 1.0 dataset along with a subset in the same format as the Stanford Question Answering Dataset (SQuAD). The data is provided both tokenized and untokenized, in JSON format and in JSONL format (one JSON object per line). The dataset is further described in Vésteinn Snæbjarnarson, 2021, Automated methods for Question-Answering in Icelandic, M.Sc. thesis, Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland. |
dc.language.iso | isl |
dc.publisher | Háskóli Íslands |
dc.relation.isreferencedby | http://hdl.handle.net/1946/39966 |
dc.relation.replaces | https://repository.clarin.is/repository/xmlui/handle/20.500.12537/143 |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by-sa/4.0/ |
dc.rights.label | PUB |
dc.subject | qa |
dc.subject | question answering |
dc.title | NQiI v. 1.1 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Vésteinn Snæbjarnarson vesteinnsnaebjarnarson@gmail.com Miðeind |
sponsor | Rannís Nýsköpunarsjóður námsmanna 2020 Málheild fyrir almenna spurningasvörun á íslensku (Corpus for general question answering in Icelandic) nationalFunds |
size.info | 18230 entries |
files.size | 31405211 |
files.count | 14 |
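The description above says the line-format files hold one JSON object per line. A minimal Python sketch for reading such a file follows. It assumes the files are UTF-8 encoded; the helper names `iter_json_lines` and `read_jsonl` are illustrative, and the `question` key in the usage note is a hypothetical field name, since this record does not document the schema.

```python
import json
from typing import Any, Iterable, Iterator, List


def iter_json_lines(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            yield json.loads(line)


def read_jsonl(path: str) -> List[Any]:
    """Read a whole line-format file into a list (UTF-8 is an assumption)."""
    with open(path, encoding="utf-8") as fh:
        return list(iter_json_lines(fh))
```

For example, `read_jsonl("nqii_dev_squad_line_format.json")` would return a list of records, assuming that file follows the one-object-per-line layout. Inspect the keys of the first record before relying on any field name such as `question`.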
Files in this item
Download all files in item (29.95 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
Name | Size | Format | Description | MD5 |
README | 609 bytes | Unknown | Unknown | 0d49f51fa9e417008d057145301e411b |
nqii_test_squad_format_tok.json | 524.81 KB | Unknown | Unknown | da4540e691d8909035a78b695dff2c6a |
nqii_test_squad_line_format.json | 549.65 KB | Unknown | Unknown | 2aac926995b5db47b9d9e33081990ea8 |
nqii_test_squad_format.json | 623.21 KB | Unknown | Unknown | 6042f391cfbfec45fca783ee9fe1cab3 |
nqii_test_squad_line_format_tok.json | 554.61 KB | Unknown | Unknown | 037e1281db1cb66bfc10e7c5e2750fba |
nqii_dev_squad_line_format_tok.json | 497.79 KB | Unknown | Unknown | fc34b20f7f71e9de28bad9804f9e49ab |
nqii_dev_squad_format.json | 483.66 KB | Unknown | Unknown | 5094d96e336aaf5a64643e4ba45ed0b1 |
nqii_dev_squad_line_format.json | 439.23 KB | Unknown | Unknown | b54d0f295c2e953c9f927f0be7c994cb |
nqii_train_squad_format.json | 4.12 MB | Unknown | Unknown | edd2fb5784eec1b6c42cbd1ba4ded1db |
nqii_dev_squad_format_tok.json | 566.62 KB | Unknown | Unknown | cab77356251e71c2b82b06c799e39a30 |
nqii_train_squad_line_format_tok.json | 4.23 MB | Unknown | Unknown | 962670e806bbb3a3915adda2efeaf62d |
nqii_train_squad_line_format.json | 4.15 MB | Unknown | Unknown | 9a628ac5ec2e90f795ad55be4cd7eee1 |
nqii_train_squad_format_tok.json | 4.18 MB | Unknown | Unknown | 1590dfb4f9555b9c56eb50f9fd0d78c4 |
NQiI_raw_v1.0.json | 9.14 MB | Unknown | Unknown | 959a56d27d688e95d7af91d000fd9fa8 |
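The MD5 values in the file listing can be used to check that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib`. The function names `md5_of_file` and `verify_md5` are illustrative, and the local path passed to them is an assumption about where the file was saved.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (end of file)
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path: str, expected: str) -> bool:
    """Compare a file's digest with the listed checksum, ignoring case and whitespace."""
    return md5_of_file(path) == expected.strip().lower()
```

For instance, `verify_md5("README", "0d49f51fa9e417008d057145301e411b")` should return `True` for an unmodified copy of the README listed above. MD5 is adequate here for detecting transfer corruption but is not a safeguard against deliberate tampering.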