Spjallromur - Icelandic Conversational Speech

About the Spjallrómur corpus
----------------------------

Spjallromur is an open source conversational speech corpus for speech technology development. The corpus is 21 hrs and 20 mins long, with 54 conversations in total and 102 speakers. The data was collected over one year (September 2020 - September 2021) by Reykjavík University. There are two parts: the first part has full conversations, while the second part has half conversations. The dataset was primarily created for automatic speech recognition, but due to its nature it can also be used for other speech technology fields such as speaker identification, speaker diarization, and conversational language modeling.

Spjallrómur was collected using a custom-made online chatting platform called spjall, which is Icelandic for chat. Each speaker used their own microphone and device. Some microphones picked up background noise, such as neighboring speakers or other speakers. The audio from each microphone was saved to a separate .WAV audio file. There are two speakers per conversation. The speaker set contains both native and non-native Icelandic speakers. All speakers are adults.

Each conversation has two sets of demographics metadata, audio file, and transcript, one for each speaker. Due to network lag, there is sometimes a small difference in length between the two audio files in a conversation. As there was a limited number of participants, some speakers may be in more than one conversation. The text has not been aligned with the audio.

The full conversations part contains 19 hrs of audio from 48 full conversations, with 96 speakers. The half conversations part contains 2 hrs 20 mins of audio from 6 partial conversations, with 6 speakers.

Personally identifiable information has been redacted in the audio with a 400Hz beep and replaced with XXX in the transcript. Non-words are marked with () or []. Partial words are marked with [HIK: ..].

The structure of the corpus
---------------------------

.
- docs/
  - spjallromur_README.txt
- data/
  - half_conversations/
    - xxxx/
      - speaker_x_convo_xxxx_demographics.json
      - speaker_x_convo_xxxx_transcript.json
      - speaker_x_convo_xxxx.wav
  - full_conversations/
    - xxxx/
      - speaker_a_convo_xxxx_demographics.json
      - speaker_a_convo_xxxx_transcript.json
      - speaker_a_convo_xxxx.wav
      - speaker_b_convo_xxxx_demographics.json
      - speaker_b_convo_xxxx_transcript.json
      - speaker_b_convo_xxxx.wav

* speaker_x_convo_xxxx.wav - Each audio file is a 16 bit, 16 kHz (16000 Hz), single channel WAVE file. It contains the voice of speaker x in convo xxxx.
* speaker_x_convo_xxxx_transcript.json - This is a JSON file of the transcript and corresponding metadata. It was generated within the Tiro text editor (https://tal.tiro.is). In the subject tag, client_x = speaker_x.
* speaker_x_convo_xxxx_demographics.json - This file contains the convo_xxxx as session_id, with the speaker's age, gender, audio duration in seconds, and sample rate of the audio file.

The demographics JSON file contains gender and age in Icelandic. Here's the mapping from Icelandic to English:

gender
  id: kona, name: 'female'
  id: karl, name: 'male'
  id: annad, name: 'other'

age group
  id: 'unglingur', name: '18-19'
  id: 'tvitugt', name: '20-29'
  id: 'thritugt', name: '30-39'
  id: 'fertugt', name: '40-49'
  id: 'fimmtugt', name: '50-59'
  id: 'sextugt', name: '60-69'
  id: 'sjotugt', name: '70-79'
  id: 'attraett', name: '80-89'
  id: 'niraett', name: '90+'

Authors
-------

Reykjavík University
Judy Y Fong - judy@judyyfong.xyz
Staffan Hedström
Ólafur Helgi Jónsson
Lára Margrét H. Hólmfriðardóttir
Sunneva Þorsteinsdóttir
Málfriður Anna Eiríksdóttir
David Erik Mollberg
Eydís Huld Magnúsdóttir
Ragnheiður Þórhallsdóttir
Jon Gudnason - jg@ru.is

Acknowledgements
----------------

Special thanks to the other members of the Language and Voice Lab (https://lvl.ru.is), the student employees, Róbert Kjaran, and Magnús Teitsson.
This project was funded by the Language Technology Programme for Icelandic 2019-2023. The programme, which is managed and coordinated by Almannarómur, is funded by the Icelandic Ministry of Education, Science and Culture. This project was funded in part by the Icelandic Directorate of Labour's student summer job program in 2021.

Citations
---------

@misc{fong-spjallromur,
  title = {Spjallromur - Icelandic Conversational Speech},
  author = {Fong, Judy Y and Hedstr{\"o}m, Staffan and J{\'o}nsson, {\'O}lafur Helgi and H{\'o}lmfri{\dh}ard{\'o}ttir, L{\'a}ra Margr{\'e}t H. and {\TH}orsteinsd{\'o}ttir, Sunneva and Eir{\'{\i}}ksd{\'o}ttir, M{\'a}lfri{\dh}ur Anna and Mollberg, David Erik and Magn{\'u}sd{\'o}ttir, Eyd{\'{\i}}s Huld and {\TH}{\'o}rhallsd{\'o}ttir, Ragnhei{\dh}ur and Gudnason, Jon},
  url = {},
  note = {{CLARIN}-{IS}},
  copyright = {Creative Commons - Attribution 4.0 International ({CC} {BY} 4.0)},
  year = {2022}
}

License
-------

This dataset is released under a Creative Commons Attribution 4.0 International (CC BY 4.0) license (https://creativecommons.org/licenses/by/4.0/).
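
Example: locating files and translating demographics
----------------------------------------------------

The file naming scheme and the Icelandic-to-English demographics mapping described in this README can be sketched in Python as follows. This is only an illustration, not part of the corpus tooling. It assumes that the directory name xxxx equals the conversation id used in the file names, as the directory tree suggests. The JSON key names "gender" and "age" and the conversation id "0001" are hypothetical, since this README does not list the exact keys or ids; adjust them to match the actual files.

```python
from pathlib import Path

# Mappings copied from the demographics section of this README.
GENDER_EN = {"kona": "female", "karl": "male", "annad": "other"}
AGE_GROUP_EN = {
    "unglingur": "18-19",
    "tvitugt": "20-29",
    "thritugt": "30-39",
    "fertugt": "40-49",
    "fimmtugt": "50-59",
    "sextugt": "60-69",
    "sjotugt": "70-79",
    "attraett": "80-89",
    "niraett": "90+",
}


def speaker_files(convo_dir, speaker):
    """Build the expected file paths for one speaker of a conversation.

    Assumes the directory name is the conversation id (xxxx in the tree).
    """
    convo_dir = Path(convo_dir)
    stem = f"speaker_{speaker}_convo_{convo_dir.name}"
    return {
        "audio": convo_dir / f"{stem}.wav",
        "transcript": convo_dir / f"{stem}_transcript.json",
        "demographics": convo_dir / f"{stem}_demographics.json",
    }


def translate_demographics(record, gender_key="gender", age_key="age"):
    """Return a copy of a demographics record with English labels.

    The default key names are assumptions; unknown ids are left unchanged.
    """
    out = dict(record)
    if gender_key in out:
        out[gender_key] = GENDER_EN.get(out[gender_key], out[gender_key])
    if age_key in out:
        out[age_key] = AGE_GROUP_EN.get(out[age_key], out[age_key])
    return out
```

For a full conversation, calling speaker_files once with "a" and once with "b" gives both sides. A demographics file loaded with json.load can be passed to translate_demographics, with gender_key and age_key set to whatever keys the file actually uses.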