dc.contributor.author | Helgadottir, Inga Run |
dc.contributor.author | Fong, Judy Yum |
dc.contributor.author | Gudnason, Jon |
dc.contributor.author | Thorsteinsdottir, Helga Lara |
dc.date.accessioned | 2020-12-04T16:06:54Z |
dc.date.available | 2020-12-04T16:06:54Z |
dc.date.issued | 2020-12-04 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/93 |
dc.description | The RÚV TV set contains 6 hours and 43 minutes of television data from RÚV, drawn from three programmes: two talk shows, Kastljós (news commentary) and Kiljan (literature discussions), and the prime-time news (Fréttir kl. 19:00). The data contains 5880 utterances from 151 speakers. The text is normalized and totals 66000 words. The data is aligned and segmented, ready for ASR training. Audio conditions vary between recordings: some are clear with low background noise, while others have music or other noise in the background. The data set is published by the Icelandic National Broadcasting Service - Ríkisútvarpið (RÚV) and was created jointly by RÚV and Reykjavik University. This work is licensed under the Creative Commons Attribution 4.0 International License. |
dc.language.iso | isl |
dc.publisher | The Icelandic National Broadcasting Service - Ríkisútvarpið (RÚV) |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.subject | speech recognition |
dc.subject | automatic speech recognition |
dc.subject | television broadcast |
dc.title | RÚV TV data (20.12) |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | audio |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Jon Gudnason jg@ru.is Reykjavik University |
contact.person | Helga Lara Thorsteinsdottir helga.lara.thorsteinsdottir@ruv.is The Icelandic National Broadcasting Service - Ríkisútvarpið (RÚV) |
sponsor | Ministry of Education, Science and Culture H2 Language Technology Programme for Icelandic 2019-2023 nationalFunds |
size.info | 5.6 GB |
size.info | 7 hours |
size.info | 5880 utterances |
files.size | 5994921554 |
files.count | 1 |
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)

- Name: ruv_tv.tar.gz
- Size: 5.58 GB
- Format: application/gzip
- Description: Unknown
- MD5: fddfe67d5aedae0961c7890f78b21885
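The MD5 value listed above can be used to check that a download of ruv_tv.tar.gz arrived intact. The following is a minimal sketch, not part of the data set itself; the function names `md5_of_file` and `archive_is_intact` and the local file path are assumptions chosen for illustration.

```python
import hashlib

# MD5 checksum as published in this record for ruv_tv.tar.gz.
EXPECTED_MD5 = "fddfe67d5aedae0961c7890f78b21885"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use small for a multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's MD5 digest with the expected value (hypothetical helper)."""
    return md5_of_file(path) == expected.lower()
```

Calling `archive_is_intact("ruv_tv.tar.gz")` on the downloaded file (the path is an assumption about where it was saved) should return `True` when the archive matches the published checksum.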
- ruv_tv
  - quality_info.txt (2 kB)
  - audio
    - 4959408T0.wav (187 MB)
    - 4886131R9.wav (258 MB)
    - 5012551T0.wav (411 MB)
    - 5008565T0.wav (179 MB)
    - 4934441T0.wav (165 MB)
    - 5008564T0.wav (274 MB)
    - 5022004T0.wav (164 MB)
    - 4898511R7.wav (221 MB)
    - 5008572T0.wav (186 MB)
    - 5019095T0.wav (407 MB)
    - 5012565T0.wav (496 MB)
    - 5008568T0.wav (177 MB)
    - 5022028T0.wav (180 MB)
    - 5006014T0.wav (492 MB)
    - 5012553T0.wav (412 MB)
    - 5008567T0.wav (164 MB)
    - 4898535R3.wav (225 MB)
    - 5008566T0.wav (183 MB)
    - 4886107R8.wav (232 MB)
    - 5012552T0.wav (324 MB)
    - 4959359T0.wav (186 MB)
    - 5021995T0.wav (235 MB)
    - 4930605S0.wav (212 MB)
    - 5021994T0.wav (178 MB)
    - 5012569T0.wav (493 MB)
    - 5012548T0.wav (391 MB)
    - 4886083R7.wav (283 MB)
    - 5012547T0.wav (406 MB)
    - 5004302T0.wav (252 MB)
    - 4934417T0.wav (176 MB)
    - 4934466T0.wav (167 MB)
    - 5008569T0.wav (164 MB)
    - 5008571T0.wav (176 MB)
    - 4959383T0.wav (257 MB)
    - 5022010T0.wav (169 MB)
  - README (2 kB)
  - kaldi_data
    - utt2spk (183 kB)
    - segments (276 kB)
    - spk2utt (138 kB)
    - wav.scp (3 kB)
    - text (539 kB)
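
The kaldi_data files listed above carry the alignment and transcripts. The sketch below shows one way they might be read, assuming they follow the standard Kaldi data-directory conventions: each `segments` line is `<utterance-id> <recording-id> <start-seconds> <end-seconds>` and each `text` line is `<utterance-id> <transcript>`. This record does not confirm that layout, so the README in the archive should be checked first. The names `Segment`, `parse_segments`, `parse_text`, and `total_hours` are hypothetical helpers, not part of the data set or of Kaldi.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    utt_id: str   # utterance identifier
    rec_id: str   # recording identifier (expected to match a wav.scp key)
    start: float  # start time in seconds
    end: float    # end time in seconds


def parse_segments(lines):
    """Parse lines in the assumed Kaldi 'segments' format, skipping blank lines."""
    segments = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        utt_id, rec_id, start, end = parts[:4]
        segments.append(Segment(utt_id, rec_id, float(start), float(end)))
    return segments


def parse_text(lines):
    """Parse lines in the assumed Kaldi 'text' format into {utt_id: transcript}."""
    transcripts = {}
    for line in lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue
        transcripts[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return transcripts


def total_hours(segments):
    """Sum the segment durations and convert seconds to hours."""
    return sum(s.end - s.start for s in segments) / 3600.0
```

With the archive unpacked, something like `total_hours(parse_segments(open("ruv_tv/kaldi_data/segments", encoding="utf-8")))` would give the total segmented duration, which can be compared against the duration stated in the description; the path is an assumption based on the file listing.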