dc.contributor.author Sigurðardóttir, Helga Svala
dc.date.accessioned 2021-10-25T16:39:28Z
dc.date.available 2021-10-25T16:39:28Z
dc.date.issued 2021-10-01
dc.identifier.uri http://hdl.handle.net/20.500.12537/158
dc.description A corpus of:
* 70,000 sentences taken from general text (general news in the Icelandic description), both before normalization and after normalization with the Regína normalizer
* 70,000 sentences taken from sports news, both before normalization and after normalization with the Regína normalizer
* 40,000 manually normalized sentences taken from all domains
dc.language.iso isl
dc.publisher Reykjavik University
dc.relation.isreferencedby https://aclanthology.org/2021.nodalida-main.45.pdf
dc.relation.replaces http://hdl.handle.net/20.500.12537/155
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.source.uri https://github.com/cadia-lvl/regina_normalizer
dc.subject text-normalization
dc.subject normalization
dc.title Text Normalization Corpus 21.10 (2021-10-25)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files no
branding Clarin IS Repository
contact.person Helga Svala Sigurðardóttir; helgas@ru.is; Reykjavik University
sponsor Ministry of Education, Science and Culture; Text Normalization Corpus (T9); Language Technology for Icelandic 2019-2023; nationalFunds
size.info 6 files
files.size 0
files.count 0

