dc.contributor.author Einarsson, Hafsteinn
dc.contributor.author Friðriksdóttir, Steinunn Rut
dc.contributor.author Arnardóttir, Þórunn
dc.date.accessioned 2024-11-07T09:13:13Z
dc.date.available 2024-11-07T09:13:13Z
dc.date.issued 2024-10-24
dc.identifier.uri http://hdl.handle.net/20.500.12537/352
dc.description The Icelandic Sentiment Corpus (Hotter and Colder) is a collection of annotated blog comments, with contextual information and metadata, comprising over 19000 annotations. Each entry represents a single annotation task performed by a user on a specific comment and includes the complete comment history and the blog post. The dataset must be hydrated with the hydration.py script, which fetches the comments and blog posts related to each annotation.
The annotation tasks cover different aspects of online discourse: sentiment analysis, toxicity detection, hate speech identification, social acceptance evaluation, emotion detection, sarcasm recognition, constructiveness assessment, encouragement and sympathy detection, politeness evaluation, trolling identification, mansplaining detection, and analysis of group generalizations. After hydration, the main fields are user_id (identifies the annotator), annotation_task_name (the type of task), label_given_by_user (the annotator's response), comment_annotated (the full text of the annotated comment), comment_author_name, comment_datetime (the posting timestamp), previous_comments (the earlier comments in the thread), and blog_post (the blog post the comment responds to).
dc.language.iso isl
dc.publisher Háskóli Íslands
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.source.uri https://www.ummælagreining.is
dc.subject Sentiment Analysis
dc.subject Toxicity detection
dc.subject Hate speech detection
dc.subject Emotion Analysis
dc.subject Social Acceptability Analysis
dc.title Icelandic Sentiment Corpus (Hotter and Colder)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
demo.uri https://www.ummælagreining.is
contact.person Hafsteinn Einarsson; hafsteinne@hi.is; Háskóli Íslands
sponsor Ministry of Culture and Business Affairs; Bias and Toxicity (G12); Language Technology for Icelandic; nationalFunds
size.info 19828 entries
files.size 294803
files.count 1
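
The field list in dc.description can be illustrated with a short Python sketch that tallies labels per annotation task. It assumes the hydrated data has been saved as a CSV file whose columns carry the field names listed above; the record does not state the output format of hydration.py, so that layout, and the helper names `read_hydrated` and `label_counts_by_task`, are hypothetical.

```python
import csv
from collections import Counter, defaultdict

# Field names as listed in dc.description. That the hydrated data is a CSV
# with exactly these column headers is an assumption, not stated in the record.
EXPECTED_FIELDS = [
    "user_id",
    "annotation_task_name",
    "label_given_by_user",
    "comment_annotated",
    "comment_author_name",
    "comment_datetime",
    "previous_comments",
    "blog_post",
]


def read_hydrated(handle):
    """Read rows from an open CSV handle and check the expected columns exist."""
    reader = csv.DictReader(handle)
    missing = [f for f in EXPECTED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    return list(reader)


def label_counts_by_task(rows):
    """Count how often each label occurs within each annotation task."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["annotation_task_name"]][row["label_given_by_user"]] += 1
    return counts


# Usage (the file name is a placeholder, not taken from the record):
# with open("data_hydrated.csv", newline="", encoding="utf-8") as handle:
#     counts = label_counts_by_task(read_hydrated(handle))
```

Grouping by annotation_task_name first matters because the label sets differ between tasks, so a flat count over label_given_by_user would mix unrelated categories.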


Files in this item

This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
Name Icelandic_Sentiment_Corpus.zip
Size 287.89 KB
Format application/zip
Description labels and hydration script
MD5 6f26a58c5771158c0f9492096222ad6c
File Preview
  • clarin_submission
    • hydration.py (10 kB)
    • README.md (3 kB)
    • data_unhydrated.csv (1 MB)
    • requirements.txt (37 B)
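
The MD5 value listed for the archive can be used to check a download before unpacking it. The following is a minimal sketch using Python's standard hashlib module; the helper names `md5_of_file` and `matches_record` are illustrative, and the local path passed to them is whatever the file was saved as.

```python
import hashlib

# MD5 checksum as listed in this record for Icelandic_Sentiment_Corpus.zip.
EXPECTED_MD5 = "6f26a58c5771158c0f9492096222ad6c"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected=EXPECTED_MD5):
    """Check a downloaded file against the checksum from the record."""
    return md5_of_file(path) == expected.lower()


# Usage (assumes the archive sits in the current directory):
# print(matches_record("Icelandic_Sentiment_Corpus.zip"))
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a guarantee against deliberate tampering.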
