dc.contributor.author Helgadóttir, Sigrún
dc.contributor.author Barkarson, Starkaður
dc.date.accessioned 2020-06-04T14:15:39Z
dc.date.available 2020-06-04T14:15:39Z
dc.date.issued 2018-06-01
dc.identifier.uri http://hdl.handle.net/20.500.12537/32
dc.description The Saga Corpus contains 41 Old Icelandic narrative texts: Family Sagas (Íslendingasögur, 982,006 words), Sturlunga Saga (260,586 words), Sagas of the Kings of Norway (Heimskringla, 231,502 words) and the Book of Settlement (Landnámabók, Sturlubók, 37,120 words). The total size of the corpus is 1,511,275 words. One of the texts is Íslendingaþættir, a collection of tales called þættir. A list of the texts accompanies the corpus.
The texts of the Family Sagas are taken from the edition published by Svart á hvítu (Bragi Halldórsson, Jón Torfason, Sverrir Tómasson and Örnólfur Thorsson (eds.), 1985-1986), as is the text of Sturlunga Saga (Örnólfur Thorsson, Bergljót Kristjánsdóttir, Bragi Halldórsson, Gísli Sigurðsson, Guðrún Ása Grímsdóttir, Guðrún Ingólfsdóttir, Jón Torfason and Sverrir Tómasson (eds.), 1988). The text of Heimskringla is from the edition published by Mál og menning in 1991 (Bergljót Kristjánsdóttir, Bragi Halldórsson, Jón Torfason and Örnólfur Thorsson (eds.), 1991). The spelling was normalized to Modern Icelandic and some inflectional endings were changed to their Modern Icelandic forms. The text of the Book of Settlement is from Jakob Benediktsson (1968) and has been normalized to Modern Icelandic spelling in the same way as the other texts.
The texts are encoded in an XML format defined by the TEI (Text Encoding Initiative). Bibliographical information is included with all the texts.
dc.language.iso isl
dc.language.iso non
dc.publisher The Árni Magnússon Institute for Icelandic Studies
dc.relation.isreferencedby https://notendur.hi.is/eirikur/Tagging_ON.pdf
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.source.uri https://clarin.is/en/resources/sagacorpus/
dc.subject text corpus
dc.subject historical corpus
dc.subject pos-tagged
dc.subject lemmatized
dc.subject family sagas
dc.subject fornritin
dc.title The Saga Corpus (Fornritin)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
demo.uri https://malheildir.arnastofnun.is/?mode=forn#?lang=en
contact.person Eiríkur Rögnvaldsson eirikur.rognvaldsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies
size.info 1511275 tokens
files.size 9610072
files.count 2


 Files in this item

This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
All files in item: 9.16 MB

Name: Saga Corpus.zip
Size: 9.15 MB
Format: application/zip
Description: POS-tagged and lemmatized XML files
MD5: 79cd702887219a36d150f743396b5fb2
Contents:
  • Saga Corpus
    • F0Q.xml (574 kB)
    • F12.xml (1 MB)
    • F0F.xml (3 MB)
    • F1D.xml (6 MB)
    • sagHdr.xml (149 kB)
    • F19.xml (958 kB)
    • F0X.xml (3 MB)
    • F0M.xml (710 kB)
    • F0B.xml (1 MB)
    • F07.xml (469 kB)
    • F15.xml (1 MB)
    • F0T.xml (214 kB)
    • F0I.xml (656 kB)
    • F03.xml (1 MB)
    • F1G.xml (2 MB)
    • F0P.xml (777 kB)
    • F11.xml (467 kB)
    • F0E.xml (1 MB)
    • F1C.xml (176 kB)
    • F18.xml (579 kB)
    • F0L.xml (607 kB)
    • F0A.xml (1 MB)
    • F06.xml (3 MB)
    • F14.xml (432 kB)
    • F0S.xml (677 kB)
    • F0H.xml (256 kB)
    • F02.xml (791 kB)
    • F1F.xml (15 MB)
    • F0O.xml (936 kB)
    • F0D.xml (1 MB)
    • F09.xml (1 MB)
    • F1B.xml (323 kB)
    • F17.xml (710 kB)
    • F0V.xml (674 kB)
    • F0K.xml (599 kB)
    • F05.xml (576 kB)
    • F13.xml (1 MB)
    • F0R.xml (540 kB)
    • F0G.xml (359 kB)
    • F01.xml (580 kB)
    • F1E.xml (13 MB)
    • F0Y.xml (1 MB)
    • F0N.xml (1 MB)
    • F0C.xml (1 MB)
    • F08.xml (2 MB)
    • F1A.xml (215 kB)
    • F16.xml (1 MB)
    • F0U.xml (658 kB)
    • F0J.xml (628 kB)
    • F04.xml (5 MB)
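
The record says the files are TEI-encoded XML with POS tags and lemmas, but it does not spell out which elements or attributes carry them. A minimal, schema-agnostic sketch for finding out is to count the element names in one file straight from the archive. The helper names (`xml_members`, `element_counts`, `local_name`) and the local path `Saga Corpus.zip` are assumptions made for this illustration, not part of the corpus distribution.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

# Assumed local download location of the archive (hypothetical path).
ZIP_PATH = "Saga Corpus.zip"


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}w' down to 'w'."""
    return tag.rsplit("}", 1)[-1]


def xml_members(zip_path):
    """Return the names of all .xml members in the archive, sorted."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(n for n in archive.namelist() if n.lower().endswith(".xml"))


def element_counts(zip_path, member):
    """Count elements by local name in one XML member, without unpacking to disk."""
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as stream:
            # Stream the file so that multi-megabyte texts do not need a full tree.
            for _event, element in ET.iterparse(stream, events=("end",)):
                counts[local_name(element.tag)] += 1
                element.clear()
    return counts


if __name__ == "__main__" and Path(ZIP_PATH).is_file():
    first = xml_members(ZIP_PATH)[0]
    print(first)
    for name, count in element_counts(ZIP_PATH, first).most_common(10):
        print(f"{name}\t{count}")
```

The most frequent element names in the output are a reasonable place to start looking for the token-level annotation before writing a more specific parser.
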
Name: saga_corpus_metadata.zip
Size: 10.14 KB
Format: application/zip
Description: List of texts
MD5: 1a5f102544c1824d123d6bf7278f51e4
Contents:
    • saga_corpus_metadata.xlsx (12 kB)
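
Since the record lists an MD5 checksum for each archive, a download can be checked against it before use. The following is a minimal sketch using Python's standard `hashlib`; the function names and the assumption that both archives sit in the current directory under their listed names are illustrative only.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing in this record.
EXPECTED_MD5 = {
    "Saga Corpus.zip": "79cd702887219a36d150f743396b5fb2",
    "saga_corpus_metadata.zip": "1a5f102544c1824d123d6bf7278f51e4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

MD5 is adequate here for detecting a corrupted or truncated download; it is not a guarantee against deliberate tampering.
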
