dc.contributor.author | Helgadóttir, Sigrún |
dc.contributor.author | Barkarson, Starkaður |
dc.date.accessioned | 2020-06-04T14:15:39Z |
dc.date.available | 2020-06-04T14:15:39Z |
dc.date.issued | 2018-06-01 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/32 |
dc.description | The Saga Corpus contains 41 Old Icelandic narrative texts: the Family Sagas (Íslendingasögur, 982,006 words), Sturlunga Saga (260,586 words), the Sagas of the Kings of Norway (Heimskringla, 231,502 words) and the Book of Settlement (Landnámabók, Sturlubók, 37,120 words). The total size of the corpus is 1,511,275 words. One of the texts is Íslendingaþættir, a collection of tales called þættir. The texts of the Family Sagas are taken from the edition published by Svart á hvítu (Bragi Halldórsson, Jón Torfason, Sverrir Tómasson and Örnólfur Thorsson (eds.), 1985-1986), as is the text of Sturlunga Saga (Örnólfur Thorsson, Bergljót Kristjánsdóttir, Bragi Halldórsson, Gísli Sigurðsson, Guðrún Ása Grímsdóttir, Guðrún Ingólfsdóttir, Jón Torfason and Sverrir Tómasson (eds.), 1988). The text of Heimskringla is from the edition published by Mál og menning in 1991 (Bergljót Kristjánsdóttir, Bragi Halldórsson, Jón Torfason and Örnólfur Thorsson (eds.), 1991). In these texts the spelling was normalized to Modern Icelandic spelling and some inflectional endings were changed to their Modern Icelandic forms. The text of the Book of Settlement is from the edition by Jakob Benediktsson (1968) and has been normalized to Modern Icelandic spelling in the same way as the other texts. The texts are encoded in an XML format defined by the TEI (Text Encoding Initiative). Bibliographical information is included with all the texts. A list of the texts is provided in saga_corpus_metadata.zip. |
dc.language.iso | isl |
dc.language.iso | non |
dc.publisher | The Árni Magnússon Institute for Icelandic Studies |
dc.relation.isreferencedby | https://notendur.hi.is/eirikur/Tagging_ON.pdf |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.source.uri | https://clarin.is/en/resources/sagacorpus/ |
dc.subject | text corpus |
dc.subject | historical corpus |
dc.subject | pos-tagged |
dc.subject | lemmatized |
dc.subject | family sagas |
dc.subject | fornritin |
dc.title | The Saga Corpus (Fornritin) |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
demo.uri | https://malheildir.arnastofnun.is/?mode=forn#?lang=en |
contact.person | Eiríkur Rögnvaldsson, eirikur.rognvaldsson@arnastofnun.is, The Árni Magnússon Institute for Icelandic Studies |
size.info | 1511275 tokens |
files.size | 9610072 |
files.count | 2 |
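The description states that the texts are TEI-encoded XML but does not name the elements used. A minimal sketch for inspecting one of the extracted files without assuming any particular element names: it counts how often each element occurs. The function name `count_elements` is hypothetical and not part of the corpus distribution.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(path):
    """Count how often each element name occurs in an XML file.

    No TEI element names are assumed; the function simply reports
    whatever names it finds.
    """
    counts = Counter()
    # iterparse streams the file, so larger files in the archive
    # (for example the 15 MB F1F.xml) are not held in memory as a full tree.
    for _event, elem in ET.iterparse(path, events=("end",)):
        # ElementTree writes namespaced tags as "{uri}name"; keep only "name".
        name = elem.tag.rsplit("}", 1)[-1]
        counts[name] += 1
        # Release the element's text and children once it has been counted.
        elem.clear()
    return counts
```

Calling `count_elements("F01.xml").most_common(10)` on an extracted file would list its ten most frequent element names, which is a quick way to find out which elements carry the tokens, tags and lemmas before writing a dedicated parser.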
Files in this item
This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
- Name: Saga Corpus.zip
- Size: 9.15 MB
- Format: application/zip
- Description: POS-tagged and lemmatized XML files
- MD5: 79cd702887219a36d150f743396b5fb2
- Contents (folder Saga Corpus):
  - F0Q.xml (574 kB)
  - F12.xml (1 MB)
  - F0F.xml (3 MB)
  - F1D.xml (6 MB)
  - sagHdr.xml (149 kB)
  - F19.xml (958 kB)
  - F0X.xml (3 MB)
  - F0M.xml (710 kB)
  - F0B.xml (1 MB)
  - F07.xml (469 kB)
  - F15.xml (1 MB)
  - F0T.xml (214 kB)
  - F0I.xml (656 kB)
  - F03.xml (1 MB)
  - F1G.xml (2 MB)
  - F0P.xml (777 kB)
  - F11.xml (467 kB)
  - F0E.xml (1 MB)
  - F1C.xml (176 kB)
  - F18.xml (579 kB)
  - F0L.xml (607 kB)
  - F0A.xml (1 MB)
  - F06.xml (3 MB)
  - F14.xml (432 kB)
  - F0S.xml (677 kB)
  - F0H.xml (256 kB)
  - F02.xml (791 kB)
  - F1F.xml (15 MB)
  - F0O.xml (936 kB)
  - F0D.xml (1 MB)
  - F09.xml (1 MB)
  - F1B.xml (323 kB)
  - F17.xml (710 kB)
  - F0V.xml (674 kB)
  - F0K.xml (599 kB)
  - F05.xml (576 kB)
  - F13.xml (1 MB)
  - F0R.xml (540 kB)
  - F0G.xml (359 kB)
  - F01.xml (580 kB)
  - F1E.xml (13 MB)
  - F0Y.xml (1 MB)
  - F0N.xml (1 MB)
  - F0C.xml (1 MB)
  - F08.xml (2 MB)
  - F1A.xml (215 kB)
  - F16.xml (1 MB)
  - F0U.xml (658 kB)
  - F0J.xml (628 kB)
  - F04.xml (5 MB)
- Name: saga_corpus_metadata.zip
- Size: 10.14 KB
- Format: application/zip
- Description: List of texts
- MD5: 1a5f102544c1824d123d6bf7278f51e4
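
The MD5 values listed for the two archives can be used to check a download for corruption. A minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify`, and the assumption that the archives are saved under the file names shown in this record, are illustrative only.

```python
import hashlib

# Checksums copied from this record.
EXPECTED_MD5 = {
    "Saga Corpus.zip": "79cd702887219a36d150f743396b5fb2",
    "saga_corpus_metadata.zip": "1a5f102544c1824d123d6bf7278f51e4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so the whole archive is never loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 digest matches `expected_md5`."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify("Saga Corpus.zip", EXPECTED_MD5["Saga Corpus.zip"])` should return `True` for an intact download. MD5 is adequate for detecting transfer errors but is not a safeguard against deliberate tampering.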