dc.contributor.author Þorsteinsson, Vilhjálmur
dc.contributor.author Óladóttir, Hulda
dc.date.accessioned 2020-09-30T16:29:42Z
dc.date.available 2020-09-30T16:29:42Z
dc.date.issued 2020-09-25
dc.identifier.uri http://hdl.handle.net/20.500.12537/80
dc.description Icegrams is a Python 3 package that encapsulates a large trigram library for Icelandic. Its roughly 14 million unique trigrams and their frequency counts are heavily compressed using radix tries and quasi-succinct indices employing Elias-Fano encoding. This allows the ~43 megabyte compressed trigram file to be mapped directly into memory, with no prior decompression, for fast queries (typically ~10 microseconds per lookup). More information at: https://github.com/mideind/Icegrams
dc.language.iso isl
dc.publisher Miðeind ehf.
dc.relation.replaces http://hdl.handle.net/20.500.12537/55
dc.relation.isreplacedby http://hdl.handle.net/20.500.12537/176
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri https://github.com/mideind/Icegrams/releases/tag/1.0.2
dc.subject language model
dc.subject trigrams
dc.subject ngrams
dc.title Icegrams (2020-09-30)
dc.type languageDescription
metashare.ResourceInfo#ContentInfo.detailedType other
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
contact.person Vilhjálmur Þorsteinsson mideind@mideind.is Miðeind ehf.
sponsor Ministry of Education, Science and Culture; Word lists and language models (L4); Language Technology for Icelandic 2019-2023; nationalFunds
size.info 14M trigrams
files.size 43217919
files.count 3
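
The description above mentions Elias-Fano encoding. As a rough illustration of the general technique only, the following is a minimal textbook-style sketch in Python. It is not taken from the Icegrams source, and the names `ef_encode` and `ef_get` are hypothetical. Each value in a sorted sequence is split into low bits, stored verbatim, and high bits, stored as positions in a bit vector; a Python integer stands in for the bit vector, and a linear scan stands in for a real select structure.

```python
from math import floor, log2


def ef_encode(values, universe):
    """Encode a sorted list of non-negative ints, each smaller than `universe`."""
    n = len(values)
    # Number of low bits kept verbatim per element: floor(log2(universe / n)).
    low_bits = max(0, floor(log2(universe / n))) if n else 0
    mask = (1 << low_bits) - 1
    lows = [v & mask for v in values]
    high = 0  # a Python int used as a bit vector (illustrative only)
    for i, v in enumerate(values):
        # The i-th element sets the bit at position (high part + i).
        high |= 1 << ((v >> low_bits) + i)
    return low_bits, lows, high


def ef_get(encoded, i):
    """Return the i-th (0-based) value of an encoded sequence."""
    low_bits, lows, high = encoded
    if not 0 <= i < len(lows):
        raise IndexError(i)
    # Naive select: scan for the position of the i-th set bit.
    seen, pos = -1, 0
    while True:
        if (high >> pos) & 1:
            seen += 1
            if seen == i:
                break
        pos += 1
    # The high part is the bit position minus the element index.
    return ((pos - i) << low_bits) | lows[i]


values = [3, 4, 7, 13, 14, 15, 21, 43]
encoded = ef_encode(values, universe=44)
decoded = [ef_get(encoded, i) for i in range(len(values))]
```

A production implementation would pack the low bits tightly and use a constant-time select index instead of the scan, which is what makes lookups fast without decompressing the whole structure.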


Files in this item

Total size of all files: 41.22 MB
This item is Publicly Available and licensed under: The MIT License (MIT)

Name trigrams.bin
Size 41.07 MB
Format Unknown
Description Binary file containing trigrams
MD5 b94753ceb209da31c1fbf5952655a1b3

Name Icegrams-1.0.2.tar.gz
Size 69.59 KB
Format application/gzip
Description Unknown
MD5 9bcf9458970203b59d29d91832bd6834
File preview:
  • Icegrams-1.0.2
    • src
      • icegrams
        • trie.h (3 kB)
        • trie_build.py (4 kB)
        • trie.cpp (24 kB)
        • __init__.py (1 kB)
        • resources
          • correct.txt (57 kB)
          • trigrams.bin (133 B)
          • split.txt (8 kB)
          • delete.txt (1 kB)
        • py.typed (0 B)
        • trie.py (12 kB)
        • ngrams.py (65 kB)
    • setup.py (4 kB)
    • .gitignore (890 B)
    • README.rst (12 kB)
    • .travis.yml (575 B)
    • test.py (1 kB)
    • test
      • test_ngrams.py (3 kB)
    • .gitattributes (72 B)
    • wheels.sh (751 B)
    • utils
      • rmh.py (15 kB)
    • doc
      • overview.md (8 kB)
    • release.sh (522 B)
    • LICENSE (1 kB)
    • MANIFEST.in (227 B)
    • build_wheels.sh (745 B)
    • pax_global_header (52 B)

Name Icegrams-1.0.2.zip
Size 79.94 KB
Format application/zip
Description Unknown
MD5 5c35b8106fa9eef399073e12f1467a5a
File preview:
  • Icegrams-1.0.2
    • src
      • icegrams
        • trie.h (3 kB)
        • trie_build.py (4 kB)
        • trie.cpp (24 kB)
        • __init__.py (1 kB)
        • resources
          • correct.txt (57 kB)
          • trigrams.bin (133 B)
          • split.txt (8 kB)
          • delete.txt (1 kB)
        • py.typed (0 B)
        • trie.py (12 kB)
        • ngrams.py (65 kB)
    • setup.py (4 kB)
    • .gitignore (890 B)
    • README.rst (12 kB)
    • .travis.yml (575 B)
    • test.py (1 kB)
    • test
      • test_ngrams.py (3 kB)
    • .gitattributes (72 B)
    • wheels.sh (751 B)
    • utils
      • rmh.py (15 kB)
    • doc
      • overview.md (8 kB)
    • release.sh (522 B)
    • LICENSE (1 kB)
    • MANIFEST.in (227 B)
    • build_wheels.sh (745 B)
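
Each file in this item is listed with an MD5 checksum, so a download can be checked against the record. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_download` and the `EXPECTED_MD5` table are illustrative and not part of Icegrams or the repository; the checksum values are copied from the listing above, and the files are assumed to sit in the current directory under their listed names.

```python
import hashlib

# MD5 values copied from the file listing of this item.
EXPECTED_MD5 = {
    "trigrams.bin": "b94753ceb209da31c1fbf5952655a1b3",
    "Icegrams-1.0.2.tar.gz": "9bcf9458970203b59d29d91832bd6834",
    "Icegrams-1.0.2.zip": "5c35b8106fa9eef399073e12f1467a5a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify_download("trigrams.bin", EXPECTED_MD5["trigrams.bin"])` should return `True` for an intact download. MD5 is adequate for detecting accidental corruption but is not a defence against deliberate tampering.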
