dc.contributor.author Þorsteinsson, Vilhjálmur
dc.contributor.author Óladóttir, Hulda
dc.date.accessioned 2022-01-21T12:51:29Z
dc.date.available 2022-01-21T12:51:29Z
dc.date.issued 2022
dc.identifier.uri http://hdl.handle.net/20.500.12537/176
dc.description Icegrams is a Python 3 package that encapsulates a large trigram library for Icelandic. Its 14 million unique trigrams and their frequency counts are heavily compressed using radix tries and quasi-succinct indices employing Elias-Fano encoding. As a result, the ~43 megabyte compressed trigram file can be mapped directly into memory, with no prior decompression, for fast queries (typically ~10 microseconds per lookup). More information at: https://github.com/mideind/Icegrams
dc.language.iso isl
dc.publisher Miðeind ehf.
dc.relation.replaces http://hdl.handle.net/20.500.12537/80
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri https://github.com/mideind/Icegrams/releases/tag/1.1.1
dc.subject language model
dc.subject trigrams
dc.subject ngrams
dc.title Icegrams v1.1.1
dc.type languageDescription
metashare.ResourceInfo#ContentInfo.detailedType other
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
contact.person Vilhjálmur Þorsteinsson; mideind@mideind.is; Miðeind ehf.
sponsor Ministry of Education, Science and Culture; Word lists and language models (L4); Language Technology for Icelandic 2019-2023; nationalFunds
size.info 14M trigrams
files.size 154247
files.count 2
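
The description above mentions quasi-succinct indices based on Elias-Fano encoding. As a rough illustration of that idea only, here is a minimal Python sketch; it is not taken from the package's source, and the class name `EliasFano`, its attributes, and the sample values are all hypothetical. Each sorted integer is split into a fixed-width low part, stored verbatim, and a high part, stored in unary in a bit vector, so that any element can be read back without decompressing the rest.

```python
from typing import List


class EliasFano:
    """Toy Elias-Fano encoding of a sorted list of non-negative integers.

    Illustrative only: a production index would bit-pack the low parts
    and use a constant-time select structure instead of a linear scan.
    """

    def __init__(self, values: List[int], universe: int) -> None:
        if not values:
            raise ValueError("values must be non-empty")
        if any(a > b for a, b in zip(values, values[1:])):
            raise ValueError("values must be sorted in non-decreasing order")
        if values[0] < 0 or values[-1] >= universe:
            raise ValueError("values must lie in [0, universe)")
        self.n = len(values)
        # Width of the low part: floor(log2(universe / n)), or 0 if universe < n.
        self.low_bits = max(0, (universe // self.n).bit_length() - 1)
        mask = (1 << self.low_bits) - 1
        # Low parts are kept as-is, one per element.
        self.lows = [v & mask for v in values]
        # High parts in unary: element i sets bit (high_i + i) of a Python int
        # that serves as the bit vector.
        self.highs = 0
        for i, v in enumerate(values):
            self.highs |= 1 << ((v >> self.low_bits) + i)

    def __getitem__(self, i: int) -> int:
        if not 0 <= i < self.n:
            raise IndexError(i)
        # select(i): find the position of the i-th set bit (0-based).
        pos, seen = 0, -1
        while True:
            if (self.highs >> pos) & 1:
                seen += 1
                if seen == i:
                    break
            pos += 1
        # The high part is the number of zero bits before that position.
        return ((pos - i) << self.low_bits) | self.lows[i]


values = [3, 4, 7, 13, 14, 15, 21, 43]
ef = EliasFano(values, universe=44)
decoded = [ef[i] for i in range(len(values))]
```

Because every element is addressable on its own, a structure of this kind can be queried in place, which is what makes mapping a compressed file directly into memory practical.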


 Files in this item

This item is publicly available and licensed under The MIT License (MIT).

Name: Icegrams-1.1.1.tar.gz
Size: 70.1 KB
Format: application/gzip
Description: Unknown
MD5: 978de3eefab2004ad0d6a69889ba0920
File preview:
  • Icegrams-1.1.1
    • src
      • icegrams
        • trie.h (3 kB)
        • trie_build.py (4 kB)
        • trie.cpp (24 kB)
        • __init__.py (1 kB)
        • resources
          • correct.txt (57 kB)
          • trigrams.bin (133 B)
          • split.txt (8 kB)
          • delete.txt (1 kB)
        • py.typed (0 B)
        • trie.py (12 kB)
        • ngrams.py (65 kB)
    • setup.py (4 kB)
    • .gitignore (890 B)
    • README.rst (13 kB)
    • .travis.yml (2 kB)
    • test.py (1 kB)
    • test
      • test_ngrams.py (3 kB)
    • .gitattributes (72 B)
    • wheels.sh (937 B)
    • utils
      • rmh.py (15 kB)
    • doc
      • overview.md (8 kB)
    • release.sh (522 B)
    • LICENSE (1 kB)
    • MANIFEST.in (227 B)
    • build_wheels.sh (898 B)
    • pax_global_header (52 B)

Name: Icegrams-1.1.1.zip
Size: 80.54 KB
Format: application/zip
Description: Unknown
MD5: 18bfa19f6d7d2f02ea133a5759c1bd98
File preview:
  • Icegrams-1.1.1
    • src
      • icegrams
        • trie.h (3 kB)
        • trie_build.py (4 kB)
        • trie.cpp (24 kB)
        • __init__.py (1 kB)
        • resources
          • correct.txt (57 kB)
          • trigrams.bin (133 B)
          • split.txt (8 kB)
          • delete.txt (1 kB)
        • py.typed (0 B)
        • trie.py (12 kB)
        • ngrams.py (65 kB)
    • setup.py (4 kB)
    • .gitignore (890 B)
    • README.rst (13 kB)
    • .travis.yml (2 kB)
    • test.py (1 kB)
    • test
      • test_ngrams.py (3 kB)
    • .gitattributes (72 B)
    • wheels.sh (937 B)
    • utils
      • rmh.py (15 kB)
    • doc
      • overview.md (8 kB)
    • release.sh (522 B)
    • LICENSE (1 kB)
    • MANIFEST.in (227 B)
    • build_wheels.sh (898 B)
