
 
dc.contributor.author Hernández Mena, Carlos Daniel
dc.contributor.author Lamhauge, Sandra Saxov
dc.contributor.author Debess, Iben Nyholm
dc.contributor.author Simonsen, Annika
dc.date.accessioned 2022-12-23T09:33:50Z
dc.date.available 2022-12-23T09:33:50Z
dc.date.issued 2022-12-08
dc.identifier.uri http://hdl.handle.net/20.500.12537/304
dc.description - ENGLISH In the context of Automatic Speech Recognition (ASR), an n-gram language model is a plain-text file containing the probabilities of word sequences of distinct lengths, or "n-grams" (for example, a sequence of one word is a 1-gram, a sequence of two words is a 2-gram, and so on). Accordingly, the "Faroese Language Models with Pronunciations" is a set of n-gram language models in ARPA format, along with pronunciation dictionaries containing the words that are present in those language models. ICELANDIC (translated) An n-gram language model used in speech recognition is a text file containing the probabilities of a given word or word sequence (in this context, one word is a 1-gram, two words a 2-gram, and so on). Accordingly, "Færeysk Mállíkön með framburði" ("Faroese Language Models with Pronunciations") is a collection of n-gram language models in ARPA format with corresponding pronunciation dictionaries containing all the words that occur in the models.
dc.language.iso fao
dc.publisher Reykjavík University
dc.rights Creative Commons - Attribution 4.0 International (CC BY 4.0)
dc.rights.uri https://creativecommons.org/licenses/by/4.0/
dc.rights.label PUB
dc.subject language model
dc.subject faroese
dc.subject n-gram
dc.subject pronunciation models
dc.subject pronunciation dictionary
dc.title Faroese Language Models with Pronunciations
dc.type languageDescription
metashare.ResourceInfo#ContentInfo.detailedType ngrammodel
metashare.ResourceInfo#ContentInfo.mediaType text
has.files yes
branding Clarin IS Repository
contact.person Carlos Daniel Hernández Mena carlos.mena@ciempiess.org Reykjavík University
size.info 1.2 GB
size.info 1750000 n-grams
size.info 6575192 4-grams
size.info 4839253 trigrams
files.size 571852988
files.count 2
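
The description above defines n-grams as word sequences of a given length. As a minimal sketch of that idea, the following Python snippet extracts n-grams from a toy sentence and computes an unsmoothed bigram probability from counts. The sentence and the function names `ngrams` and `mle_bigram_prob` are hypothetical, chosen only for illustration. ARPA files normally store smoothed log10 probabilities with backoff weights, and how the models in this item were estimated is not stated here, so this is not a reproduction of their training.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous sequence of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical toy sentence; it is not taken from the Faroese models.
tokens = "the cat sat on the mat".split()

unigram_counts = Counter(ngrams(tokens, 1))
bigram_counts = Counter(ngrams(tokens, 2))


def mle_bigram_prob(w1, w2):
    """Unsmoothed estimate of P(w2 | w1) = count(w1 w2) / count(w1)."""
    return bigram_counts[(w1, w2)] / unigram_counts[(w1,)]


print(ngrams(tokens, 2))
print(mle_bigram_prob("the", "cat"))
```

Here "the" occurs twice and "the cat" once, so the estimate for "cat" following "the" is 0.5. A real language model would smooth such estimates so that unseen sequences do not get zero probability.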


 Files in this item

 Total size of all files in item: 545.36 MB
This item is publicly available and licensed under:
Creative Commons - Attribution 4.0 International (CC BY 4.0)

Name: Faroese_LMs_with_Prons.zip
Size: 545.35 MB
Format: application/zip
Description: Faroese Language Models
MD5: 884ffcca44a5e1259219ba7d4dea3e04

 File Preview
  • Faroese_LMs_with_Prons
    • README.txt
    • models
      • Phonemes
        • asr.phones
        • central.phones
        • east.phones
      • 4-gram_lm
        • 4GRAM_ARPA_MODEL_PRUNED.lm
        • 4GRAM_ARPA_MODEL.lm
      • Pron_Dicts
        • East_Faroese.dic
        • Ravnursson_Composite_Words.dic
        • FAROESE_ASR.dic
        • Central_Faroese.dic
        • BLARK.dic
      • 3-gram_lm
        • 3GRAM_ARPA_MODEL.lm
        • 3GRAM_ARPA_MODEL_PRUNED.lm
      • 6-gram-lm
        • FAROESE_6GRAM_NeMo_LM.binary
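
The `.lm` files listed above are described as ARPA-format models. In the usual ARPA layout, a `\data\` header lists one `ngram N=COUNT` line per order before the n-gram sections begin. The sketch below reads those counts from any iterable of lines. The function name `read_arpa_counts` and the sample lines are hypothetical, and it is an assumption that the files in this archive follow the standard header layout and are UTF-8 encoded.

```python
import re


def read_arpa_counts(lines):
    """Return {order: count} parsed from the \\data\\ header of an ARPA file."""
    counts = {}
    in_data = False
    for raw in lines:
        line = raw.strip()
        if line == "\\data\\":
            in_data = True
            continue
        if not in_data:
            continue
        match = re.match(r"ngram\s+(\d+)\s*=\s*(\d+)$", line)
        if match:
            counts[int(match.group(1))] = int(match.group(2))
        elif line.startswith("\\"):
            # The first section marker (e.g. \1-grams:) ends the header.
            break
    return counts


# Hypothetical header, invented for illustration only.
SAMPLE = [
    "\\data\\",
    "ngram 1=3",
    "ngram 2=2",
    "",
    "\\1-grams:",
    "-0.5\t<s>\t-0.3",
]

print(read_arpa_counts(SAMPLE))

# Possible use on an extracted model (path and encoding are assumptions):
# with open("Faroese_LMs_with_Prons/models/4-gram_lm/4GRAM_ARPA_MODEL.lm",
#           encoding="utf-8") as handle:
#     print(read_arpa_counts(handle))
```

Because the function stops at the first section marker, it reads only the header and does not load the full model into memory.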
Name: README.txt
Size: 8.26 KB
Format: Text file
Description: Readme file
MD5: 73e4da19918aafe3122081374003ad4f
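
The MD5 values listed for the two files can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `is_intact` are hypothetical, and the file paths in the usage comment assume the files were saved under their listed names in the current directory.

```python
import hashlib

# Checksums as listed in this item record.
EXPECTED_MD5 = {
    "Faroese_LMs_with_Prons.zip": "884ffcca44a5e1259219ba7d4dea3e04",
    "README.txt": "73e4da19918aafe3122081374003ad4f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()


# Possible use after downloading (local paths are assumptions):
# for name, expected in EXPECTED_MD5.items():
#     print(name, "OK" if is_intact(name, expected) else "MISMATCH")
```

Reading in chunks keeps memory use low for the roughly 545 MB archive. MD5 is suitable here for detecting accidental corruption, not for protection against deliberate tampering.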
 File Preview  
-------------------------------------------------------------------------------
                 Faroese Language Models with Pronunciations
-------------------------------------------------------------------------------

Authors         : Carlos Daniel Hernández Mena, Sandra Saxov Lamhauge,
                  Iben Nyholm Debess, Annika Simonsen.

Language        : Faroese.

Recommended use : speech recognition.

-------------------------------------------------------------------------------
Description
-------------------------------------------------------------------------------

In the context of Automatic Speech Recognition (ASR), an n-gram language model
is a plain-text file containing the probabilities of word sequences of
distinct lengths, or "n-grams" (for example, a sequence of one word is a 1-gram,
a sequence of two words is a 2-gram, and so on). Accordingly, the "Faroese
Language Models with Pronunciations" is a set of n-gram language models in ARPA
format along wit . . .
                                            
