dc.contributor.author | Hernández Mena, Carlos Daniel |
dc.contributor.author | Lamhauge, Sandra Saxov |
dc.contributor.author | Debess, Iben Nyholm |
dc.contributor.author | Simonsen, Annika |
dc.date.accessioned | 2022-12-23T09:33:50Z |
dc.date.available | 2022-12-23T09:33:50Z |
dc.date.issued | 2022-12-08 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/304 |
dc.description | - ENGLISH In the context of Automatic Speech Recognition (ASR), an n-gram language model is a plain-text file containing the probabilities of word sequences of distinct lengths, or "n-grams" (for example, a sequence of one word is a 1-gram, a sequence of two words is a 2-gram, and so on). Accordingly, the "Faroese Language Models with Pronunciations" is a set of n-gram language models in ARPA format, along with pronunciation dictionaries containing the words that are present in those language models. ICELANDIC An n-gram-based language model used in speech recognition is a text file containing the probabilities of a given word or word sequence (in this context, one word is a 1-gram, two words a 2-gram, and so on). Accordingly, "Faroese Language Models with Pronunciations" is a collection of n-gram language models in ARPA format, with corresponding pronunciation dictionaries containing all the words that occur in the models. |
dc.language.iso | fao |
dc.publisher | Reykjavík University |
dc.rights | Creative Commons - Attribution 4.0 International (CC BY 4.0) |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ |
dc.rights.label | PUB |
dc.subject | language model |
dc.subject | faroese |
dc.subject | n-gram |
dc.subject | pronunciation models |
dc.subject | pronunciation dictionary |
dc.title | Faroese Language Models with Pronunciations |
dc.type | languageDescription |
metashare.ResourceInfo#ContentInfo.detailedType | ngrammodel |
metashare.ResourceInfo#ContentInfo.mediaType | text |
has.files | yes |
branding | Clarin IS Repository |
contact.person | Carlos Daniel Hernández Mena carlos.mena@ciempiess.org Reykjavík University |
size.info | 1.2 GB |
size.info | 1750000 n-grams |
size.info | 6575192 4-grams |
size.info | 4839253 trigrams |
files.size | 571852988 |
files.count | 2 |
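As a rough illustration of the ARPA layout mentioned in the description, the sketch below parses a tiny hand-made model. It assumes the usual ARPA conventions: a `\data\` header with `ngram N=count` lines, one `\N-grams:` section per order, tab-separated fields (log10 probability, word sequence, optional backoff weight), and a closing `\end\` marker. The sample words, the probabilities, and the function name `parse_arpa` are invented for illustration and are not taken from the Faroese models.

```python
import io

# Toy ARPA text. Words and probabilities are invented for illustration only.
TOY_ARPA = """\
\\data\\
ngram 1=3
ngram 2=2

\\1-grams:
-0.5\t<s>\t-0.3
-0.7\ta\t-0.2
-0.9\t</s>

\\2-grams:
-0.4\t<s> a
-0.6\ta </s>

\\end\\
"""


def parse_arpa(stream):
    """Return (counts, ngrams) read from an ARPA-format text stream."""
    counts = {}   # order -> number of n-grams declared in the \data\ header
    ngrams = {}   # tuple of words -> (log10 probability, backoff or None)
    order = None  # order of the section currently being read
    for raw in stream:
        line = raw.strip()
        if not line or line == "\\data\\":
            continue
        if line == "\\end\\":
            break
        if line.startswith("ngram "):
            n, count = line[len("ngram "):].split("=")
            counts[int(n)] = int(count)
        elif line.startswith("\\") and line.endswith("-grams:"):
            # Section marker such as "\2-grams:".
            order = int(line[1:line.index("-")])
        elif order is not None:
            # Assumes tab-separated fields, as written by common toolkits.
            fields = line.split("\t")
            logprob = float(fields[0])
            words = tuple(fields[1].split())
            backoff = float(fields[2]) if len(fields) > 2 else None
            ngrams[words] = (logprob, backoff)
    return counts, ngrams


counts, ngrams = parse_arpa(io.StringIO(TOY_ARPA))
print(counts)  # {1: 3, 2: 2}
logprob, _ = ngrams[("<s>", "a")]
print(10 ** logprob)  # probabilities are stored as base-10 logarithms
```

The parser keeps every n-gram in memory, which is fine for a toy sample; for the full models in this item, reading only the `\data\` header or using a dedicated toolkit would likely be more practical.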
Files in this item
Download all files in item (545.36 MB)
This item is Publicly Available and licensed under: Creative Commons - Attribution 4.0 International (CC BY 4.0)
- Name
- Faroese_LMs_with_Prons.zip
- Size
- 545.35 MB
- Format
- application/zip
- Description
- Faroese Language Models
- MD5
- 884ffcca44a5e1259219ba7d4dea3e04
- Faroese_LMs_with_Prons
- README.txt
- models
- Phonemes
- asr.phones
- central.phones
- east.phones
- 4-gram_lm
- 4GRAM_ARPA_MODEL_PRUNED.lm
- 4GRAM_ARPA_MODEL.lm
- Pron_Dicts
- East_Faroese.dic
- Ravnursson_Composite_Words.dic
- FAROESE_ASR.dic
- Central_Faroese.dic
- BLARK.dic
- 3-gram_lm
- 3GRAM_ARPA_MODEL.lm
- 3GRAM_ARPA_MODEL_PRUNED.lm
- 6-gram-lm
- FAROESE_6GRAM_NeMo_LM.binary
- Phonemes
- Name
- README.txt
- Size
- 8.26 KB
- Format
- Text file
- Description
- Readme file
- MD5
- 73e4da19918aafe3122081374003ad4f
-------------------------------------------------------------------------------
Faroese Language Models with Pronunciations
-------------------------------------------------------------------------------
Authors : Carlos Daniel Hernández Mena, Sandra Saxov Lamhauge, Iben Nyholm Debess, Annika Simonsen.
Language : Faroese.
Recommended use : speech recognition.
-------------------------------------------------------------------------------
Description
-------------------------------------------------------------------------------
In the context of Automatic Speech Recognition (ASR), an n-gram language model is a plain-text file containing the probabilities of word sequences of distinct lengths, or "n-grams" (for example, a sequence of one word is a 1-gram, a sequence of two words is a 2-gram, and so on). Accordingly, the "Faroese Language Models with Pronunciations" is a set of n-gram language models in ARPA format along wit . . .
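The MD5 values listed for each file can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib`; it assumes the two files were saved under their listed names in the current directory, and the helper names `md5_of_file` and `verify` are illustrative rather than part of the resource.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed in this item's file details.
EXPECTED = {
    "Faroese_LMs_with_Prons.zip": "884ffcca44a5e1259219ba7d4dea3e04",
    "README.txt": "73e4da19918aafe3122081374003ad4f",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Chunked reads keep memory use low for the roughly 545 MB archive.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_hex.lower()


if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        # Assumed location: the current working directory.
        if Path(name).exists():
            status = "OK" if verify(name, expected) else "MISMATCH"
        else:
            status = "not found"
        print(f"{name}: {status}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.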