
 
dc.contributor.author Jasonarson, Atli
dc.contributor.author Steingrímsson, Steinþór
dc.contributor.author Sigurðsson, Einar Freyr
dc.contributor.author Daðason, Jón Friðrik
dc.date.accessioned 2022-12-12T09:25:34Z
dc.date.available 2022-12-12T09:25:34Z
dc.date.issued 2022-12-01
dc.identifier.uri http://hdl.handle.net/20.500.12537/302
dc.description This Universal Dependencies (UD) parser for Icelandic was trained with Diaparser [1] on v2.11 of UD_Icelandic-IcePaHC [2] and UD_Icelandic-Modern [3]. The texts in UD_Icelandic-Modern [3] labeled RUV_TGS_2017 and RUV_ESP_2017 were excluded from training because they were originally parsed with COMBO-based UD Parser 22.10 [4] and the output was subsequently corrected. The parser uses information from an ELECTRA language model [5]. Its unlabeled attachment score (UAS) is 89.58 and its labeled attachment score (LAS) is 86.46. [1] Diaparser: https://github.com/Unipisa/diaparser [2] UD_Icelandic-IcePaHC: https://github.com/UniversalDependencies/UD_Icelandic-IcePaHC/ [3] UD_Icelandic-Modern: https://github.com/UniversalDependencies/UD_Icelandic-Modern/ [4] COMBO-based UD Parser 22.10: http://hdl.handle.net/20.500.12537/272 [5] electra-base-igc-is: https://huggingface.co/jonfd/electra-base-igc-is
dc.language.iso isl
dc.publisher The Árni Magnússon Institute for Icelandic Studies
dc.relation.replaces http://hdl.handle.net/20.500.12537/273
dc.rights Apache License 2.0
dc.rights.uri https://opensource.org/license/apache2-0-php/
dc.rights.label PUB
dc.subject universal dependencies
dc.subject parsing
dc.title Biaffine-based UD Parser for Icelandic 22.12
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
contact.person Steinþór Steingrímsson steinthor.steingrimsson@arnastofnun.is The Árni Magnússon Institute for Icelandic Studies
sponsor Ministry of Education, Science and Culture (Mennta- og menningarmálaráðuneytið) I5 – Parsers Language Technology for Icelandic 2019-2023 nationalFunds
files.size 1281221820
files.count 6
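
The description reports a UAS of 89.58 and a LAS of 86.46. As a rough illustration of what these two numbers measure, the sketch below computes both scores from a gold and a predicted analysis in CoNLL-U format. It is a minimal sketch, not the evaluation procedure behind the reported figures: the record does not say which evaluation script was used, the helper names (`conllu_rows`, `read_arcs`, `attachment_scores`) are hypothetical, the toy annotations are invented for illustration and are not output of this parser, and the sketch compares full DEPREL strings and assumes both files have identical tokenization.

```python
def conllu_rows(words):
    """Build minimal CoNLL-U text from (form, head, deprel) triples."""
    lines = []
    for idx, (form, head, deprel) in enumerate(words, start=1):
        # Ten tab-separated CoNLL-U columns; unused ones are "_".
        cols = [str(idx), form, "_", "_", "_", "_", str(head), deprel, "_", "_"]
        lines.append("\t".join(cols))
    return "\n".join(lines) + "\n"


def read_arcs(conllu_text):
    """Return one (head, deprel) pair per syntactic word."""
    arcs = []
    for line in conllu_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        if "-" in cols[0] or "." in cols[0]:
            continue  # skip multiword-token ranges and empty nodes
        arcs.append((cols[6], cols[7]))
    return arcs


def attachment_scores(gold_text, pred_text):
    """Return (UAS, LAS) as percentages over all words."""
    gold = read_arcs(gold_text)
    pred = read_arcs(pred_text)
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and predicted files must align word by word")
    # UAS: the head is correct. LAS: head and relation label are both correct.
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))
    both_ok = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * head_ok / len(gold), 100.0 * both_ok / len(gold)


# Toy annotations, invented purely for illustration.
GOLD = conllu_rows([("Komið", 0, "root"), ("þið", 1, "nsubj"),
                    ("sæl", 1, "xcomp"), (".", 1, "punct")])
PRED = conllu_rows([("Komið", 0, "root"), ("þið", 3, "nsubj"),
                    ("sæl", 1, "amod"), (".", 1, "punct")])

uas, las = attachment_scores(GOLD, PRED)
print(f"UAS {uas:.2f}  LAS {las:.2f}")  # prints "UAS 75.00  LAS 50.00"
```

In the toy example one word has the wrong head and another has the right head but the wrong label, so three of four heads are correct (UAS 75) and two of four head-plus-label pairs are correct (LAS 50).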


 Files in this item

This item is publicly available and licensed under the Apache License 2.0. Total download size: 1.19 GB.
Name parse_file.py
Size 609 bytes
Format Unknown
Description Usage example
MD5 8c05e9ea1eac7f05e322cacae0b12855

Name requirements.txt
Size 34 bytes
Format Text file
Description requirements
MD5 3d6e6d73fa2dc7fe8f59bd559cc75f8b
File Preview
diaparser==1.1.2
Tokenizer==3.4.2
                                            
Name test_file.txt
Size 103 bytes
Format Text file
Description File for usage example
MD5 e826929ab7cdf131075abb15f710b634
File Preview
Komið þið sæl.
Þetta skjal er ætlað til að sýna hvernig þáttarinn virkar.
Njótið dagsins.
                                            
Name diap.zip
Size 438.74 MB
Format application/zip
Description The parser itself
MD5 3b10b3f539aa179a773094bf1e52b6a6

Name electra.zip
Size 783.13 MB
Format application/zip
Description The transformer model
MD5 a94b9e389a64e58af1d2af01a5c56e69
File Preview
  • transformer_models
    • electra-base-igc-is
      • config.json (466 B)
      • README.md (666 B)
      • tokenizer_config.json (73 B)
      • special_tokens_map.json (112 B)
      • .git
        • logs
        • info
          • exclude (240 B)
        • config (304 B)
        • index (625 B)
        • packed-refs (112 B)
        • HEAD (21 B)
        • refs
        • description (73 B)
        • hooks
          • post-commit (278 B)
          • push-to-checkout.sample (2 kB)
          • applypatch-msg.sample (478 B)
          • pre-push.sample (1 kB)
          • commit-msg.sample (896 B)
          • pre-rebase.sample (4 kB)
          • post-checkout (282 B)
          • post-update.sample (189 B)
          • pre-receive.sample (544 B)
          • pre-push (272 B)
          • pre-applypatch.sample (424 B)
          • update.sample (3 kB)
          • pre-commit.sample (1 kB)
          • pre-merge-commit.sample (416 B)
          • fsmonitor-watchman.sample (4 kB)
          • post-merge (276 B)
          • prepare-commit-msg.sample (1 kB)
        • objects
          • 0a
            • 436181c565848a6acde7a8d56b3d0083065d4d (127 B)
          • a1
            • febe62ff74744a3bdc90765101a93f8165f96c (446 B)
          • 01
            • 3c0d5067a7209f20d1483e98daf266743c3716 (265 B)
          • info
            • cc
              • af6b68a21f6293f20e96decf431584115c3206 (126 B)
            • e7
              • b0375001f109a6b8873d756ad4f7bbb15fbaa5 (82 B)
            • 21
              • b29ab864793dc86d157902941b2ff4bbe2bbca (58 B)
            • b6
              • e1921af19d17e863490c057c473d1bbe5ece81 (79 B)
            • e2
              • 921de06b441e2a3066da485d6fa31cf5c816a8 (170 B)
            • 42
              • a65ff035a31364a5df021edbca71fc835f8f53 (133 kB)
            • pack
              • d8
                • 8c1c6bf57ec076a4c43dac202f23f71d6cbdad (277 B)
              • 6d
                • 34772f5ca361021038b404fb913ec8dc0b1a5a (193 B)
            • branches
              • lfs
            • pytorch_model.bin (422 MB)
            • .gitattributes (1 kB)
            • vocab.txt (253 kB)

Name README.md
Size 2.91 KB
Format Unknown
Description readme
MD5 ff066b0557db17f9282cb59e8bd4ea6f
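
Each file above is listed with an MD5 checksum, which can be used to confirm that a download arrived intact. The following is a minimal sketch of such a check using only the Python standard library. The checksums are copied from the listing above; the helper names (`md5_of`, `verify`) and the assumption that all six files sit in one local directory under their listed names are illustrative and not part of the record.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed in this record.
EXPECTED_MD5 = {
    "parse_file.py": "8c05e9ea1eac7f05e322cacae0b12855",
    "requirements.txt": "3d6e6d73fa2dc7fe8f59bd559cc75f8b",
    "test_file.txt": "e826929ab7cdf131075abb15f710b634",
    "diap.zip": "3b10b3f539aa179a773094bf1e52b6a6",
    "electra.zip": "a94b9e389a64e58af1d2af01a5c56e69",
    "README.md": "ff066b0557db17f9282cb59e8bd4ea6f",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True if it exists and its digest matches."""
    results = {}
    for name, wanted in expected.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == wanted
    return results
```

Calling `verify("downloads")`, where `downloads` is a hypothetical directory holding the files, returns a dictionary with one boolean per file; a `False` entry means the file is missing or its checksum differs from the one listed here. Reading in chunks matters mainly for the two zip archives, which are several hundred megabytes each.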
