
 
dc.contributor.author Símonarson, Haukur Barri
dc.contributor.author Snæbjarnarson, Vésteinn
dc.contributor.author Ragnarsson, Pétur Orri
dc.contributor.author Jónsson, Haukur Páll
dc.contributor.author Ingólfsdóttir, Svanhvít Lilja
dc.contributor.author Þorsteinsson, Vilhjálmur
dc.date.accessioned 2022-09-23T09:18:50Z
dc.date.available 2022-09-23T09:18:50Z
dc.date.issued 2022-09-20
dc.identifier.uri http://hdl.handle.net/20.500.12537/257
dc.description GreynirSeq is a natural language processing toolkit for Icelandic focused on sequence modeling with neural networks. The modeling part (nicenlp) of GreynirSeq is built on top of Fairseq from Meta, which is in turn built on PyTorch. This version (v0.2.0) includes command-line interfaces for POS tagging, NER tagging, and machine translation. For updated versions of the software, please refer to https://github.com/mideind/GreynirSeq
dc.language.iso isl
dc.publisher Miðeind ehf
dc.relation.isreferencedby http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.464.pdf
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri https://github.com/mideind/GreynirSeq
dc.subject toolkit
dc.subject models
dc.subject sequence modeling
dc.title GreynirSeq - A Natural Language Processing Toolkit for Icelandic (v0.2.0)
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
demo.uri https://github.com/mideind/GreynirSeq
contact.person Vésteinn Snæbjarnarson vesteinn@mideind.is Miðeind ehf
sponsor Ministry of Education, Science and Culture; V4a - MT for Icelandic; Language Technology for Icelandic 2019-2023; nationalFunds
files.size 1804624
files.count 1


 Files in this item

This item is Publicly Available and licensed under: The MIT License (MIT)
Name GreynirSeq-0.2.0.zip
Size 1.72 MB
Format application/zip
Description Unknown
MD5 8d1b588c788667f7ec672c6fda2643c2
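
The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are hypothetical, and the local path `GreynirSeq-0.2.0.zip` assumes the archive was saved under its listed name in the current directory. MD5 detects accidental corruption only and should not be treated as a security guarantee.

```python
import hashlib
from pathlib import Path

# MD5 checksum as listed in this record.
EXPECTED_MD5 = "8d1b588c788667f7ec672c6fda2643c2"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local download location (hypothetical; adjust as needed).
    archive = Path("GreynirSeq-0.2.0.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

Reading in chunks keeps memory use constant regardless of archive size.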
 File Preview  
  • GreynirSeq-0.2.0
    • src
      • greynirseq
        • cli
          • greynirseq_main.py (11 kB)
          • __init__.py (0 B)
        • utils
          • ifd_utils.py (10 kB)
          • split_train_dev.py (2 kB)
          • __init__.py (0 B)
          • tests
            • test_ifd_label_schema.py (690 B)
          • train_byte_bpe.py (1 kB)
          • preprocessing
            • parse_ifd.py (1 kB)
            • deduplicate.py (3 kB)
            • filters.py (30 kB)
            • __init__.py (0 B)
            • symbols.py (11 kB)
          • qa
            • lemmatizer.py (5 kB)
            • __init__.py (0 B)
            • split_wiki.py (2 kB)
          • split_train_dev_rand.py (2 kB)
          • ifd_label_schema.py (7 kB)
          • tokenize_splitter.py (1 kB)
          • bpe
            • multiprocessing_bpe_encoder.py (3 kB)
        • settings.py (942 B)
        • noising
          • README.md (210 B)
          • generate_errors.py (2 kB)
          • __init__.py (0 B)
          • test_file.txt (501 B)
          • ieg
            • dataset.py (3 kB)
            • __init__.py (94 B)
            • errorrules
              • errors.py (1 kB)
              • __init__.py (295 B)
              • noise.py (468 B)
              • swap.py (795 B)
              • dativitis.py (2 kB)
              • mood.py (929 B)
              • nouncase.py (1 kB)
              • duplicate.py (647 B)
              • spaces.py (1 kB)
            • spelling
              • errorify.py (2 kB)
              • __init__.py (0 B)
              • rules
                • simple.txt (1 kB)
                • regex.txt (51 B)
                • word_pairs_combined.txt (7 MB)
              • rules.py (10 kB)
              • fixed_random.py (646 B)
        • __init__.py (34 B)
        • nicenlp
          • utils
            • linear_sampler.py (2 kB)
            • __init__.py (0 B)
            • dictionary.py (1 kB)
            • reshape_mbart_checkpoint_embeddings.py (2 kB)
            • logits_filter.py (2 kB)
            • label_schema
              • readme.md (2 kB)
              • __init__.py (0 B)
              • label_schema.py (2 kB)
            • ner_parser.py (1 kB)
            • tests.py (1 kB)
            • constituency
              • tree_dist.pyx (10 kB)
              • prep_greynir.py (4 kB)
              • token_utils.py (703 B)
              • chart_parser.pyx (6 kB)
              • preprocess_labelled_spans.py (12 kB)
              • greynir_utils.py (36 kB)
              • __init__.py (0 B)
              • unary_branch_labels.py (5 kB)
          • examples
            • translation
              • README.md (4 kB)
            • pos
              • prep_mim_pos.sh (3 kB)
              • README.md (3 kB)
              • labdict.txt (359 B)
              • train.sh (1 kB)
              • terms.json (4 kB)
            • ner
              • README.md (2 kB)
              • train.sh (1 kB)
              • predict_ner.py (1 kB)
              • ner_train.sh (1 kB)
            • constituency_parsing
              • predict_file.py (3 kB)
              • README.md (1 kB)
              • infer_file.sh (357 B)
              • pretrain.sh (2 kB)
              • finetune.sh (2 kB)
              • encode_data.py (873 B)
              • Dockerfile (1 kB)
              • prepare_data.sh (8 kB)
          • criterions
            • parser_criterion.py (12 kB)
            • multilabel_token_classification_criterion.py (7 kB)
            • multiclass_token_classification_criterion.py (3 kB)
            • __init__.py (0 B)
            • multi_span_prediction_criterion.py (18 kB)
          • __init__.py (560 B)
          • data
            • tests.py (2 kB)
            • mutex_binary_dataset.py (2 kB)
            • lookup_dataset.py (762 B)
            • encoding.py (781 B)
            • lazymmapdataset.py (2 kB)
            • __init__.py (139 B)
            • datasets.py (12 kB)
          • tasks
            • translation_with_glossary.py (16 kB)
            • multilabel_token_classification_task.py (12 kB)
            • parser_task.py (8 kB)
            • translation_with_backtranslation.py (28 kB)
            • multiclass_token_classification_task.py (4 kB)
            • __init__.py (24 B)
            • multilabel_multispan_classification.py (7 kB)
            • translation_from_pretrained_bart_with_domain.py (9 kB)
          • models
            • simple_parser.py (8 kB)
            • bart.py (2 kB)
            • __init__.py (1 kB)
            • multilabel.py (10 kB)
            • multiclass.py (8 kB)
        • serve
          • serve_all.py (8 kB)
          • __init__.py (0 B)
        • ner
          • patcher.py (4 kB)
          • aligner.py (18 kB)
          • README.md (3 kB)
          • ner_f1_stats.py (5 kB)
          • postagger.py (3 kB)
          • nertagger.py (3 kB)
          • __init__.py (0 B)
          • testdata
            • names.txt (106 B)
            • en_is.tsv (189 B)
    • README.md (3 kB)
    • .gitignore (2 kB)
    • tests
      • conftest.py (0 B)
      • __init__.py (0 B)
      • test_glossary.py (8 kB)
      • ner
        • test_bioparser.py (621 B)
    • .flake8 (58 B)
    • assets
      • greynir-logo-large.png (56 kB)
    • build.py (2 kB)
    • run_linter.sh (108 B)
    • pyproject.toml (2 kB)
    • GNU_AFFERO_LICENSE (31 kB)
    • .github
    • LICENSE (1 kB)
    • hubconf.py (2 kB)
