dc.contributor.author | Símonarson, Haukur Barri |
dc.contributor.author | Snæbjarnarson, Vésteinn |
dc.contributor.author | Ragnarsson, Pétur Orri |
dc.contributor.author | Jónsson, Haukur Páll |
dc.contributor.author | Ingólfsdóttir, Svanhvít Lilja |
dc.contributor.author | Þorsteinsson, Vilhjálmur |
dc.date.accessioned | 2022-09-23T09:18:50Z |
dc.date.available | 2022-09-23T09:18:50Z |
dc.date.issued | 2022-09-20 |
dc.identifier.uri | http://hdl.handle.net/20.500.12537/257 |
dc.description | GreynirSeq is a natural language parsing toolkit for Icelandic focused on sequence modeling with neural networks. The modeling part (nicenlp) of GreynirSeq is built on top of the excellent Fairseq from Meta, which is in turn built on top of PyTorch. This version (v0.2.0) includes command-line interfaces for POS tagging, NER tagging and machine translation. For updated versions of the software, please refer to https://github.com/mideind/GreynirSeq |
dc.language.iso | isl |
dc.publisher | Miðeind ehf |
dc.relation.isreferencedby | http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.464.pdf |
dc.rights | The MIT License (MIT) |
dc.rights.uri | https://opensource.org/licenses/mit-license.php |
dc.rights.label | PUB |
dc.source.uri | https://github.com/mideind/GreynirSeq |
dc.subject | toolkit |
dc.subject | models |
dc.subject | sequence modeling |
dc.title | GreynirSeq - A Natural Language Processing Toolkit for Icelandic (v0.2.0) |
dc.type | toolService |
metashare.ResourceInfo#ContentInfo.detailedType | tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
has.files | yes |
branding | Clarin IS Repository |
demo.uri | https://github.com/mideind/GreynirSeq |
contact.person | Vésteinn Snæbjarnarson vesteinn@mideind.is Miðeind ehf |
sponsor | Ministry of Education, Science and Culture; V4a - MT for Icelandic; Language Technology for Icelandic 2019-2023; nationalFunds |
files.size | 1804624 |
files.count | 1 |
Files in this item
- Name: GreynirSeq-0.2.0.zip
- Size: 1.72 MB
- Format: application/zip
- Description: Unknown
- MD5: 8d1b588c788667f7ec672c6fda2643c2
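The size (files.size, 1804624 bytes) and MD5 checksum listed in this record can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of GreynirSeq or the repository software; the function names `md5_of_file` and `matches_record` are hypothetical, and it assumes the archive has been saved locally.

```python
import hashlib
from pathlib import Path

# Values copied from this record (files.size and the MD5 entry).
EXPECTED_SIZE = 1804624
EXPECTED_MD5 = "8d1b588c788667f7ec672c6fda2643c2"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Hypothetical helper: True if both size and MD5 match the expected values."""
    path = Path(path)
    if not path.is_file():
        return False
    return path.stat().st_size == expected_size and md5_of_file(path) == expected_md5
```

Calling `matches_record("GreynirSeq-0.2.0.zip")` on the downloaded file returns `True` only when both values agree with the record. MD5 is suitable for detecting accidental corruption, not deliberate tampering.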
- GreynirSeq-0.2.0
- src
- greynirseq
- cli
- greynirseq_main.py (11 kB)
- __init__.py (0 B)
- utils
- ifd_utils.py (10 kB)
- split_train_dev.py (2 kB)
- __init__.py (0 B)
- tests
- test_ifd_label_schema.py (690 B)
- train_byte_bpe.py (1 kB)
- preprocessing
- parse_ifd.py (1 kB)
- deduplicate.py (3 kB)
- filters.py (30 kB)
- __init__.py (0 B)
- symbols.py (11 kB)
- qa
- lemmatizer.py (5 kB)
- __init__.py (0 B)
- split_wiki.py (2 kB)
- split_train_dev_rand.py (2 kB)
- ifd_label_schema.py (7 kB)
- tokenize_splitter.py (1 kB)
- bpe
- multiprocessing_bpe_encoder.py (3 kB)
- settings.py (942 B)
- noising
- README.md (210 B)
- generate_errors.py (2 kB)
- __init__.py (0 B)
- test_file.txt (501 B)
- ieg
- dataset.py (3 kB)
- __init__.py (94 B)
- errorrules
- errors.py (1 kB)
- __init__.py (295 B)
- noise.py (468 B)
- swap.py (795 B)
- dativitis.py (2 kB)
- mood.py (929 B)
- nouncase.py (1 kB)
- duplicate.py (647 B)
- spaces.py (1 kB)
- spelling
- errorify.py (2 kB)
- __init__.py (0 B)
- rules
- simple.txt (1 kB)
- regex.txt (51 B)
- word_pairs_combined.txt (7 MB)
- rules.py (10 kB)
- fixed_random.py (646 B)
- __init__.py (34 B)
- nicenlp
- utils
- linear_sampler.py (2 kB)
- __init__.py (0 B)
- dictionary.py (1 kB)
- reshape_mbart_checkpoint_embeddings.py (2 kB)
- logits_filter.py (2 kB)
- label_schema
- readme.md (2 kB)
- __init__.py (0 B)
- label_schema.py (2 kB)
- ner_parser.py (1 kB)
- tests.py (1 kB)
- constituency
- tree_dist.pyx (10 kB)
- prep_greynir.py (4 kB)
- token_utils.py (703 B)
- chart_parser.pyx (6 kB)
- preprocess_labelled_spans.py (12 kB)
- greynir_utils.py (36 kB)
- __init__.py (0 B)
- unary_branch_labels.py (5 kB)
- examples
- translation
- README.md (4 kB)
- pos
- prep_mim_pos.sh (3 kB)
- README.md (3 kB)
- labdict.txt (359 B)
- train.sh (1 kB)
- terms.json (4 kB)
- ner
- README.md (2 kB)
- train.sh (1 kB)
- predict_ner.py (1 kB)
- ner_train.sh (1 kB)
- constituency_parsing
- predict_file.py (3 kB)
- README.md (1 kB)
- infer_file.sh (357 B)
- pretrain.sh (2 kB)
- finetune.sh (2 kB)
- encode_data.py (873 B)
- Dockerfile (1 kB)
- prepare_data.sh (8 kB)
- translation
- criterions
- parser_criterion.py (12 kB)
- multilabel_token_classification_criterion.py (7 kB)
- multiclass_token_classification_criterion.py (3 kB)
- __init__.py (0 B)
- multi_span_prediction_criterion.py (18 kB)
- __init__.py (560 B)
- data
- tests.py (2 kB)
- mutex_binary_dataset.py (2 kB)
- lookup_dataset.py (762 B)
- encoding.py (781 B)
- lazymmapdataset.py (2 kB)
- __init__.py (139 B)
- datasets.py (12 kB)
- tasks
- translation_with_glossary.py (16 kB)
- multilabel_token_classification_task.py (12 kB)
- parser_task.py (8 kB)
- translation_with_backtranslation.py (28 kB)
- multiclass_token_classification_task.py (4 kB)
- __init__.py (24 B)
- multilabel_multispan_classification.py (7 kB)
- translation_from_pretrained_bart_with_domain.py (9 kB)
- models
- simple_parser.py (8 kB)
- bart.py (2 kB)
- __init__.py (1 kB)
- multilabel.py (10 kB)
- multiclass.py (8 kB)
- utils
- serve
- serve_all.py (8 kB)
- __init__.py (0 B)
- ner
- patcher.py (4 kB)
- aligner.py (18 kB)
- README.md (3 kB)
- ner_f1_stats.py (5 kB)
- postagger.py (3 kB)
- nertagger.py (3 kB)
- __init__.py (0 B)
- testdata
- names.txt (106 B)
- en_is.tsv (189 B)
- cli
- greynirseq
- README.md (3 kB)
- .gitignore (2 kB)
- tests
- conftest.py (0 B)
- __init__.py (0 B)
- test_glossary.py (8 kB)
- ner
- test_bioparser.py (621 B)
- .flake8 (58 B)
- assets
- greynir-logo-large.png (56 kB)
- build.py (2 kB)
- run_linter.sh (108 B)
- pyproject.toml (2 kB)
- GNU_AFFERO_LICENSE (31 kB)
- .github
- workflows
- superlinter.yml (1 kB)
- workflows
- LICENSE (1 kB)
- hubconf.py (2 kB)
- src