dc.contributor.author Arnardóttir, Þórunn
dc.contributor.author Ingason, Anton Karl
dc.date.accessioned 2020-04-22T12:51:30Z
dc.date.available 2020-04-22T12:51:30Z
dc.date.issued 2020-04-22
dc.identifier.uri http://hdl.handle.net/20.500.12537/17
dc.description The Icelandic Neural Parsing Pipeline (IceNeuralParsingPipeline) includes all steps necessary for parsing plain Icelandic text: preprocessing, parsing and postprocessing. The preprocessing step consists of tokenization, which splits the text both at punctuation and into matrix clauses. The parsing step uses an Icelandic model of the Berkeley Neural Parser, trained on IcePaHC, which achieves an F1 score of 84.74. The output follows the IcePaHC annotation scheme, except that neither empty phrases (e.g. traces and zero subjects) nor lemmas are shown. The postprocessing step performs minor cleaning and formatting of the parsed text.
dc.description The Icelandic Neural Parsing Pipeline (IceNeuralParsingPipeline) is a parsing pipeline that contains all the steps needed to parse plain Icelandic text, i.e. steps for preprocessing, parsing and postprocessing of text. The preprocessing step consists of tokenization, both by punctuation and by matrix clauses. The parsing step contains an Icelandic model of the Berkeley Neural Parser, which was trained on the IcePaHC treebank and yields an F-measure of 84.74%. The annotation scheme of the output is like that of IcePaHC, but neither empty phrases, i.e. traces or null subjects, nor lemmas are shown. The postprocessing step contains minor steps for cleaning and reformatting the parsed text.
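The description says the output uses the IcePaHC annotation scheme without empty phrases or lemmas, so each leaf should be a plain word under a tag. The sketch below shows one way such labeled-bracket output could be read in Python. It is an illustration only: the functions parse_brackets and leaves are hypothetical helpers, not part of the pipeline, and the example tree is invented for this sketch rather than taken from demoOutput.psd.

```python
def parse_brackets(text):
    """Parse one labeled-bracket tree into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError(f"expected '(' at token {pos}")
        pos += 1
        # A wrapper node such as "( (IP-MAT ...))" has no label of its own.
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())  # nested phrase
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first tree")
    return tree


def leaves(node):
    """Return the words of a parsed tree in left-to-right order."""
    _label, children = node
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Invented example in an IcePaHC-like style (assumption, not real pipeline output).
example = "(IP-MAT (NP-SBJ (PRO-N Hún)) (VBDI las) (NP-OB1 (N-A bók)) (. .))"
tree = parse_brackets(example)
print(tree[0])       # IP-MAT
print(leaves(tree))  # ['Hún', 'las', 'bók', '.']
```

The sketch assumes balanced brackets and that parentheses never occur as words; real output may need more careful handling.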
dc.language.iso isl
dc.publisher Háskóli Íslands
dc.rights The MIT License (MIT)
dc.rights.uri https://opensource.org/licenses/mit-license.php
dc.rights.label PUB
dc.source.uri https://github.com/antonkarl/iceParsingPipeline
dc.subject parsing
dc.subject neural parsing
dc.subject parsing pipelines
dc.subject berkeley neural parser
dc.title IceNeuralParsingPipeline 20.04
dc.type toolService
metashare.ResourceInfo#ContentInfo.detailedType tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
has.files yes
branding Clarin IS Repository
contact.person Þórunn Arnardóttir tha86@hi.is Háskóli Íslands
files.size 1483935083
files.count 1


 Files in this item

This item is publicly available and licensed under the MIT License (MIT).
Name: IceNeuralParsingPipeline.zip
Size: 1.38 GB
Format: application/zip
Description: zip file containing the parsing pipeline
MD5: f40c4d6469a14648aabec9b786d0bfc3
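The record lists an MD5 checksum and a byte size (files.size) for the archive, which can be used to check a download. A minimal sketch, assuming the file was saved under its listed name in the current directory; the helper names md5_of_file and matches_record are made up for this example.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 and files.size).
EXPECTED_MD5 = "f40c4d6469a14648aabec9b786d0bfc3"
EXPECTED_SIZE = 1483935083


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file has both the expected size and MD5 digest."""
    path = Path(path)
    return path.stat().st_size == expected_size and md5_of_file(path) == expected_md5


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("IceNeuralParsingPipeline.zip")
    if archive.exists():
        print("OK" if matches_record(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use low for an archive of this size.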
 File Preview  
  • IceNeuralParsingPipeline
    • LICENSE
    • README.md
    • .DS_Store
    • demoOutput.psd
    • runallNeural.sh
    • tools
      • scripts
        • postprocess.py~
        • postprocess.py
        • preprocess.py~
        • preprocess.py
        • postprocessNeural.sh
      • cs
        • CS_2.002.75.jar
        • formatpsd.sh
        • donothing.q
        • formatpsd.sh~
        • donothing.q~
      • .DS_Store
      • neuralParser
        • _dev=84.91.pt
        • src
          • chart_helper.pyx
          • evaluate.py
          • __pycache__
            • parse_nk.cpython-36.pyc
            • nkutil.cpython-38.pyc
            • evaluate.cpython-38.pyc
            • vocabulary.cpython-38.pyc
            • trees.cpython-38.pyc
            • evaluate.cpython-36.pyc
            • nkutil.cpython-36.pyc
            • vocabulary.cpython-36.pyc
            • trees.cpython-36.pyc
            • parse_nk.cpython-38.pyc
          • parse_nk.py
          • vocabulary.py
          • main.py
          • trees.py
          • transliterate.py
          • nkutil.py
          • viz.py
      • splitter
        • wordtags.tsv
        • icepunct.gz
        • iceconj.gz
        • __pycache__
          • splitter.cpython-36.pyc
        • splitter.py
    • demoTextOutput.txt
    • demoinput.txt
  • __MACOSX
    • IceNeuralParsingPipeline
      • ._README.md
      • ._.DS_Store
      • ._demoOutput.psd
      • ._demoTextOutput.txt
      • ._runallNeural.sh
      • tools
        • scripts
          • ._postprocess.py~
          • ._preprocess.py~
          • ._preprocess.py
          • ._postprocessNeural.sh
          • ._postprocess.py
        • ._.DS_Store
        • cs
          • ._donothing.q~
          • ._formatpsd.sh
          • ._formatpsd.sh~
          • ._donothing.q
          • ._CS_2.002.75.jar
        • neuralParser
          • src
            • ._nkutil.py
            • __pycache__
              • ._trees.cpython-38.pyc
              • ._parse_nk.cpython-36.pyc
              • ._evaluate.cpython-38.pyc
              • ._nkutil.cpython-36.pyc
              • ._trees.cpython-36.pyc
              • ._evaluate.cpython-36.pyc
              • ._vocabulary.cpython-38.pyc
              • ._vocabulary.cpython-36.pyc
              • ._parse_nk.cpython-38.pyc
              • ._nkutil.cpython-38.pyc
            • ._viz.py
            • ._vocabulary.py
            • ._evaluate.py
            • ._main.py
            • ._transliterate.py
            • ._chart_helper.pyx
            • ._parse_nk.py
            • ._trees.py
          • .__dev=84.91.pt
        • splitter
          • ._icepunct.gz
          • ._wordtags.tsv
          • ._splitter.py
          • ._iceconj.gz
          • __pycache__
            • ._splitter.cpython-36.pyc
      • ._demoinput.txt
      • ._LICENSE
