Show simple item record
dc.contributor.author |
Arnardóttir, Þórunn |
dc.contributor.author |
Ingason, Anton Karl |
dc.date.accessioned |
2020-04-22T12:51:30Z |
dc.date.available |
2020-04-22T12:51:30Z |
dc.date.issued |
2020-04-22 |
dc.identifier.uri |
http://hdl.handle.net/20.500.12537/17 |
dc.description |
The Icelandic Neural Parsing Pipeline (IceNeuralParsingPipeline) includes all steps necessary for parsing plain Icelandic text, i.e. preprocessing, parsing and postprocessing. The preprocessing step consists of tokenization, which splits the text both at punctuation and into matrix clauses. The parsing step uses an Icelandic model of the Berkeley Neural Parser, trained on IcePaHC, which achieves an F1 score of 84.74. The output's annotation scheme is the same as IcePaHC's, except that neither empty phrases (e.g. traces and zero subjects) nor lemmas are shown. The postprocessing step includes minor steps for cleaning and formatting the parsed text. |
dc.description |
The Icelandic Neural Parsing Pipeline (IceNeuralParsingPipeline) is a parsing pipeline that includes all the steps needed to parse plain Icelandic text, i.e. steps for preprocessing, parsing and postprocessing the text. The preprocessing step consists of tokenization, splitting both at punctuation and into matrix clauses. The parsing step contains an Icelandic model of the Berkeley Neural Parser that was trained on the IcePaHC treebank and achieves an F-measure of 84.74%. The output's annotation scheme is similar to that of IcePaHC, except that neither empty phrases, i.e. traces or zero subjects, nor lemmas are shown. The postprocessing step includes minor steps for cleaning and reformatting the parsed text. |
dc.language.iso |
isl |
dc.publisher |
Háskóli Íslands |
dc.rights |
The MIT License (MIT) |
dc.rights.uri |
https://opensource.org/licenses/mit-license.php |
dc.rights.label |
PUB |
dc.source.uri |
https://github.com/antonkarl/iceParsingPipeline |
dc.subject |
parsing |
dc.subject |
neural parsing |
dc.subject |
parsing pipelines |
dc.subject |
berkeley neural parser |
dc.title |
IceNeuralParsingPipeline 20.04 |
dc.type |
toolService |
metashare.ResourceInfo#ContentInfo.detailedType |
tool |
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent |
true |
has.files |
yes |
branding |
Clarin IS Repository |
contact.person |
Þórunn Arnardóttir tha86@hi.is Háskóli Íslands |
files.size |
1483935083 |
files.count |
1 |
Files in this item
This item is Publicly Available and licensed under: The MIT License (MIT)
- Name: IceNeuralParsingPipeline.zip
- Size: 1.38 GB
- Format: application/zip
- Description: zip file containing the parsing pipeline
- MD5: f40c4d6469a14648aabec9b786d0bfc3
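
The size and MD5 values listed in this record can be used to check a downloaded copy of the archive. The following is a minimal sketch, not part of the pipeline itself: the function names `md5_of_file` and `verify_download` are hypothetical, and the expected values are copied from the `files.size` and MD5 fields of this record. It assumes the file was saved locally under the name shown above.

```python
import hashlib
from pathlib import Path

# Values copied from this record (files.size and the MD5 field).
EXPECTED_MD5 = "f40c4d6469a14648aabec9b786d0bfc3"
EXPECTED_SIZE = 1483935083  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a large archive never has to fit in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file has the expected size and MD5 digest.

    The size is compared first because it is cheap; the digest is only
    computed when the size already matches.
    """
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Calling `verify_download("IceNeuralParsingPipeline.zip")` would then return `True` only for a complete, unmodified download. MD5 is adequate for detecting transfer corruption, which is how the record uses it, but it is not a safeguard against deliberate tampering.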