This item is publicly available and licensed under Creative Commons - Attribution 4.0 International (CC BY 4.0).
Files in this item (16.36 KB in total):
- Name: sports_terminology_26.03.zip
  Size: 13.79 KB
  Format: application/zip
  Description: Unknown
  MD5: c2b8f30f90766c7940f3dfe1a480cf78
- Name: README.txt
  Size: 2.56 KB
  Format: Text file
  Description: Unknown
  MD5: 807e34dac94f3c5ebeccdcbb8454430c
Sports Terminology 26.03
http://hdl.handle.net/20.500.12537/380
Publisher: The Árni Magnússon Institute for Icelandic Studies
Authors: Einar Freyr Sigurðsson, Magnús Már Magnússon, Atli Jasonarson, Steinþór Steingrímsson
Published under CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
This dataset contains terminology from basketball, chess, football, golf and gymnastics. The vocabulary originates from a test suite at WMT25 (Conference on Machine Translation), in which sports segments were translated from English to Icelandic. The data is published in TBX format: each entry contains an English term together with its Icelandic translations and part of speech (PoS). There is a separate file for each subdomain. The dataset contains a total of 538 entries:
basketball.tbx with 160 entries.
chess.tbx with 77 entries.
football.tbx with 204 entries.
golf.tbx with 52 entries.
gymnastics.tbx with 45 entries.
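The per-file counts above can be checked with a short script. The following is a minimal sketch in Python, not part of the dataset itself. It assumes the five .tbx files have been extracted into the current working directory, and it assumes that each entry is stored as a <termEntry> or <conceptEntry> element, which are the entry elements used by common TBX versions; the README does not say which TBX dialect the files use. The names EXPECTED_COUNTS, ENTRY_TAGS, local_name and count_entries are hypothetical helpers introduced for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Entry counts as stated in the README.
EXPECTED_COUNTS = {
    "basketball.tbx": 160,
    "chess.tbx": 77,
    "football.tbx": 204,
    "golf.tbx": 52,
    "gymnastics.tbx": 45,
}

# ASSUMPTION: entries are <termEntry> (older TBX) or <conceptEntry>
# (ISO 30042:2019). Adjust this set if the files use another element.
ENTRY_TAGS = {"termEntry", "conceptEntry"}


def local_name(tag):
    """Return an element tag without its '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def count_entries(root):
    """Count entry elements anywhere below (and including) the given root."""
    return sum(
        1
        for element in root.iter()
        if isinstance(element.tag, str) and local_name(element.tag) in ENTRY_TAGS
    )


if __name__ == "__main__":
    total = 0
    for filename, expected in EXPECTED_COUNTS.items():
        path = Path(filename)
        if not path.exists():
            print(f"{filename}: not found, skipping")
            continue
        found = count_entries(ET.parse(path).getroot())
        total += found
        status = "ok" if found == expected else "differs from README"
        print(f"{filename}: {found} entries (README says {expected}, {status})")
    print(f"total entries counted: {total}")
```

Matching on the local tag name rather than a fully qualified name keeps the sketch independent of whichever XML namespace the files declare.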
Some terms, such as "win" or "award", may not be specific to the sport involved but we i . . .