Sports Terminology 26.03
http://hdl.handle.net/20.500.12537/380

Publisher: The Árni Magnússon Institute for Icelandic Studies
Authors: Einar Freyr Sigurðsson, Magnús Már Magnússon, Atli Jasonarson, Steinþór Steingrímsson
Published under CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

This dataset contains terminology from basketball, chess, football, golf and gymnastics. The vocabulary originates in a test suite at WMT25 (Conference on Machine Translation) in which sports segments were translated from English to Icelandic.

The data is published in TBX format. Each entry contains an English term with its Icelandic translations and part of speech (PoS). There is a separate file for each subdomain. The dataset contains a total of 538 entries:

basketball.tbx with 160 entries.
chess.tbx with 77 entries.
football.tbx with 204 entries.
golf.tbx with 52 entries.
gymnastics.tbx with 45 entries.

Some terms, such as "win" or "award", may not be specific to the sport involved, but we include them anyway as they may be helpful, e.g. when translating a text. Furthermore, some terms are found under a single subdomain only, such as "press conference", even though they are not specific to that sport; a press conference can be held in relation to any sport. However, "press conference" was only used in a basketball text segment translated in the test suite for WMT25.

Where an entry is a syntactic phrase of more than one word, such as a noun phrase ("free throw"), a verb phrase ("exchange pawns") or a prepositional phrase ("in play"), we use the head of the phrase when listing the PoS (noun, verb and preposition in these cases, respectively).

The PoS can sometimes reflect how the word is used rather than how it would be marked in isolation. Thus, "international" is clearly an adjective in isolation (like "an international player") but based on a sentence like "This German international has played five games" we mark its PoS as a noun (cf.
also the fact that it is translated as a noun in Icelandic: "landsliðsmaður, landsliðskona").

Within the chess vocabulary, there are English terms such as R[a-h][1-8], translated as H[a-h][1-8]. We mark their PoS as "phrase". These represent pieces on the chess board, such as "Qd1" ("Dd1" in Icelandic), meaning that the queen is located on the square d1. [1-8] can be any of the ranks from 1 to 8 and [a-h] any of a, b, c, d, e, f, g, h on the chess board.

We thank Árni Jóhannsson, Eiríkur Stefán Ásgeirsson and Jóhannes Gísli Jónsson for answering questions on specific terms relating to basketball, golf and chess, respectively.
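
As an illustration of the chess patterns described above, the following is a minimal Python sketch of how a term such as R[a-h][1-8] could be applied to running text. It is not part of the dataset. The function name and the mapping table are hypothetical, and the mapping covers only the two piece letters attested in this README (R to H, Q to D); other piece letters are left out on purpose.

```python
import re

# Piece-letter mapping limited to what this README attests:
# R (rook) -> H and Q (queen) -> D. Other pieces are not covered here.
EN_TO_IS_PIECE = {"R": "H", "Q": "D"}

# A piece letter followed by a file (a-h) and a rank (1-8), e.g. "Qd1".
PIECE_ON_SQUARE = re.compile(r"\b([RQ])([a-h][1-8])\b")


def to_icelandic_notation(text: str) -> str:
    """Replace English piece-on-square tokens with the Icelandic form."""
    return PIECE_ON_SQUARE.sub(
        lambda m: EN_TO_IS_PIECE[m.group(1)] + m.group(2), text
    )


print(to_icelandic_notation("Qd1 and Ra8"))  # Dd1 and Ha8
```

Tokens that do not fit the pattern, such as a bare square ("e4") or a rank outside 1 to 8 ("Qd9"), are left unchanged.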