Automatically Creating a Large Number of New Bilingual Dictionaries

Authors: Khang Lam, Feras Al Tarouti, Jugal Kalita

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results; Data sets used; Results and human evaluation; Table 1; Table 2; Table 3; Table 4; Table 5; Table 6
Researcher Affiliation | Academia | Khang Nhut Lam and Feras Al Tarouti and Jugal Kalita, Computer Science Department, University of Colorado, USA, {klam2, faltarou, jkalita}@uccs.edu
Pseudocode | Yes | Algorithm 1 DT algorithm; Algorithm 2 IW algorithm; Algorithm 3 Find Candidate Set (Offset-POSs,D)
Open Source Code | No | The paper mentions using publicly available Wordnets and a machine translator, but it does not state that the source code for their own proposed methodology is open-source or publicly available.
Open Datasets | Yes | We work with 5 existing bilingual dictionaries that translate a given language to a resource-rich language, which happens to be eng in our experiments: Dict(arb,eng) and Dict(vie,eng) supported by Panlex; Dict(ajz,eng) and Dict(dis,eng) supported by Xobdo; one Dict(asm,eng) created by integrating two dictionaries Dict(asm,eng) provided by Panlex and Xobdo. ... To solve the problem of ambiguities, we use the PWN and Wordnets in several other languages linked to the PWN provided by the Open Multilingual Wordnet project (Bond and Foster 2013): Finn Wordnet (Linden and Carlson 2010) (FWN), WOLF (Sagot and Fiser 2008) (WWN) and Japanese Wordnet (Isahara et al. 2008) (JWN).
Dataset Splits | No | The paper does not describe traditional train/validation splits for a machine learning model's training process on an input dataset. It focuses on the generation of dictionaries and their subsequent human evaluation.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments.
Software Dependencies | No | The Microsoft Translator Java API is used as another main resource. (No specific version number for the API is mentioned.)
Experiment Setup | Yes | In these tables, Top n means dictionaries created by picking only translations with the top n highest ranks for each word. A: dictionaries created using PWN only; B: using PWN and FWN; C: using PWN, FWN and JWN; D: using PWN, FWN, JWN and WWN.
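
The "Top n" selection and the A to D Wordnet configurations described in the Experiment Setup entry can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `top_n_dictionary`, the `(source, translation, score)` input format, and the assumption that a higher score means a better rank are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Wordnet combinations as listed in the Experiment Setup entry.
WORDNET_CONFIGS = {
    "A": ["PWN"],
    "B": ["PWN", "FWN"],
    "C": ["PWN", "FWN", "JWN"],
    "D": ["PWN", "FWN", "JWN", "WWN"],
}


def top_n_dictionary(candidates, n):
    """Keep the n best-ranked translations for each source word.

    candidates: iterable of (source_word, translation, score) tuples.
    Assumption: a higher score means a better rank; the paper's actual
    ranking scheme is not reproduced here.
    """
    by_word = defaultdict(list)
    for source, translation, score in candidates:
        by_word[source].append((score, translation))

    result = {}
    for source, scored in by_word.items():
        # Stable sort: translations with equal scores keep their input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        result[source] = [translation for _, translation in scored[:n]]
    return result


# Placeholder data, not taken from the paper.
example = [
    ("w1", "a", 0.9),
    ("w1", "b", 0.5),
    ("w1", "c", 0.7),
    ("w2", "d", 0.2),
]
top2 = top_n_dictionary(example, 2)
```

With the placeholder data above, `top2` keeps two translations for `w1` and the single available translation for `w2`. A smaller n yields a smaller dictionary containing only the highest-ranked candidates for each word.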