Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving
Authors: Qiuyu Ding, Hailong Cao, Tiejun Zhao
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a benchmark dataset of BLI, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It demonstrates effectiveness and robustness across six experimental languages, including similar language pairs and distant language pairs, under both supervised and unsupervised settings. ... To evaluate the effectiveness of our method, we perform a comprehensive set of BLI experiments on the standard BLI benchmark |
| Researcher Affiliation | Academia | Qiuyu Ding, Hailong Cao*, Tiejun Zhao Harbin Institute of Technology EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the method using text and a diagram, but it does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use the widely used MUSE dataset (Lample et al. 2018), which consists of 300-dim embeddings pre-trained with Fast Text (Bojanowski et al. 2017) which is trained on the monolingual corpora of full Wikipedias for each language, and the vocabularies are trimmed to the 200k most frequent words. We also employ the test sets released by (Lample et al. 2018) that are widely used in BLI evaluations. |
| Dataset Splits | No | The paper explicitly mentions using '5k translation pairs are used as seed lexicon D0' for training and 'test sets' for evaluation, but it does not specify a separate validation split or describe how one would be constructed for reproduction. |
| Hardware Specification | Yes | All experiments are performed on a single Nvidia RTX A6000. |
| Software Dependencies | No | The paper mentions tools such as fastText and several baseline BLI systems, but it does not provide version numbers for the software dependencies used in its own implementation. |
| Experiment Setup | Yes | We select best hyperparameters by searching a combination of λ, n, m with the following range: λ: {0.05, 0.1, . . . , 1.0} with 0.05 step size; n, m: {3, 4, . . . , 20} with 1 step size. |
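The hyperparameter search quoted above is a plain exhaustive grid over λ, n, and m. A minimal sketch of that search is shown below; the `evaluate` callable is a placeholder for the paper's (non-public) BLI accuracy objective, and its name and signature are assumptions for illustration.

```python
import itertools

def grid_search(evaluate):
    """Exhaustive search over the grid reported in the paper.

    evaluate: hypothetical stand-in for the BLI scoring function,
    called as evaluate(lam, n, m) -> float (higher is better).
    """
    lambdas = [round(0.05 * i, 2) for i in range(1, 21)]  # 0.05 .. 1.0, step 0.05
    sizes = range(3, 21)                                  # n, m: 3 .. 20, step 1

    best, best_score = None, float("-inf")
    for lam, n, m in itertools.product(lambdas, sizes, sizes):
        score = evaluate(lam, n, m)
        if score > best_score:
            best, best_score = (lam, n, m), score
    return best, best_score
```

At 20 × 18 × 18 = 6,480 configurations, the grid is small enough to enumerate exactly, which matches the paper's stated search ranges without requiring any randomized tuning.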