Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving
Authors: Qiuyu Ding, Hailong Cao, Tiejun Zhao
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a benchmark dataset of BLI, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It demonstrates effectiveness and robustness across six experimental languages, including similar language pairs and distant language pairs, under both supervised and unsupervised settings. ... To evaluate the effectiveness of our method, we perform a comprehensive set of BLI experiments on the standard BLI benchmark |
| Researcher Affiliation | Academia | Qiuyu Ding, Hailong Cao*, Tiejun Zhao Harbin Institute of Technology qiuyuding@stu.hit.edu.cn, caohailong@hit.edu.cn, tjzhao@hit.edu.cn |
| Pseudocode | No | The paper describes the method using text and a diagram, but it does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use the widely used MUSE dataset (Lample et al. 2018), which consists of 300-dim embeddings pre-trained with fastText (Bojanowski et al. 2017) on the monolingual corpora of full Wikipedias for each language; the vocabularies are trimmed to the 200k most frequent words. We also employ the test sets released by (Lample et al. 2018) that are widely used in BLI evaluations. |
| Dataset Splits | No | The paper explicitly mentions using '5k translation pairs are used as seed lexicon D0' for training and 'test sets' for evaluation, but it does not specify a separate 'validation' split or its details for reproduction. |
| Hardware Specification | Yes | All experiments are performed on a single Nvidia RTX A6000. |
| Software Dependencies | No | The paper mentions general software such as fastText and several baseline BLI systems, but it does not provide specific version numbers for the software dependencies used in its own implementation. |
| Experiment Setup | Yes | We select best hyperparameters by searching a combination of λ, n, m with the following range: λ: {0.05, 0.1, . . . , 1.0} with 0.05 step size; n, m: {3, 4, . . . , 20} with 1 step size. |
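The reported search ranges can be sketched as an exhaustive grid search. The following is a minimal, hypothetical illustration: `evaluate` stands in for whatever BLI accuracy score the paper optimizes, and all function names are assumptions, not the authors' code.

```python
import itertools

def build_grid():
    """Enumerate the (lambda, n, m) grid reported in the paper:
    lambda in {0.05, 0.10, ..., 1.0}; n, m in {3, 4, ..., 20}."""
    lambdas = [round(0.05 * i, 2) for i in range(1, 21)]  # 0.05 .. 1.0, step 0.05
    ns = list(range(3, 21))                               # 3 .. 20, step 1
    ms = list(range(3, 21))                               # 3 .. 20, step 1
    return list(itertools.product(lambdas, ns, ms))

def grid_search(evaluate):
    """Return the (lambda, n, m) triple that maximizes a scoring
    function; evaluate is a placeholder for BLI dev-set accuracy."""
    return max(build_grid(), key=lambda cfg: evaluate(*cfg))
```

This grid contains 20 × 18 × 18 = 6,480 configurations, which is small enough to evaluate exhaustively on a single GPU such as the RTX A6000 mentioned above.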