Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Enhancing Bilingual Lexicon Induction via Bi-directional Translation Pair Retrieving
Authors: Qiuyu Ding, Hailong Cao, Tiejun Zhao
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a benchmark dataset of BLI, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It demonstrates effectiveness and robustness across six experimental languages, including similar language pairs and distant language pairs, under both supervised and unsupervised settings. ... To evaluate the effectiveness of our method, we perform a comprehensive set of BLI experiments on the standard BLI benchmark |
| Researcher Affiliation | Academia | Qiuyu Ding, Hailong Cao*, Tiejun Zhao Harbin Institute of Technology EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes the method using text and a diagram, but it does not include a dedicated pseudocode block or algorithm listing. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use the widely used MUSE dataset (Lample et al. 2018), which consists of 300-dim embeddings pre-trained with Fast Text (Bojanowski et al. 2017) which is trained on the monolingual corpora of full Wikipedias for each language, and the vocabularies are trimmed to the 200k most frequent words. We also employ the test sets released by (Lample et al. 2018) that are widely used in BLI evaluations. |
| Dataset Splits | No | The paper explicitly mentions using '5k translation pairs are used as seed lexicon D0' for training and 'test sets' for evaluation, but it does not specify a separate validation split or describe how one would be constructed for reproduction. |
| Hardware Specification | Yes | All experiments are performed on a single Nvidia RTX A6000. |
| Software Dependencies | No | The paper mentions tools such as fastText and several baseline BLI systems, but it does not provide version numbers for the software dependencies used in its own implementation. |
| Experiment Setup | Yes | We select best hyperparameters by searching a combination of λ, n, m with the following range: λ: {0.05, 0.1, . . . , 1.0} with 0.05 step size; n, m: {3, 4, . . . , 20} with 1 step size. |
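The hyperparameter search quoted above is a plain exhaustive grid over λ, n, and m. A minimal sketch of that search is shown below; the `evaluate` callable is a placeholder for the paper's (non-public) BLI accuracy objective, and its name and signature are assumptions for illustration.

```python
import itertools

def grid_search(evaluate):
    """Exhaustive search over the grid reported in the paper.

    evaluate: hypothetical stand-in for the BLI scoring function,
    called as evaluate(lam, n, m) -> float (higher is better).
    """
    lambdas = [round(0.05 * i, 2) for i in range(1, 21)]  # 0.05 .. 1.0, step 0.05
    sizes = range(3, 21)                                  # n, m: 3 .. 20, step 1

    best, best_score = None, float("-inf")
    for lam, n, m in itertools.product(lambdas, sizes, sizes):
        score = evaluate(lam, n, m)
        if score > best_score:
            best, best_score = (lam, n, m), score
    return best, best_score
```

At 20 × 18 × 18 = 6,480 configurations, the grid is small enough to enumerate exactly, which matches the paper's stated search ranges without requiring any randomized tuning.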