Unsupervised Multilingual Alignment using Wasserstein Barycenter
Authors: Xin Lian, Kshitij Jain, Jakub Truszkowski, Pascal Poupart, Yaoliang Yu
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on standard benchmarks and demonstrate state-of-the-art performances. We conduct extensive experiments on standard publicly available benchmark datasets and demonstrate competitive performance against current state-of-the-art alternatives. We evaluate our algorithm on two standard publicly available datasets: MUSE [Lample et al., 2018] and XLING [Glavas et al., 2019]. Table 2 depicts precision@1 results for all bilingual tasks on the MUSE benchmark [Lample et al., 2018]. Table 3 shows mean average precision (MAP) for 10 bilingual tasks on the XLING dataset [Glavas et al., 2019]. In this section, we show the impact of some of our design choices and hyperparameters. One of the parameters is the number of support locations. In Figure 2, we show the impact on translation performance when we have a different number of support locations. |
| Researcher Affiliation | Collaboration | Xin Lian (1,3), Kshitij Jain (2), Jakub Truszkowski (2), Pascal Poupart (1,2,3) and Yaoliang Yu (1,3) — 1: University of Waterloo, Waterloo, Canada; 2: Borealis AI, Waterloo, Canada; 3: Vector Institute, Toronto, Canada |
| Pseudocode | Yes | Algorithm 1: Barycenter Alignment |
| Open Source Code | No | Method and code for computing accuracies of bilingual translation pairs are borrowed from Alvarez-Melis and Jaakkola [2018]. |
| Open Datasets | Yes | We evaluate our algorithm on two standard publicly available datasets: MUSE [Lample et al., 2018] and XLING [Glavas et al., 2019]. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits needed to reproduce the experiment. While it mentions using standard datasets and reports precision, it does not specify the exact percentages or counts for data partitioning. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or cloud computing instance specifications used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch or TensorFlow, or specific solver versions). |
| Experiment Setup | Yes | To speed up the computation, we took a similar approach as Alaux et al. [2019] and initialized space alignment matrices with the Gromov-Wasserstein approach [Alvarez-Melis and Jaakkola, 2018] applied to the first 5k vectors (Alaux et al. [2019] used the first 2k vectors) and with regularization parameter ϵ of 5e-5. The support locations for the barycenter are initialized with random samples from a standard normal distribution. Therefore, in an effort to balance accuracy and computational complexity, we decided to use 10000 support locations (twice the average number of words). |