Comparing Unsupervised Word Translation Methods Step by Step

Authors: Mareike Hartmann, Yova Kementchedjhieva, Anders Søgaard

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our experiments, we focus on aligning word vector spaces between two languages, by projecting from the foreign language into English. Our languages are: Estonian (et), Farsi (fa), Finnish (fi), Latvian (lv), Turkish (tr), and Vietnamese (vi)."
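The alignment the paper studies is a linear projection from each source space into English. For orientation, below is a minimal sketch of the closed-form orthogonal Procrustes solution that supervised and refinement-based aligners (e.g. the refinement step in MUSE) rely on; the arrays, dimensions, and seed-pair count are illustrative assumptions, not the paper's data.

```python
# Minimal sketch: orthogonal Procrustes alignment of two embedding spaces.
# This is the linear-projection step underlying the methods compared in the
# paper; the data below is random placeholder input, for illustration only.
import numpy as np

def procrustes_align(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Return the orthogonal matrix W minimizing ||X @ W - Y||_F."""
    # SVD of the cross-covariance matrix gives the closed-form solution.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(1500, 300))  # source-language vectors of seed pairs
Y = rng.normal(size=(1500, 300))  # corresponding English vectors
W = procrustes_align(X, Y)
projected = X @ W                 # source space projected into English
```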
Researcher Affiliation | Academia | Mareike Hartmann, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark (hartmann@di.ku.dk); Yova Kementchedjhieva, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark (yova@di.ku.dk); Anders Søgaard, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark (soegaard@di.ku.dk)
Pseudocode | No | The paper describes the GAN algorithm and other methods in textual paragraphs, but it does not include any clearly labeled pseudocode blocks or algorithm listings.
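Although the paper gives no pseudocode, the GAN algorithm it describes follows the adversarial-mapping recipe of Conneau et al. (2018): a linear mapping tries to make projected source vectors indistinguishable from English vectors, while a discriminator tries to tell them apart. A hedged PyTorch-style sketch of one training step follows; the layer sizes, learning rates, and batches here are illustrative assumptions, not the authors' settings.

```python
# Sketch of one adversarial training step for embedding alignment.
import torch
import torch.nn as nn

dim, batch = 300, 32
mapping = nn.Linear(dim, dim, bias=False)        # generator: the matrix W
discriminator = nn.Sequential(                   # guesses a vector's space of origin
    nn.Linear(dim, 2048), nn.LeakyReLU(0.2), nn.Linear(2048, 1), nn.Sigmoid())
bce = nn.BCELoss()
opt_d = torch.optim.SGD(discriminator.parameters(), lr=0.1)
opt_w = torch.optim.SGD(mapping.parameters(), lr=0.1)

src = torch.randn(batch, dim)                    # placeholder source vectors
tgt = torch.randn(batch, dim)                    # placeholder English vectors
labels = torch.cat([torch.ones(batch, 1), torch.zeros(batch, 1)])

# Discriminator step: mapped source vectors are label 1, real targets label 0.
pred = discriminator(torch.cat([mapping(src).detach(), tgt]))
opt_d.zero_grad(); bce(pred, labels).backward(); opt_d.step()

# Mapping step: update W to fool the discriminator (labels flipped).
pred = discriminator(torch.cat([mapping(src), tgt]))
opt_w.zero_grad(); bce(pred, 1 - labels).backward(); opt_w.step()
```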
Open Source Code | No | The paper refers to using the MUSE system (https://github.com/facebookresearch/MUSE) and the VecMap framework (https://github.com/artetxem/vecmap), which are external tools, but it does not state that its own code for the methodology is open-sourced, nor does it provide a link to it.
Open Datasets | Yes | "In all our experiments, we use pretrained fastText embeddings (Bojanowski et al., 2017) and the bilingual test dictionaries released along with the MUSE system (https://github.com/facebookresearch/MUSE). The fastText embeddings are trained on Wikipedia dumps (https://fasttext.cc/docs/en/pretrained-vectors.html); the bilingual dictionaries were created using an in-house Facebook translation tool and contain translations for 1500 test words for each language pair."
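Both resources are plain-text files, so reproducing the data setup is mostly I/O. A minimal sketch follows, assuming the standard fastText .vec format and the MUSE dictionary release conventions; the file names (wiki.et.vec, et-en.5000-6500.txt) are assumptions based on those releases, not quotes from the paper.

```python
# Sketch: loading fastText embeddings and a MUSE bilingual test dictionary.
import numpy as np

def load_vec(path, max_vocab=200_000):
    """Read a fastText .vec text file into a {word: vector} dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        n, dim = map(int, f.readline().split())   # header: vocab size, dimension
        for line in f:
            word, *vals = line.rstrip().split(" ")
            vectors[word] = np.asarray(vals, dtype=np.float32)
            if len(vectors) >= max_vocab:
                break
    return vectors

def load_dictionary(path):
    """Read a MUSE dictionary: one whitespace-separated word pair per line."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.split()) for line in f]

src_vecs = load_vec("wiki.et.vec")               # fastText Estonian vectors
pairs = load_dictionary("et-en.5000-6500.txt")   # MUSE et-en test dictionary
```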
Dataset Splits | Yes | "...tuning the learning rate and the gradient penalty λ using nearest neighbor cosine distance as validation criterion. On the other hand, the results were not significantly better, and instability did not improve. Finally, we experimented with CT-GANs (Wei et al., 2018), an extension of Wasserstein GANs with gradient penalty, but this only lowered performance and increased instability. Since Wasserstein GANs and CT-GANs were consistently worse and less stable than vanilla GANs, we do not include them in the experiments below."
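The "nearest neighbor cosine distance" in the quote is an unsupervised model-selection signal: no held-out bilingual data is needed, only the geometry of the mapped spaces. A rough sketch of the idea is below; MUSE's actual criterion uses CSLS neighbors over the most frequent words, so treat this plain nearest-neighbor version as an approximation.

```python
# Sketch: unsupervised validation score for a candidate alignment.
import numpy as np

def validation_score(mapped_src: np.ndarray, tgt: np.ndarray) -> float:
    """Mean cosine similarity between each mapped source vector and its
    nearest target-space neighbor; higher means a better alignment."""
    s = mapped_src / np.linalg.norm(mapped_src, axis=1, keepdims=True)
    t = tgt / np.linalg.norm(tgt, axis=1, keepdims=True)
    sims = s @ t.T                         # cosine similarity matrix
    return float(sims.max(axis=1).mean())  # average nearest-neighbor similarity
```

Candidate mappings (e.g. different learning rates or gradient penalties, as in the quote) can then be ranked by this score without any cross-lingual supervision.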
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions using the MUSE code and the VecMap framework but does not specify their version numbers or any other software dependencies with specific versions.
Experiment Setup | Yes | "Since we cannot do reliable hyper-parameter optimization in the absence of cross-lingual supervision, we use MUSE with the default parameters (Conneau et al., 2018). For the experiments with stochastic dictionary induction (Table 3), we use the implementation in the VecMap framework (Artetxe et al., 2018)."
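In the stochastic dictionary induction of Artetxe et al. (2018), the key trick is to randomly drop entries of the similarity matrix before inducing nearest-neighbor translation pairs, annealing the keep probability toward 1 so the search first explores and then converges. A minimal sketch of that mechanism is below; the matrix sizes and keep-probability schedule are illustrative assumptions, not VecMap's actual defaults.

```python
# Sketch: stochastic dictionary induction via dropout on the similarity matrix.
import numpy as np

def induce_dictionary(sim: np.ndarray, keep_prob: float, rng) -> np.ndarray:
    """Return a target index for each source word, after random dropout."""
    mask = rng.random(sim.shape) < keep_prob   # keep each cell with prob p
    dropped = np.where(mask, sim, -np.inf)     # dropped cells can never win
    return dropped.argmax(axis=1)              # stochastic nearest neighbors

rng = np.random.default_rng(0)
sim = rng.normal(size=(1000, 1000))            # placeholder similarity matrix
for keep_prob in (0.1, 0.3, 1.0):              # annealed schedule: p -> 1
    pairs = induce_dictionary(sim, keep_prob, rng)
    # ...re-estimate the mapping from `pairs`, recompute `sim`, repeat...
```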