Word translation without parallel data
Authors: Guillaume Lample, Alexis Conneau, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that our method works very well also for distant language pairs... In this section, we empirically demonstrate the effectiveness of our unsupervised approach on several benchmarks, and compare it with state-of-the-art supervised methods. |
| Researcher Affiliation | Collaboration | {glample,aconneau,ranzato,rvj}@fb.com; ludovic.denoyer@upmc.fr. Equal contribution. Order has been determined with a coin flip. Facebook AI Research; Sorbonne Universités, UPMC Univ Paris 06, UMR 7606, LIP6; LIUM, University of Le Mans |
| Pseudocode | No | The paper describes the algorithmic steps and equations within the main text, but it does not include any clearly labeled pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | Our code, embeddings and dictionaries are publicly available1. 1 https://github.com/facebookresearch/MUSE |
| Open Datasets | Yes | We use unsupervised word vectors that were trained using fastText2. These correspond to monolingual embeddings of dimension 300 trained on Wikipedia corpora... We make these dictionaries publicly available as part of the MUSE library3... We use the SemEval 2017 competition data (Camacho-Collados et al. (2017))... Europarl corpus. |
| Dataset Splits | Yes | We divide the learning rate by 2 every time our unsupervised validation criterion decreases. We use it as a stopping criterion during training, and also for hyperparameter selection in all our experiments. Specifically, we consider the 10k most frequent source words, and use CSLS to generate a translation for each of them. We then compute the average cosine similarity between these deemed translations, and use this average as a validation metric. (A sketch of this criterion appears below the table.) |
| Hardware Specification | No | The paper does not specify the exact hardware components (e.g., GPU models, CPU types, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'fastText' as the tool used to train word vectors, and references 'deep adversarial networks of Goodfellow et al. (2014)', but it does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For our discriminator, we use a multilayer perceptron with two hidden layers of size 2048, and Leaky-ReLU activation functions. The input to the discriminator is corrupted with dropout noise with a rate of 0.1. We use stochastic gradient descent with a batch size of 32, a learning rate of 0.1 and a decay of 0.95 both for the discriminator and W. We divide the learning rate by 2 every time our unsupervised validation criterion decreases. (A PyTorch sketch of this setup appears below the table.) |
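The validation criterion quoted in the Dataset Splits row selects models by translating the 10k most frequent source words with CSLS (cross-domain similarity local scaling) and averaging the cosine similarity of the resulting pairs. Below is a minimal NumPy sketch of that metric, assuming the paper's neighborhood size K = 10; the function name `csls_translate` and the single dense similarity matrix are illustrative simplifications (MUSE batches this computation over large vocabularies), not the authors' exact implementation.

```python
import numpy as np

def csls_translate(src_emb, tgt_emb, k=10):
    """Translate each source word to its CSLS-nearest target word.

    CSLS(x, y) = 2*cos(x, y) - r_tgt(x) - r_src(y), where r_* are the
    mean cosine similarities to the k nearest neighbors in the other space.
    """
    # L2-normalize so dot products are cosine similarities.
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = src @ tgt.T                      # (n_src, n_tgt) cosine matrix
    # r_tgt(x): mean similarity of each source word to its k nearest targets.
    r_tgt = np.sort(sims, axis=1)[:, -k:].mean(axis=1, keepdims=True)
    # r_src(y): mean similarity of each target word to its k nearest sources.
    r_src = np.sort(sims, axis=0)[-k:, :].mean(axis=0, keepdims=True)
    csls = 2 * sims - r_tgt - r_src
    best = csls.argmax(axis=1)              # CSLS-nearest target per source
    # Validation metric: mean cosine similarity of the deemed translations.
    return best, float((src * tgt[best]).sum(axis=1).mean())

# Usage, following the quoted criterion: restrict to the 10k most frequent
# source words after applying the learned mapping W (hypothetical arrays).
# mapped_src = src_vectors[:10000] @ W.T
# translations, val_metric = csls_translate(mapped_src, tgt_vectors)
```

The metric requires no parallel data: it is computed entirely from the two embedding spaces, which is what makes it usable both as a stopping criterion and for hyperparameter selection in the unsupervised setting.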
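The Experiment Setup row specifies the discriminator and optimizer closely enough to sketch in PyTorch. The sketch below follows the quoted hyperparameters (two hidden layers of size 2048, input dropout 0.1, SGD with lr 0.1 for both the discriminator and W); the LeakyReLU slope of 0.2 and the scheduler-based handling of the 0.95 decay are assumptions, since the quoted text does not pin them down.

```python
import torch
import torch.nn as nn

EMB_DIM = 300  # dimension of the fastText vectors used in the paper

# Discriminator as described: two hidden layers of size 2048, Leaky-ReLU
# activations, dropout noise (rate 0.1) applied to the input, and a
# sigmoid output giving the probability that an input is a target embedding.
discriminator = nn.Sequential(
    nn.Dropout(0.1),
    nn.Linear(EMB_DIM, 2048), nn.LeakyReLU(0.2),  # slope 0.2 is an assumption
    nn.Linear(2048, 2048), nn.LeakyReLU(0.2),
    nn.Linear(2048, 1), nn.Sigmoid(),
)

# The mapping W is a single linear map from source to target space.
W = nn.Linear(EMB_DIM, EMB_DIM, bias=False)

# SGD with lr 0.1 for both players; the "decay of 0.95" is sketched here as
# a per-epoch exponential learning-rate decay (an assumption).
d_opt = torch.optim.SGD(discriminator.parameters(), lr=0.1)
w_opt = torch.optim.SGD(W.parameters(), lr=0.1)
d_sched = torch.optim.lr_scheduler.ExponentialLR(d_opt, gamma=0.95)
w_sched = torch.optim.lr_scheduler.ExponentialLR(w_opt, gamma=0.95)

# Per the paper: additionally halve the learning rate whenever the
# unsupervised validation criterion decreases, and use that criterion
# as the stopping rule. Training batches are of size 32.
```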