Multilingual Alignment of Contextual Word Representations

Authors: Steven Cao, Nikita Kitaev, Dan Klein

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We propose procedures for evaluating and strengthening contextual embedding alignment and show that they are useful in analyzing and improving multilingual BERT. In particular, after our proposed alignment procedure, BERT exhibits significantly improved zero-shot performance on XNLI compared to the base model, remarkably matching pseudo-fully-supervised translate-train models for Bulgarian and Greek. Further, to measure the degree of alignment, we introduce a contextual version of word retrieval and show that it correlates well with downstream zero-shot transfer. Using this word retrieval task, we also analyze BERT and find that it exhibits systematic deficiencies, e.g., worse alignment for open-class parts-of-speech and word pairs written in different scripts, that are corrected by the alignment procedure. (A retrieval sketch follows the table.)
Researcher Affiliation | Academia | Steven Cao, Nikita Kitaev & Dan Klein, Computer Science Division, University of California, Berkeley; {stevencao,kitaev,klein}@berkeley.edu
Pseudocode | No | The paper does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | As our dataset, we use the Europarl corpora for English paired with Bulgarian, German, Greek, Spanish, and French... We use the most recent 1024 sentences as the test set, the previous 1024 sentences as the development set, and the following 250K sentences as the training set. ...we also report numbers for 10K and 50K parallel sentences.
Dataset Splits | Yes | We use the most recent 1024 sentences as the test set, the previous 1024 sentences as the development set, and the following 250K sentences as the training set. (A split sketch follows the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | The paper mentions software such as "fast_align", "polyglot", and "spaCy" that were used, but it does not specify version numbers for these or other key software components.
Experiment Setup | Yes | For both alignment and XNLI optimization, we use a learning rate of 5 × 10⁻⁵ with Adam hyperparameters β = (0.9, 0.98), ϵ = 10⁻⁹ and linear learning rate warmup for the first 10% of the training data. For alignment, the model is trained for one epoch, with each batch containing 2 sentence pairs per language. For XNLI, each model is trained for 3 epochs with 32 examples per batch, and 10% dropout is applied to the BERT embeddings. (An optimizer sketch follows the table.)
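
To make the contextual word retrieval evaluation quoted in the Research Type row more concrete, here is a minimal sketch of nearest-neighbor retrieval over contextual embeddings. It assumes cosine similarity and that gold word pairs have already been extracted from parallel sentences (e.g. with fast_align); the function name and the exact retrieval criterion are illustrative assumptions, since the paper releases no code.

```python
import numpy as np

def retrieval_accuracy(src_vecs: np.ndarray, tgt_vecs: np.ndarray) -> float:
    """Nearest-neighbor contextual word retrieval (illustrative sketch).

    src_vecs and tgt_vecs are (n, d) arrays of contextual embeddings for the
    two sides of n aligned word occurrences; row i of each array is the gold
    match for row i of the other.
    """
    # Normalize so that a dot product equals cosine similarity.
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sim = src @ tgt.T                # (n, n) similarity matrix
    nearest = sim.argmax(axis=1)     # retrieved target index for each source word
    return float((nearest == np.arange(len(src))).mean())
```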
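
The Europarl split described in the Open Datasets and Dataset Splits rows is simple to express in code. The sketch below assumes the corpus is an in-memory list of sentence pairs ordered from oldest to newest; the function name and the ordering assumption are ours, not the paper's.

```python
def split_europarl(sentence_pairs, train_size=250_000):
    """Split a chronologically ordered list of (English, foreign) sentence pairs."""
    test = sentence_pairs[-1024:]                       # most recent 1024 pairs
    dev = sentence_pairs[-2048:-1024]                   # the 1024 pairs before the test set
    train = sentence_pairs[-(2048 + train_size):-2048]  # the following 250K (or 10K/50K) pairs
    return train, dev, test
```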
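
The hyperparameters quoted in the Experiment Setup row map directly onto a standard optimizer configuration. The sketch below assumes PyTorch and the transformers scheduler as the software stack (the paper does not state its dependencies), and it keeps the learning rate constant after warmup because the paper mentions only warmup, not decay.

```python
import torch
from transformers import get_constant_schedule_with_warmup

def configure_optimization(model, num_training_steps, lr=5e-5, warmup_frac=0.1):
    # Adam with betas = (0.9, 0.98) and eps = 1e-9, as quoted above.
    optimizer = torch.optim.Adam(
        model.parameters(), lr=lr, betas=(0.9, 0.98), eps=1e-9
    )
    # Linear warmup over the first 10% of training steps; the schedule after
    # warmup (constant here) is an assumption, since the paper does not say.
    scheduler = get_constant_schedule_with_warmup(
        optimizer, num_warmup_steps=int(warmup_frac * num_training_steps)
    )
    return optimizer, scheduler
```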