Normalization of Language Embeddings for Cross-Lingual Alignment

Authors: Prince Osei Aboagye, Yan Zheng, Chin-Chia Michael Yeh, Junpeng Wang, Wei Zhang, Liang Wang, Hao Yang, Jeff Phillips

ICLR 2022

Reproducibility variables, the assessed result for each, and the LLM response quoted as evidence:

Research Type: Experimental
LLM Response: "We demonstrate that meaning is retained and alignment is improved on similarity, translation, and cross-language classification tasks. We provide an evaluation of our proposed preprocessing methods using eight (8) language embeddings pre-trained on Wikipedia (Bojanowski et al., 2017), one for each language: Croatian (HR), English (EN), Finnish (FI), French (FR), German (DE), Italian (IT), Russian (RU), and Turkish (TR)."

Researcher Affiliation: Collaboration
LLM Response: "Prince Osei Aboagye¹, Yan Zheng², Chin-Chia Michael Yeh², Junpeng Wang², Wei Zhang², Liang Wang², Hao Yang², Jeff M. Phillips¹. ¹University of Utah, ²Visa Research. ¹{prince,jeffp}@cs.utah.edu, ²{yazheng,miyeh,junpenwa,wzhan,liawang,haoyang}@visa.com"

Pseudocode: Yes
LLM Response: "Algorithm 1 Spectral Normalization (SpecNorm(A, β))"

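The paper's Algorithm 1 (reproduced in its Appendix H) defines SpecNorm(A, β) exactly; the NumPy sketch below is an illustration only, assuming a simple capping rule in which each singular value of the embedding matrix is clipped at β times the mean singular value. That threshold choice is our assumption, not the paper's stated rule.

```python
import numpy as np

def spec_norm(A, beta):
    """Sketch of spectral normalization for an (n_words x dim)
    embedding matrix A. ASSUMPTION: singular values are capped at
    beta * mean(singular values); see the paper's Algorithm 1 for
    the exact rule."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    cap = beta * s.mean()            # assumed threshold
    s_capped = np.minimum(s, cap)    # flatten the largest directions
    return (U * s_capped) @ Vt       # equals U @ diag(s_capped) @ Vt
```
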
Open Source Code: Yes
LLM Response: "We provide code at https://github.com/poaboagye/SpecNorm. Our new code for SpecNorm is in Appendix H and here https://github.com/poaboagye/SpecNorm."

Open Datasets: Yes
LLM Response: "We provide an evaluation of our proposed preprocessing methods using eight (8) language embeddings pre-trained on Wikipedia (Bojanowski et al., 2017)."

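The fastText Wikipedia vectors of Bojanowski et al. (2017) are distributed as plain-text .vec files whose first line gives the vocabulary size and dimensionality. The loader below is a minimal sketch; the file name in the usage comment is illustrative.

```python
import numpy as np

def load_vec(path, max_words=None):
    """Load a fastText .vec file: the first line is 'n_words dim',
    and each subsequent line is 'word v1 v2 ... v_dim'."""
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        f.readline()  # skip the 'n_words dim' header line
        for i, line in enumerate(f):
            if max_words is not None and i >= max_words:
                break
            parts = line.rstrip().split(" ")
            words.append(parts[0])
            vecs.append(np.asarray(parts[1:], dtype=np.float32))
    return words, np.vstack(vecs)

# e.g. words_en, X_en = load_vec("wiki.en.vec", max_words=200000)
```
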
Dataset Splits: Yes
LLM Response: "We trained (aligned) using 1k, 3k and 5k source words and evaluated (tested) on separate 2k source test queries, unless noted otherwise. The Procrustes alignment algorithm was trained on 5k source words and evaluated on 1.5k source test queries."

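The Procrustes step referenced here is, throughout the cross-lingual embedding literature, the classical orthogonal Procrustes solution: given matched source and target vectors X and Y from a seed dictionary, the best orthogonal map has a closed form via SVD. A minimal sketch:

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal W minimizing ||X @ W - Y||_F, where X and Y are
    (n_pairs x dim) embeddings of a seed translation dictionary."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Fit on e.g. 5k dictionary pairs, then map every source vector:
# W = procrustes(X_train, Y_train)
# X_aligned = X_all @ W
```
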
Hardware Specification: Yes
LLM Response: "Hardware specifications are NVIDIA GeForce GTX Titan Xp 12GB, AMD Ryzen 7 1700 eight-core processor, and 62.8GB RAM."

Software Dependencies: No
LLM Response: "The paper mentions `numpy` and `argparse` in the provided code snippet but does not specify their versions. No other software dependencies are listed with specific version numbers."

Experiment Setup: Yes
LLM Response: "The hyperparameters β ∈ {1, 2, 3, 4, 5} and #Iter (number of iterations) ∈ {1, 2, 3, 4, 5} were fine-tuned for I-C+SN+L. Hereafter, we apply I-C+SN+L with the hyperparameters (β, #Iter) = (2, 5)."

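Reading I-C+SN+L as centering (C), spectral normalization (SN), and length normalization (L) repeated for #Iter rounds is our assumption about the naming, not a composition confirmed by this excerpt. The sketch below wires those steps together with the tuned setting (β, #Iter) = (2, 5), reusing spec_norm from the earlier sketch.

```python
import numpy as np

def center(A):
    """'C' step: subtract the mean vector from every row."""
    return A - A.mean(axis=0, keepdims=True)

def length_normalize(A):
    """'L' step: scale every row to unit Euclidean norm."""
    return A / np.linalg.norm(A, axis=1, keepdims=True)

def ic_sn_l(A, beta=2, n_iter=5):
    """ASSUMED reading of I-C+SN+L: repeat C -> SN -> L for n_iter
    rounds, with the paper's tuned (beta, n_iter) = (2, 5)."""
    for _ in range(n_iter):
        A = length_normalize(spec_norm(center(A), beta))
    return A
```
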