On Learning Language-Invariant Representations for Universal Machine Translation

Authors: Han Zhao, Junjie Hu, Andrej Risteski

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we formally prove certain impossibilities of this endeavour in general, as well as prove positive results in the presence of additional (but natural) structure of data. For the former, we derive a lower bound on the translation error in the many-to-many translation setting, which shows that any algorithm aiming to learn shared sentence representations among multiple language pairs has to make a large translation error on at least one of the translation tasks, if no assumption on the structure of the languages is made. For the latter, we show that if the paired documents in the corpus follow a natural encoder-decoder generative process, we can expect a natural notion of generalization: a linear number of language pairs, rather than quadratic, suffices to learn a good representation. (See the counting illustration after the table.)
Researcher Affiliation | Academia | Carnegie Mellon University, Pittsburgh, USA. Correspondence to: Han Zhao <han.zhao@cs.cmu.edu>.
Pseudocode | No | The paper is theoretical and does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper focuses on theoretical analysis and does not mention providing or releasing any source code for its methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets. It discusses 'distributions over documents' and 'parallel corpora' conceptually, not as specific, publicly available datasets with concrete access information used for training or evaluation.
Dataset Splits | No | The paper is theoretical and does not conduct experiments, so it describes no training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and describes no experimental setup, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and gives no implementation details, so no software dependencies are specified.
Experiment Setup | No | The paper is theoretical and describes no experimental setup, hyperparameters, or training configuration.
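
Note on the Research Type row: the abstract's "linear number of language pairs, rather than quadratic" claim can be made concrete with a back-of-the-envelope count. The pivot-language framing below is our illustrative assumption (the paper's actual positive result rests on its encoder-decoder generative assumption, not on a designated pivot language); K denotes the number of languages.

% Illustrative count only; not a statement from the paper.
% Direct pairwise training needs a parallel corpus for every unordered pair
% of the K languages, while a shared representation space can connect all of
% them through roughly K corpora (e.g., K - 1 pairs that all touch one pivot).
\[
  \underbrace{\binom{K}{2} \;=\; \frac{K(K-1)}{2} \;=\; \Theta(K^2)}_{\text{corpora for direct pairwise training}}
  \qquad \text{vs.} \qquad
  \underbrace{K - 1 \;=\; \Theta(K)}_{\text{corpora via a shared representation}}
\]

For K = 100 languages this is 4950 parallel corpora versus 99, which is the scale of saving that the paper's generalization result formalizes under its generative assumptions.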