On Learning Language-Invariant Representations for Universal Machine Translation
Authors: Han Zhao, Junjie Hu, Andrej Risteski
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we formally prove certain impossibilities of this endeavour in general, as well as prove positive results in the presence of additional (but natural) structure of data. For the former, we derive a lower bound on the translation error in the many-to-many translation setting, which shows that any algorithm aiming to learn shared sentence representations among multiple language pairs has to make a large translation error on at least one of the translation tasks, if no assumption on the structure of the languages is made. For the latter, we show that if the paired documents in the corpus follow a natural encoder-decoder generative process, we can expect a natural notion of generalization: a linear number of language pairs, rather than quadratic, suffices to learn a good representation. |
| Researcher Affiliation | Academia | 1Carnegie Mellon University, Pittsburgh, USA. Correspondence to: Han Zhao <han.zhao@cs.cmu.edu>. |
| Pseudocode | No | The paper is theoretical and does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper focuses on theoretical analysis and does not mention providing or releasing any source code for its methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments involving datasets. It discusses 'distributions over documents' and 'parallel corpora' in a conceptual context, not as specific, publicly available datasets used for training or evaluation with concrete access information. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments, so it does not describe dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, therefore no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup or software implementation details, therefore no software dependencies are specified. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup, hyperparameters, or training configurations. |